Article

Modeling the Evolution of AI Identity Using Structural Features and Temporal Role Dynamics in Complex Networks

by
Yahui Lu
1,2,*,
Raihanah Mhod Mydin
1 and
Ravichandran Vengadasamy
1
1
Faculty of Social Sciences and Humanities, Universiti Kebangsaan, Bangi 43600, Malaysia
2
College of Foreign Languages, Hunan Institute of Engineering, Xiangtan 411104, China
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(20), 3315; https://doi.org/10.3390/math13203315
Submission received: 4 September 2025 / Revised: 9 October 2025 / Accepted: 15 October 2025 / Published: 17 October 2025
(This article belongs to the Special Issue Modeling and Data Analysis of Complex Networks)

Abstract

In increasingly networked environments, artificial agents are required to operate not with fixed roles but with identities that adapt, evolve, and emerge through interaction. Traditional identity modeling approaches, whether symbolic or statistical, fail to capture this dynamic, relational nature. This paper proposes a network-based framework for constructing and analyzing AI identity by modeling interaction, representation, and emergence within complex networks. The goal is to uncover how agent identity can be inferred and explained through structural roles, temporal behaviors, and community dynamics. The approach begins by transforming raw data from three benchmark domains, Reddit, the Interaction Network dataset, and AMiner, into temporal interaction graphs. These graphs are structurally enriched via motif extraction, centrality scoring, and community detection. Graph Neural Networks (GNNs), including GCNs, GATs, and GraphSAGE, are applied to learn identity embeddings across time slices. Extensive evaluations cover identity coherence, role classification accuracy, and temporal embedding consistency. Ablation studies assess the contribution of the motif and temporal layers. The proposed model achieves strong performance across all metrics. On the AMiner dataset, identity coherence reaches 0.854, with a role classification accuracy of 80.2%. GAT demonstrates the highest temporal consistency and resilience to noise. Role trajectories and motif patterns confirm the emergence of stable and transient identities over time. The results indicate that the framework not only achieves strong quantitative performance but also offers insight into behavioral development. Future work will extend the model with semantic representations and give greater attention to ethical considerations, such as privacy, fairness, and transparency, to make identity modeling in artificial intelligence systems responsible and trustworthy.

1. Introduction

Background and Motivation

Artificial intelligence (AI) systems are rapidly becoming active, interacting participants in dynamic social environments. This shift raises new theoretical and practical questions about identity in AI, specifically how artificial agents establish and maintain a recognizable and consistent behavioral pattern over time and across varying circumstances. In human systems, identity is formed rather than simply possessed, and it evolves through continuous interaction, shifting roles, and social feedback. A similar phenomenon is beginning to appear in AI systems, particularly those integrated into online platforms, multi-agent systems, or human–machine cooperation. However, most identity modeling in AI still relies on fixed features or latent embeddings, which provide little evidence of how identity changes structurally or behaviorally [1]. Recent work on complex networks offers a possible direction. Network models in neuroscience, sociology, and computational biology have shown how global behavior emerges from local relations and structural dynamics [2,3]. Applying these ideas to AI, identity can be viewed not as a label but as an emergent characteristic: a consequence of the position, influence, and development of an agent within a network of interactions. Graph-based machine learning, and Graph Neural Networks (GNNs) in particular, makes it possible to model such interactions at scale. GNNs have proven effective for learning structural representations of graphs, identifying roles, and modeling the dynamics of large graphs [4]. For example, role-aware GNNs have been observed to discriminate between agents that are functionally similar but topologically different in communication and collaboration networks [5].
Despite these trends, models that treat identity as a temporal, emergent, and explainable process are lacking. In most systems, identity is not tied to the structural development that shapes agent behavior as the system evolves [6]. This gap limits the adaptability, monitoring, and interpretability of AI agents in practice. To address it, we propose a network-centric framework for constructing and processing AI identity, focusing on interaction, representation learning, and role emergence within complex networks.
The study of identity in artificial systems cannot be limited to fixed characteristics or static representations. Because AI agents are likely to operate in dynamic, multi-agent, or human-facing contexts, their identity must accommodate interaction, adaptation, and structural change over time. The theory of complex networks provides a mathematical framework for modeling such behavior, describing the relationship patterns and emergent roles that define agents’ actions and perceptions. A complex network is not only a graph of connections; meaning is also encoded in its connectivity pattern, flow, and evolution. Such networks have been used in social systems to explain influence, group dynamics, and identity roles [7]. The same reasoning applies to AI systems, where agents may temporarily form coalitions, change roles, and alter their structural positions within a system of communication or collaboration. Recent work has shown that network dynamics can explain and predict AI behavior more effectively than individual features [8]. For example, temporal interaction networks have helped to show how the roles that agents develop extend beyond individual activities or sessions [9]. Embedding AI systems in dynamic graphs of this kind allows identity to be derived not only from an agent’s internal state but also from the external pattern of its interactions, connections, and roles. GNNs are well suited to this graph-based approach. Unlike conventional embedding frameworks, GNNs can learn context-sensitive and role-sensitive representations that follow the changing role of the agent within the graph [10]. More recent extensions incorporate higher-order structures such as motifs, communities, and temporal sequences, and thereby come closer to capturing multi-layered identity [11].
Beyond structural fidelity, networks also support interpretable AI. Role-based modeling and community detection make it possible to track an agent and understand why and how it occupies a specific position within a system. This is especially necessary in multi-agent systems, recommender systems, and autonomous decision-making systems, where behavioral stability and accountability are primary concerns [12]. The perspective on AI identity adopted in this paper is network-oriented. We argue that AI identities can be constructed and interpreted as dynamic compositions, built and visualized through interaction graphs, temporal role modeling, and GNN-based learning, rather than encoded as static properties. This study explores the intersection of identity modeling and network science in artificial intelligence systems. We pose the following four research questions:
  • How can AI identity be modeled as an emergent phenomenon in dynamic interaction networks?
  • What structural and temporal patterns in networks contribute to the formation and coherence of AI identity?
  • How can graph-based representations of identity be interpreted and rationalized?
  • How well does the proposed network-based identity framework perform on real-world benchmark datasets?
Although the presented framework incorporates established elements, such as graph neural networks (GNNs), motif extraction, community detection, and temporal modeling, it is not merely a sum of existing solutions. The Identity Evolution Modeling Framework (IEMF) proposes a unified paradigm that models structural dependencies, temporal transitions, and higher-order motif semantics to understand the emergence and development of AI identities in dynamic interaction networks. In particular, motif-based message passing enriches the GNN representations with local relational patterns, whereas the temporal coherence regularizer promotes the stability of identity trajectories across time. Moreover, the framework introduces new measures of identity coherence and identity consistency to quantify how persistent and coherent an agent's latent identity remains as it changes over time. Together, these components extend existing temporal-GNN models toward interpretability and longitudinal identity tracking by building both directly into the network learning process, which constitutes a methodological contribution beyond a simple combination of models. The research also aims to validate the framework using real-world datasets, such as Reddit and AMiner, demonstrating its ability to quantify identity coherence, trace role transitions, and uncover emergent identity phenomena. Ultimately, this work aims to provide a foundational approach to representing AI identity not as a fixed attribute, but as a relational and evolving construct within complex systems. Aligned with the above research questions, this paper makes the following four contributions:
  • We propose a graph-based framework that defines AI identity as an emergent property of dynamic, role-driven network interactions.
  • We introduce a method for capturing structural and temporal patterns using graph neural networks and motif-based role tracking.
  • We develop a rationalization pipeline that interprets identity through graph features such as centrality, entropy, and community evolution.
  • We empirically validate our approach using two benchmark datasets, Reddit and AMiner, demonstrating its effectiveness in capturing coherent, interpretable identity trajectories.
The paper is organized as follows. Section 2 reviews the theoretical foundations of AI identity and complex network analysis, as well as related work. Section 3 presents the proposed methodology, which includes identity modeling, network construction, representation learning, and emergence modeling. Section 4 describes the experimental setup and datasets and reports quantitative results, ablation studies, and emergent patterns of identity. Section 5 concludes with a summary and directions for future research.

2. Related Work

Traditionally, identity in artificial intelligence systems has been treated as a collection of fixed attributes, often encoded as predetermined roles, predefined properties, or vector embeddings trained under supervision. Such representations tend to express the latent traits or categories associated with an agent rather than how its behavior, situation, or interactions vary over time [13]. This limitation becomes more pressing when AI agents operate in interactive, adaptive, and socially situated environments, such as multi-agent systems, human–AI dialog, or self-organizing group formation. In these settings, agents change roles, respond to changing inputs, and form recurring patterns of action that resemble human identity [14]. Recent research has argued that the perspective on AI identity should shift toward emergence, in which identity is not programmed but arises from continual interaction and feedback with the environment [15]. In this view, identity is a dynamic process characterized by the acquisition of roles, behavioral consistency, and long-term coherence across tasks and situations. This view echoes accounts of identity in the behavioral and cognitive sciences, where identity is established through repetition, recognition, and enactment of roles in social contexts [8]. Applying such identity modeling to AI systems opens an alternative route to designing interpretable, adaptive, and ethically responsible agents. Complex networks provide a rich conceptual and mathematical model of interconnected systems with emergent dynamics. They have frequently been applied to explain how global processes arise from local interactions in fields such as neuroscience, epidemiology, and social science [16].
Complex networks have also gained popularity in AI for capturing agent interaction, knowledge diffusion, and decision-making infrastructure. Graph models allow researchers to examine topological attributes, namely centrality, modularity, and influence, that are directly relevant to identity and functional change over time in a networked environment [17]. For example, in multi-agent systems or collaborative learning, agents may be represented as nodes in a dynamic graph whose edges reflect communication, trust, or task dependency. As these structures change, they indicate the changing roles, importance, and influence of each agent in the system. As agents interact, their roles in the network may shift, and new combinations of roles, structural alignments, or emergent hierarchies can arise [18]. Graph neural networks (GNNs) and temporal graph models have been used to learn agent representations with both structural and temporal characteristics. These models support identity tracking, role separation, and behavior prediction in applications ranging from recommendation systems to autonomous navigation [19]. Complex network theory can therefore be regarded as both a modeling tool and a conceptual framework for treating AI identity as a process founded on structure, dynamics, and emergence. GNNs iteratively aggregate information from each node's local neighborhood, enabling the model to capture structural and semantic regularities [1]. Common architectures include Graph Convolutional Networks (GCNs), Graph Attention Networks (GATs), and GraphSAGE, which offer different ways of balancing local information and global effects. Because identity in time-varying systems is itself time-dependent, time-aware mechanisms have also been proposed for such systems.
These models track changes in the structure and attributes of nodes across interaction episodes and hence make it possible to model identity as a dynamic process [20]. Role-aware and motif-based GNNs are a recent line of research that goes beyond neighborhood similarity to learn functional behavior patterns. Higher-order structures can represent emergent roles, such as leaders, connectors, or gatekeepers, that cannot be identified from node attributes alone [21]. Such representations are powerful for identity modeling because they describe not only what an agent is but also how and where it fits in the network structure. Tracking these patterns over time yields identity trajectories that are consistent and adaptive and that reflect how identity is constructed and transformed as it unfolds within the system [22]. Interpretability has also received increasing attention in recent years. Approaches such as attention visualization, embedding disentanglement, and trajectory mapping make identity modeling more visible and explainable, which matters in practical applications related to social computing, recommendation, and autonomous decision-making [23,24]. The existing literature on AI identity falls broadly into three streams: symbolic modeling, latent embeddings, and graph-based approaches. Symbolic approaches define identity through fixed roles or logical rules. Such systems are deterministic, but they are rigid and do not scale easily to dynamic environments. For example, rule-based agent architectures can define agent types or objectives explicitly, but they cannot capture evolving social behavior or maintain consistency over extended periods.
Latent embedding models, common in natural language processing and recommendation systems, learn a dense identity representation from interaction histories. Such models are powerful but often difficult to interpret, and they are not contextualized within social structure. Moreover, they cannot reflect emergence, because identity is imposed in advance rather than learned during dynamic interaction. Graph representations offer a compelling tradeoff because they situate identity in relational and temporal graphs. Earlier work performed user profiling or role classification on static graphs, whereas later work has turned to temporal and inductive GNNs to capture dynamic behavior [25]. These methods have performed well in role detection, fraud detection, community tracking, and agent behavior modeling. Some studies have begun to address identity emergence directly. For example, some treat identities as structural role trajectories, while others examine motif-related functions that correlate with social or task-related activity. However, most of these works remain narrow: they are single-task, static, or hard to interpret. To the best of our knowledge, few studies integrate interaction, representation, and emergence into a unified graph-based formalism for understanding AI identity. This paper addresses that gap by introducing a holistic perspective in which identity evolves, is structural, and is supported by interpretable representation learning.

3. Methodology: Network-Based Modeling of Identity

To reflect the dynamic and emergent characteristics of AI identity, this work proposes a systematic, network-oriented modeling framework that combines temporal graph construction, structural encoding, and representation learning using Graph Neural Networks (GNNs). The general principle is to map agents to nodes in evolving interaction graphs, so that identity is constructed from roles, behaviors, and structural position over time. By enriching these graphs with motifs, centrality measures, and community structure and applying temporal GNNs, the framework produces interpretable identity trajectories. This approach enables the identification of stable and transitional roles and reveals how identity patterns arise from social interaction and its sequencing. The modeling process consists of the stages described below. Together, these stages support analysis of role stability, identity transitions, and structural influence, yielding interpretable, data-driven knowledge about AI identity. The complete methodological pipeline used to construct and analyze identity through complex networks is shown in Figure 1. It begins with raw interaction data, such as Reddit comment threads or author metadata, which is converted into a Temporal Interaction Graph of agents and temporally ordered interactions. The graphs are then structurally encoded to extract meaningful patterns such as motifs, centrality scores, and community structure, which provide context for the position of each agent. These structurally enriched graphs are fed into GNNs, e.g., GCNs or GATs, to compute dynamic identity embeddings. Finally, these embeddings are used to build identity trajectories that can be classified into roles, interpreted in terms of motifs, and studied in terms of identity emergence over time.
The pipeline captures both relational structure and temporal evolution, providing a sound method for modeling AI identity in dynamic multi-agent systems.

3.1. Identity Construction Framework

In this framework, AI identity is a time-varying, emergent construct that arises from structured interactions in a network. Instead of assigning identity through labels or static embeddings, the framework builds an agent's identity dynamically from its position, behavior, and changing role within a dynamic graph. The starting point is raw interaction data, e.g., the replies that a user posts on Reddit [26] or co-authorship records in AMiner [27]. This data is projected into a temporal interaction graph of nodes and edges that incorporates metadata such as timestamps, frequency, or other relevant characteristics. The development of identity is measured by dividing the graph into time slices. All agents are then positioned in a structural and relational context using graph neural networks (GNNs). The resulting embeddings capture, e.g., local structure (degree, ego-network topology) and global information (community, network centrality). Identity modeling is complemented by motif analysis, which finds recurring substructures of interaction, and by community detection, which tracks group cohesion over time. Identity is thus treated as a history of embeddings, one per time slice, representing the agent's passage through roles and positions within the network. In this study, AI identity refers to a temporally consistent latent representation that encapsulates an agent's evolving relational state within a dynamic interaction network. Formally, let $G_t = (V, E_t, X_t)$ denote the temporal graph at time $t$, where $V$ is the set of agents, $E_t$ the set of interactions, and $X_t$ the node feature matrix. For each agent $v \in V$, its identity trajectory is expressed as
$$I_v = \{ z_v^{(t)} \}_{t \in T}, \qquad z_v^{(t)} = f_\theta(G_t, v) \in \mathbb{R}^d$$
where $f_\theta$ denotes the graph-based encoding function. This trajectory captures how an agent's structural and contextual embedding evolves over time. In contrast, a role represents a discrete functional state derived from identity embeddings through a mapping $r_v^{(t)} = \psi(z_v^{(t)})$, while behavioral patterns denote the observable manifestations of actions or interactions $b_v^{(t)}$ within the network. Thus, identity characterizes internal relational dynamics, roles describe categorical states inferred from identity, and behavior refers to external expressions observable in the data. This definition establishes conceptual and mathematical clarity for all subsequent analyses in the paper.
Algorithm 1 formalizes the key stages of our framework, from temporal graph construction and structural encoding to GNN-based identity embedding and role trajectory inference. It supports both static and dynamic role modeling across diverse networked environments.
Algorithm 1: Identity Construction and Representation via Temporal Graph Modeling
Input:
$D$: Raw interaction data (e.g., comments, authorship logs)
$T$: Total number of time slices
$f_{motif}$: Motif extraction function
$f_{GNN}$: Graph Neural Network model (e.g., GCN, GAT, or GraphSAGE)
Output:
$Z_v^t$: Identity embedding for node $v$ at each time $t$
$R_v$: Role trajectory for each node
Step 1: Temporal Graph Construction
for $t = 1$ to $T$:
  • Extract interactions $D_t \subseteq D$ in time slice $t$
  • Build graph $G_t = (V_t, E_t)$, where
  • $V_t$: agents (nodes) active in time slice $t$
  • $E_t$: edges formed via interaction (e.g., reply, co-authorship)
Step 2: Structural Feature Encoding
for each graph $G_t$:
  • Compute local structural features (degree, centrality)
  • Apply $f_{motif}(G_t)$ to extract motif-based patterns
  • Detect communities via modularity optimization (e.g., Louvain)
Step 3: GNN-Based Representation Learning
for each graph $G_t$:
  • Initialize node features with structural encodings
  • Compute embeddings:
  • $Z^t = f_{GNN}(G_t)$
  • Store identity embedding for each node $v$:
  • $Z_v^t \leftarrow Z^t[v]$
Step 4: Identity Trajectory and Role Inference
for each node $v \in V$:
  • Concatenate embeddings across time:
  • $T_v = [Z_v^1, \dots, Z_v^T]$
  • Apply clustering/classification to derive roles $R_v$
  • Track role transitions, persistence, and entropy
Return: $Z_v^t$ and $R_v$ for all nodes
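Step 1 of Algorithm 1 can be illustrated with a minimal Python sketch. It assumes, for illustration only, that each raw interaction is a `(source, target, timestamp)` triple and that all time slices have equal length; the function name `build_snapshots`, the dictionary layout, and the toy log are hypothetical and are not taken from the paper's implementation.

```python
from collections import defaultdict


def build_snapshots(interactions, t_start, slice_len, num_slices):
    """Split (src, dst, timestamp) interactions into per-slice graphs.

    Returns a list of length num_slices. Entry t is a dict with
    'nodes' (agents active in slice t) and 'edges' (undirected pair ->
    number of interactions observed in slice t).
    """
    snapshots = [{"nodes": set(), "edges": defaultdict(int)}
                 for _ in range(num_slices)]
    for src, dst, ts in interactions:
        t = int((ts - t_start) // slice_len)
        if not 0 <= t < num_slices:
            continue  # interaction falls outside the modeled window
        snap = snapshots[t]
        snap["nodes"].update((src, dst))
        snap["edges"][tuple(sorted((src, dst)))] += 1
    return snapshots


# Hypothetical toy log: (replier, parent_author, day index)
log = [("a", "b", 0), ("b", "a", 3), ("a", "c", 8), ("c", "d", 9), ("a", "b", 30)]
snaps = build_snapshots(log, t_start=0, slice_len=7, num_slices=2)
for t, snap in enumerate(snaps):
    print(t, sorted(snap["nodes"]), dict(snap["edges"]))
```

Each snapshot can then be passed to the structural encoding and GNN steps. Slices with no activity simply yield empty node and edge sets, and interactions outside the window are dropped rather than clamped to the nearest slice.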

3.2. Network Construction from Data

Constructing a meaningful network from raw interaction data involves defining the core structure: nodes, edges, and time-evolving relations, in a way that reflects the social, functional, or collaborative behavior underlying the data.
Let
  • $\mathcal{V}$ denote the set of all agents (nodes),
  • $E_t^{\ell}$ the set of edges of interaction type $\ell$ active at time $t$, and
  • $G_t^{\ell} = (\mathcal{V}, E_t^{\ell})$ a snapshot of the interaction graph for layer $\ell$ at time $t$.
The identity graph is constructed as a temporal multiplex network
$$\mathcal{G} = \{ G_t^{\ell} \mid t = 1, \dots, T;\ \ell = 1, \dots, L \}$$
where each layer $\ell$ represents a different type of interaction (e.g., reply, co-authorship), and $T$ denotes the number of time intervals.

3.2.1. Node and Edge Semantics

Each node $v_i \in \mathcal{V}$ corresponds to an agent, enriched with a feature vector
$$X_i = \phi(v_i)$$
where $\phi : \mathcal{V} \to \mathbb{R}^d$ maps raw metadata (e.g., user profile, activity stats) into a $d$-dimensional embedding space.
Each edge $e_{ij}^{t,\ell} \in E_t^{\ell}$ captures an interaction of type $\ell$ between agents $v_i$ and $v_j$ during the time window $t$, and is assigned an attribute vector
$$a_{ij}^{t,\ell} = \psi(e_{ij}^{t,\ell})$$
with $\psi : E_t^{\ell} \to \mathbb{R}^d$ encoding features such as interaction frequency, sentiment, or context category.
A graph snapshot $G_t^{\ell} = (\mathcal{V}, E_t^{\ell})$ encodes all pairwise interactions of type $\ell$ occurring in time interval $t$. The full dataset is represented as a sequence of such layer-specific snapshots.
Each graph $G_t^{\ell}$ is attributed, with nodes $v_i$ carrying features $X_i$ and edges $e_{ij}^{t,\ell}$ carrying attributes $a_{ij}^{t,\ell}$.

3.2.2. Temporal and Multiplex Modeling

To model identity as a time-evolving process, define a temporal graph series $G_1, G_2, \dots, G_T$, where $G_t = \bigcup_{\ell=1}^{L} G_t^{\ell}$ aggregates all interaction layers at time $t$. This setup captures both temporal evolution and interaction heterogeneity.
Let
  • $X_t \in \mathbb{R}^{|\mathcal{V}| \times d}$ be the matrix of node features at time $t$, and
  • $A_t^{\ell} \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{V}|}$ be the adjacency matrix of layer $\ell$ at time $t$.
Define the temporal multiplex graph as:
$$\mathcal{G} = \left\{ \left( X_t, \{ A_t^{\ell} \}_{\ell=1}^{L} \right) \right\}_{t=1}^{T}$$
For any node $v_i$, define an identity path as a time-indexed sequence
$$\Gamma_i = \left( X_i^1, X_i^2, \dots, X_i^T \right)$$
where $X_i^t$ is the node's representation at time $t$, to be learned by downstream temporal GNNs. The sequence $\Gamma_i$ encodes the evolving structural context and interaction history that defines identity.
Let $\Gamma_i$ and $\Gamma_j$ be identity paths of agents $v_i$ and $v_j$. If
$$\sum_{t=1}^{T} \left\| X_i^t - X_j^t \right\|_2 \le \epsilon$$
for small $\epsilon > 0$, then $v_i$ and $v_j$ are said to exhibit role-equivalent identity emergence.
This identity path formulation supports clustering, classification, and visualization of emerging roles, allowing the inference of coherent identity patterns over time.
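The role-equivalence criterion can be checked directly once identity paths are available. The sketch below assumes that the distance in the criterion is the Euclidean norm summed over time slices and that both paths cover the same slices; the names `path_distance` and `role_equivalent`, the toy paths, and the threshold value are illustrative assumptions.

```python
import math


def path_distance(path_i, path_j):
    """Sum over time slices of the Euclidean distance between two paths."""
    if len(path_i) != len(path_j):
        raise ValueError("identity paths must cover the same time slices")
    return sum(math.dist(x_i, x_j) for x_i, x_j in zip(path_i, path_j))


def role_equivalent(path_i, path_j, eps):
    """True when the accumulated distance stays within the tolerance eps."""
    return path_distance(path_i, path_j) <= eps


# Hypothetical two-dimensional representations over two time slices
gamma_a = [(0.0, 1.0), (1.0, 1.0)]
gamma_b = [(0.0, 1.0), (1.0, 1.5)]   # stays close to gamma_a
gamma_c = [(3.0, 5.0), (1.0, 1.0)]   # far from gamma_a in the first slice

print(path_distance(gamma_a, gamma_b), role_equivalent(gamma_a, gamma_b, eps=1.0))
print(path_distance(gamma_a, gamma_c), role_equivalent(gamma_a, gamma_c, eps=1.0))
```

Because the distances are summed rather than averaged, a fixed tolerance becomes stricter as the number of time slices grows, so in practice the threshold would likely need to scale with the trajectory length.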

3.3. Representation Learning via GNNs

Representation learning plays a central role in capturing the structural, temporal, and contextual aspects of an agent’s identity in dynamic networks. This section formalizes the embedding design and details the graph neural architectures applied for learning identity representations.

3.3.1. Identity Embedding Design

Given a time-indexed multiplex graph $G_t = \left( \mathcal{V}, \{ A_t^{\ell} \}_{\ell=1}^{L} \right)$ and node feature matrix $X_t \in \mathbb{R}^{|\mathcal{V}| \times d}$, the objective is to learn an embedding function
$$f_t : \mathcal{V} \to \mathbb{R}^h$$
such that each node $v_i$ at time $t$ is represented by
$$z_i^t = f_t(v_i) \in \mathbb{R}^h$$
These embeddings must encode both:
  • Local structure: derived from the $k$-hop neighborhood $N_k(i)$, and
  • Global patterns: derived from community roles, motif presence, and position in the graph.
Formally, the node embedding at time $t$ is:
$$z_i^t = \mathrm{GNN}_\theta\left( X_i^t, N(i); \{ A_t^{\ell} \}_{\ell=1}^{L} \right)$$
where $\mathrm{GNN}_\theta(\cdot)$ is the neural architecture parameterized by $\theta$, and $X_i^t$ is the feature vector for node $v_i$.
To encode temporal identity, define an identity trajectory as a sequence:
$$\Gamma_i = \left( z_i^1, z_i^2, \dots, z_i^T \right)$$
A temporal similarity score between two identity trajectories is computed using:
$$\mathrm{Sim}(\Gamma_i, \Gamma_j) = \frac{1}{T} \sum_{t=1}^{T} \cos\left( z_i^t, z_j^t \right)$$
where $\cos(\cdot, \cdot)$ is cosine similarity. Clustering identities with high mutual similarity helps reveal co-evolving roles or identity convergence.
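The temporal similarity score is the average per-slice cosine similarity between two trajectories, which the following sketch computes with the standard library only. The function names and toy trajectories are assumptions made for illustration; returning 0.0 for a zero-norm embedding is also a choice made here, since the text does not say how such cases are handled.

```python
import math


def cosine(u, v):
    """Cosine similarity; defined here as 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def trajectory_similarity(traj_i, traj_j):
    """Mean cosine similarity over aligned time slices."""
    if not traj_i or len(traj_i) != len(traj_j):
        raise ValueError("trajectories must be non-empty and equally long")
    return sum(cosine(z_i, z_j) for z_i, z_j in zip(traj_i, traj_j)) / len(traj_i)


# Hypothetical embeddings over two time slices
traj_a = [(1.0, 0.0), (0.0, 1.0)]   # agent whose embedding rotates
traj_b = [(1.0, 0.0), (1.0, 0.0)]   # agent whose embedding stays fixed

print(trajectory_similarity(traj_a, traj_a))
print(trajectory_similarity(traj_a, traj_b))
```

In the toy data the two agents agree in the first slice and are orthogonal in the second, so their score sits halfway between full agreement and no similarity.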

3.3.2. GNN Architectures Used

Three distinct GNN architectures are employed to evaluate different structural biases and inductive capabilities for identity modelling. GCNs perform spectral filtering using the normalized graph Laplacian. Each layer update follows:
$$H^{(k)} = \sigma\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(k-1)} W^{(k)} \right)$$
where
  • $\tilde{A} = A + I$ adds self-loops,
  • $\tilde{D}$ is the degree matrix of $\tilde{A}$,
  • $W^{(k)}$ is the learnable weight matrix for the $k$-th layer,
  • $\sigma(\cdot)$ is an activation function (ReLU).
GCNs aggregate features from immediate neighbors with equal weight (after normalization), which favors homophilous graphs.
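A single GCN layer update of this form can be written in a few lines of NumPy. This is a dense-matrix sketch for small graphs, assuming ReLU as the activation; it is not the training code used in the paper, and the toy path graph and identity weight matrix are chosen only to make the normalization visible.

```python
import numpy as np


def gcn_layer(A, H, W):
    """One GCN propagation step with self-loops and symmetric normalization."""
    A_tilde = A + np.eye(A.shape[0])           # add self-loops
    deg = A_tilde.sum(axis=1)                  # degrees of the self-looped graph
    D_inv_sqrt = np.diag(deg ** -0.5)
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt  # normalized adjacency
    return np.maximum(A_hat @ H @ W, 0.0)      # ReLU activation


# Hypothetical 3-node path graph 0 - 1 - 2 with one-hot input features
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H0 = np.eye(3)
W = np.eye(3)  # identity weights, so the output equals the normalized adjacency
H1 = gcn_layer(A, H0, W)
print(np.round(H1, 3))
```

With identity features and weights, each row of the output shows how much of every neighbor (and of the node itself) enters the aggregated representation; nodes 0 and 2 exchange nothing after one layer because they are two hops apart.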
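GATs use self-attention to compute dynamic weights for each neighbor:
$$a_{ij}^{(k)} = \frac{\exp\left( \mathrm{LeakyReLU}\left( \mathbf{a}^{\top} \left[ W h_i^{(k-1)} \,\|\, W h_j^{(k-1)} \right] \right) \right)}{\sum_{m \in N(i)} \exp\left( \mathrm{LeakyReLU}\left( \mathbf{a}^{\top} \left[ W h_i^{(k-1)} \,\|\, W h_m^{(k-1)} \right] \right) \right)}$$
Then, embeddings are updated as:
$$h_i^{(k)} = \sigma\left( \sum_{j \in N(i)} a_{ij}^{(k)} \cdot W h_j^{(k-1)} \right)$$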
The attention mechanism enables adaptive weighting of neighbor contributions, capturing role asymmetries and context dependencies.
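The attention computation can be sketched for a single node and a single attention head as follows. The LeakyReLU slope of 0.2, the use of ReLU for the outer activation, and all function and variable names are assumptions made for this illustration; the text does not specify them.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    # slope=0.2 is an assumed value, not one stated in the text
    return np.where(x > 0, x, slope * x)


def attention_coefficients(H, W, a, neighbors_i, i):
    """Softmax-normalized attention weights of node i over its neighbors."""
    Wh = H @ W
    pairs = np.array([np.concatenate([Wh[i], Wh[j]]) for j in neighbors_i])
    scores = leaky_relu(pairs @ a)
    scores = scores - scores.max()   # shift for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()


def gat_update(H, W, a, neighbors_i, i):
    """Attention-weighted sum of transformed neighbor features, then ReLU."""
    alpha = attention_coefficients(H, W, a, neighbors_i, i)
    Wh = H @ W
    return np.maximum(alpha @ Wh[neighbors_i], 0.0)


# Hypothetical setup: 3 nodes with one-hot features, node 0 attends to 1 and 2
H = np.eye(3)
W = np.eye(3)
alpha_uniform = attention_coefficients(H, W, np.zeros(6), [1, 2], 0)
a_biased = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 0.0])  # rewards neighbor 1's feature
alpha_biased = attention_coefficients(H, W, a_biased, [1, 2], 0)
h0 = gat_update(H, W, a_biased, [1, 2], 0)
print(alpha_uniform, alpha_biased, h0)
```

A zero attention vector reduces the layer to uniform averaging over neighbors, while a non-zero vector shifts weight toward neighbors whose transformed features it scores highly, which is the adaptive weighting the text describes.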
GraphSAGE performs inductive node embedding via sampling and aggregation:
$$h_i^{(k)} = \sigma\left( W^{(k)} \cdot \left[ \mathrm{AGGREGATE}^{(k)}\left( \{ h_j^{(k-1)} : j \in N(i) \} \right) \,\|\, h_i^{(k-1)} \right] \right)$$
Common aggregation functions include:
  • Mean aggregation:
$$\mathrm{AGGREGATE} = \frac{1}{|N(i)|} \sum_{j \in N(i)} h_j$$
  • Pooling or LSTM-based functions for higher expressiveness.
GraphSAGE supports inductive generalization to unseen nodes and dynamic graph segments.
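A mean-aggregation GraphSAGE layer can be sketched as below. For brevity the sketch aggregates over all neighbors instead of a sampled subset, uses a row-vector convention (features multiplied by `W` on the right), and treats a node without neighbors as having a zero aggregate; these choices and the names used are assumptions for illustration.

```python
import numpy as np


def sage_mean_layer(H, neighbors, W):
    """One GraphSAGE layer: mean of neighbor features, concatenated with the
    node's own features, linearly transformed, then passed through ReLU.

    H: (n, d_in) features; neighbors: dict node -> list of neighbor indices;
    W: (2 * d_in, d_out) weight matrix.
    """
    rows = []
    for i in range(H.shape[0]):
        nbrs = neighbors.get(i, [])
        agg = H[nbrs].mean(axis=0) if nbrs else np.zeros(H.shape[1])
        rows.append(np.concatenate([agg, H[i]]) @ W)
    return np.maximum(np.array(rows), 0.0)


# Hypothetical star graph centered on node 0
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
neighbors = {0: [1, 2], 1: [0], 2: [0]}
W = np.vstack([np.eye(2), np.eye(2)])  # output = neighbor mean + own features
H_next = sage_mean_layer(H, neighbors, W)
print(H_next)
```

Because the layer depends only on a node's own features and those of its neighbors, the same weights can be applied to nodes that were not present during training, which is what makes the method inductive.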
Each model produces embeddings $z_i^t$ for node $v_i$ at time $t$. These embeddings form the temporal identity trajectory $\Gamma_i$.

3.4. Modeling Emergence and Role Dynamics

This section explores how identity emerges and evolves through the structural dynamics of the graph. Rather than assuming fixed roles, identity is inferred from observed patterns, such as communities, motifs, and role transitions, captured over time in interaction graphs.

3.4.1. Community Detection and Identity Modules

Communities reflect clusters of agents with dense intra-connectivity. Detecting such groups helps reveal identity modules, or the coherent social or functional roles that agents adopt within the network. For each graph snapshot $G_t$, apply modularity-based algorithms (e.g., Louvain or Leiden) to assign community labels $\{ c_i^t \}$ for all $v_i$.
Define the community persistence score for node $v_i$:
$$P_i = \frac{1}{T-1} \sum_{t=1}^{T-1} \mathbb{1}\left[ c_i^t = c_i^{t+1} \right]$$
where $\mathbb{1}[\cdot]$ is the indicator function. Higher $P_i$ implies stable group identity.
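The persistence score is the fraction of consecutive time slices in which an agent keeps the same community label. The sketch below assumes that labels have already been matched across snapshots, since modularity-based algorithms run independently on each snapshot do not guarantee that the same community receives the same label; the function name and toy label sequences are illustrative.

```python
def community_persistence(labels):
    """Fraction of consecutive slice pairs with an unchanged community label.

    labels: the agent's community label in each time slice, assumed to be
    aligned so that equal labels denote the same community over time.
    """
    if len(labels) < 2:
        raise ValueError("at least two time slices are required")
    unchanged = sum(1 for prev, nxt in zip(labels, labels[1:]) if prev == nxt)
    return unchanged / (len(labels) - 1)


# Hypothetical label sequences over time
stable_agent = [0, 0, 0, 1, 1]     # one switch in four transitions
volatile_agent = [0, 1, 0, 1]      # switches at every step

print(community_persistence(stable_agent))
print(community_persistence(volatile_agent))
```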

3.4.2. Motif Analysis and Role Structures

Motifs, frequent subgraph patterns, capture functional roles beyond node degree or centrality. Common motifs include triads, stars, and cliques. For each node, compute motif participation vectors:
$$m_i = [m_1, m_2, \dots, m_r]$$
where $m_k$ is the number of times node $v_i$ appears in motif type $k$. These vectors are used to distinguish structural roles, such as bridges, hubs, or isolates.
Motif participation is integrated with identity embeddings to enrich representation:
$$\tilde{z}_i^{t} = z_i^{t} \,\Vert\, m_i^{t}$$
where $\Vert$ denotes concatenation.
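To make the motif vector concrete, the sketch below counts two simple motif types per node, closed triangles and open wedges (two-edge stars centered on the node), and appends the counts to an embedding. The choice of these two motif types and the names `motif_vector` and `enrich` are assumptions for illustration; the paper's motif set may differ.

```python
from itertools import combinations


def motif_vector(node, adj):
    """Return [triangles through node, open wedges centered at node].

    adj maps each node to a set of neighbors in an undirected graph.
    """
    triangles = 0
    wedges = 0
    for u, w in combinations(sorted(adj[node]), 2):
        if w in adj[u]:
            triangles += 1  # u and w are also linked: closed triad
        else:
            wedges += 1  # u and w are only linked through node
    return [triangles, wedges]


def enrich(z, m):
    # Concatenate the embedding z with the motif counts m.
    return list(z) + [float(c) for c in m]
```

A node with many wedges and few triangles sits between otherwise unconnected neighbors, which is the structural signature of a bridge, while a node with no neighbor pairs at all is close to an isolate.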

3.4.3. Temporal Role Transitions

To capture identity emergence over time, analyze role transitions across time steps. Define a role mapping function $\rho : z_i^{t} \rightarrow r_i^{t}$, where $r_i^{t}$ is a discrete role label from clustering or classification. Construct a role transition matrix:
$$T_{ab} = \frac{1}{|\mathcal{V}|\,(T-1)} \sum_{i=1}^{|\mathcal{V}|} \sum_{t=1}^{T-1} \mathbb{1}\left(r_i^{t} = a \wedge r_i^{t+1} = b\right)$$
This matrix reveals dominant identity shifts (e.g., learner → contributor → leader). The emergence of stable or cyclic patterns reflects identity consolidation or transformation.
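The transition matrix can be sketched as follows, assuming every node has a role label at every time step and that roles are encoded as integers from 0 to `num_roles - 1`. The function name `role_transition_matrix` is hypothetical. Following the definition above, entries are joint frequencies normalized by the total number of node-transitions, so the whole matrix sums to one rather than each row.

```python
def role_transition_matrix(role_sequences, num_roles):
    """Joint frequency of moving from role a at time t to role b at t + 1.

    role_sequences holds one equal-length list of integer role labels per node.
    """
    n = len(role_sequences)
    T = len(role_sequences[0])
    counts = [[0.0] * num_roles for _ in range(num_roles)]
    for seq in role_sequences:
        for t in range(T - 1):
            counts[seq[t]][seq[t + 1]] += 1.0
    norm = n * (T - 1)
    return [[c / norm for c in row] for row in counts]
```

Large diagonal entries indicate roles that tend to persist, while large off-diagonal entries mark the dominant shifts between roles.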

4. Experimental Validation

4.1. Benchmark Datasets

The proposed framework is assessed using two benchmark datasets: the Reddit Interaction Network and the AMiner Co-authorship Network. They come from very different settings, online conversations and academic collaborations, which allows identity dynamics to be compared under short-term and long-term interaction conditions.
The Reddit Interaction Network consists of discussion data from open subreddits, where members engage in threaded discussions. The data span six months of activity, divided into 24 weekly time slices. Edge attributes are the number of replies and the discussion depth within a thread. Nodes represent users, with attributes covering activity level and subreddit identifiers. The data capture highly dynamic interactions and identity displays, which makes them suitable for studying short-term role development and change in decentralized digital conversations.
The AMiner Co-authorship Network is built from scholarly articles in the AMiner index. Nodes represent researchers, and undirected edges represent co-authorship on published papers. The time horizon runs from 1990 to 2020 in annual time units, giving 31 time slices. Node properties include the number of publications, disciplinary breadth, and diversity of fields.
To assess the generalizability of the proposed framework, the GitHub Interaction Network data [28] was incorporated. This dataset represents the temporal cooperation of developers in the form of pull requests, comments on issues, and contributions to a repository over a six-month period. Nodes represent developers, and edges represent the dynamic interactions between them, arranged into 24 biweekly snapshots. After preprocessing, the dataset has about 8200 nodes and 67,000 temporal edges. Compared with Reddit and AMiner, GitHub is characterized by mid-range temporal dynamics and organized professional relationships, providing a distinct setting in which to test the fit and robustness of the framework. Table 1 summarizes these datasets, which are converted into temporal graphs with labeled edges and nodes that convey the meaning of interactions, variety of content, and dynamic patterns required for identity modeling.

4.2. Experimental Setup and Results

This section outlines the data preparation steps and model training configurations used in validating the proposed identity modeling framework. The experiments are designed to ensure consistency across both datasets while preserving their respective structural and temporal characteristics.

4.2.1. Data Preprocessing Pipeline

Raw interaction logs from Reddit and AMiner were converted into time-sliced graphs. For Reddit, the dataset was divided into 24 temporal snapshots at weekly intervals. Reply links were detected in each comment, and directed edges were built between users. Edge attributes such as frequency, depth, and interaction order were assigned using mentions and thread hierarchies. Node features encoded the number of comments and subreddit participation at the user level. For AMiner, co-authorship records were aggregated by publication year, resulting in 31 graph snapshots spanning 1990 to 2020. Author metadata, such as the total number of publications, field entropy, and venue level, were incorporated into the node feature vectors. For every paper, undirected edges were created among all co-authors, and each edge was weighted by joint publishing frequency. Singleton nodes and disconnected components below a certain threshold were filtered out. All graphs in both datasets were degree-normalized, and node features were standardized using z-score normalization. The final graph format for each time step includes a node feature matrix $X_t$, adjacency matrices $A_t^{(\ell)}$ for $\ell = 1, \dots, L$, and optional motif counts and community labels for auxiliary tasks.
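Two of the preprocessing steps, time slicing and z-score normalization, can be sketched with the standard library alone. The edge format `(src, dst, timestamp)`, the function names, and the fallback of leaving constant columns at zero are assumptions for this illustration, not details from the authors' pipeline.

```python
from collections import defaultdict
from statistics import mean, pstdev


def slice_edges(edges, t0, width):
    """Group timestamped edges into fixed-width snapshots.

    edges is an iterable of (src, dst, timestamp) with timestamps >= t0;
    width is the slice length in the same units (e.g., seconds per week).
    """
    snapshots = defaultdict(list)
    for src, dst, ts in edges:
        snapshots[int((ts - t0) // width)].append((src, dst))
    return dict(snapshots)


def zscore_columns(X):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*X))
    mus = [mean(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 so it maps to zeros.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in X]
```

With `width` set to the number of seconds in a week, `slice_edges` yields the weekly snapshot index for each reply edge; using the publication year as the timestamp and `width=1` gives annual snapshots instead.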

4.2.2. Training Configurations

All models were trained using PyTorch Geometric version 2.4.0 with PyTorch 2.2.1 as the backend, with uniform architectural parameters across both datasets for fairness. The embedding dimension was fixed at $h = 128$, and GNNs were trained with two message-passing layers. For temporal consistency, models were trained independently on each graph snapshot using shared weights across time steps. The training objective minimized a node classification loss using community labels and a reconstruction loss for edge prediction. Optimization was performed using the Adam optimizer with a learning rate of 0.001, weight decay of $5 \times 10^{-4}$, and a batch size of 512. Early stopping was applied based on validation loss with a patience of 10 epochs. Each experiment was repeated five times with different random seeds, and the averaged results are reported. Graph Convolutional Networks (GCNs), Graph Attention Networks (GATs), and GraphSAGE were implemented as baseline variants. Each model was initialized using Xavier uniform initialization, with dropout set to 0.5 after each layer. Temporal identity trajectories were constructed post-training by aggregating node embeddings across time slices, allowing for the downstream evaluation of identity coherence, role persistence, and transition dynamics.
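The stated hyperparameters and the early-stopping rule can be collected in a small framework-independent sketch. The dictionary name `CONFIG`, its key names, and the `EarlyStopping` class are hypothetical; only the numeric values come from the configuration described above, and the actual training loop is omitted.

```python
CONFIG = {
    "embedding_dim": 128,
    "num_layers": 2,
    "learning_rate": 1e-3,
    "weight_decay": 5e-4,
    "batch_size": 512,
    "dropout": 0.5,
    "patience": 10,
    "num_seeds": 5,
}


class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=CONFIG["patience"], min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta  # assumed tolerance; not given in the text
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        # Returns True when training should stop.
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `step` would be called once per epoch with the current validation loss, and the loop would break as soon as it returns `True`.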

4.3. Evaluation Metrics

To assess the effectiveness of the proposed identity modeling framework, evaluation is conducted along two primary dimensions: identity coherence and role detection accuracy. These metrics capture both the temporal stability and structural interpretability of the learned representations across multiple time slices.

4.3.1. Identity Coherence Metrics

Identity coherence measures the consistency with which an agent’s structural representation evolves. This is quantified using temporal embedding similarity and trajectory variance.
Given a node $v_i$ and its time-indexed embeddings $\{z_i^{1}, z_i^{2}, \dots, z_i^{T}\}$, define the identity coherence score $C_i$ as the average cosine similarity between adjacent time steps:
$$C_i = \frac{1}{T-1} \sum_{t=1}^{T-1} \cos\left(z_i^{t}, z_i^{t+1}\right)$$
This score reflects whether the identity trajectory follows a stable path. A higher $C_i$ indicates that the agent maintains a consistent role or interaction pattern across time.
Additionally, the identity variance score is computed as:
$$V_i = \frac{1}{T} \sum_{t=1}^{T} \left\lVert z_i^{t} - \bar{z}_i \right\rVert_2^2, \quad \text{where } \bar{z}_i = \frac{1}{T} \sum_{t=1}^{T} z_i^{t}$$
This variance reflects how much an identity deviates from its temporal mean. Lower variance implies stronger identity convergence or role stability.
Global metrics are reported as averages over all nodes:
$$\mathrm{AvgCoherence} = \frac{1}{|\mathcal{V}|} \sum_{i=1}^{|\mathcal{V}|} C_i, \qquad \mathrm{AvgVariance} = \frac{1}{|\mathcal{V}|} \sum_{i=1}^{|\mathcal{V}|} V_i$$
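Both per-node scores can be sketched directly from their definitions. This example assumes non-zero embeddings (so cosine similarity is defined) and assumes the variance sums over all $T$ time steps around the temporal mean; the function names are hypothetical.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def coherence(trajectory):
    """Mean cosine similarity between embeddings at adjacent time steps."""
    T = len(trajectory)
    return sum(cosine(trajectory[t], trajectory[t + 1]) for t in range(T - 1)) / (T - 1)


def variance(trajectory):
    """Mean squared Euclidean distance from the temporal mean embedding."""
    T = len(trajectory)
    dim = len(trajectory[0])
    mean_vec = [sum(z[d] for z in trajectory) / T for d in range(dim)]
    return sum(
        sum((z[d] - mean_vec[d]) ** 2 for d in range(dim)) for z in trajectory
    ) / T


def average(scores):
    # Global metric: plain mean of a per-node score over all nodes.
    return sum(scores) / len(scores)
```

For the trajectory `[[1, 0], [1, 0], [0, 1]]`, the two adjacent similarities are 1 and 0, so the coherence is 0.5, and the variance around the mean `[2/3, 1/3]` is 4/9.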

4.3.2. Role Detection Accuracy

Role detection accuracy evaluates how well the learned identity embeddings align with structural or semantic roles inferred from ground-truth data. Community labels (for Reddit) and field-specific categories (for AMiner) are used as weak supervision targets. A multi-class classifier is trained on identity embeddings $z_i^{t}$ to predict the node’s structural role $r_i^{t}$. The classifier is implemented as a two-layer MLP, trained using cross-entropy loss. Accuracy, macro F1-score, and confusion matrices are used to evaluate performance. In addition to direct classification, unsupervised clustering (e.g., K-means or spectral clustering) is applied to assess whether role-like groupings naturally emerge from the identity space. Cluster assignments are compared to reference labels using Normalized Mutual Information (NMI) and Adjusted Rand Index (ARI). High classification accuracy and alignment in unsupervised metrics indicate that the identity embeddings encode interpretable role information, validating their use in both temporal reasoning and behavioral profiling.
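As a rough guide to how cluster assignments are compared with reference labels, the sketch below computes purity and NMI from scratch. It assumes NMI is normalized by the arithmetic mean of the two label entropies, which is one common convention; the paper does not state which normalization was used, and the function names are illustrative.

```python
import math
from collections import Counter


def purity(pred, true):
    """Share of nodes that carry the majority reference label of their cluster."""
    clusters = {}
    for p, t in zip(pred, true):
        clusters.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(true)


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(pred, true):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(true)
    joint = Counter(zip(pred, true))
    count_p, count_t = Counter(pred), Counter(true)
    mi = sum(
        (c / n) * math.log(c * n / (count_p[p] * count_t[t]))
        for (p, t), c in joint.items()
    )
    denom = (entropy(pred) + entropy(true)) / 2
    return mi / denom if denom > 0 else 1.0
```

Both scores ignore the actual label IDs, so a clustering that matches the reference partition up to renaming scores 1.0, while a single all-in-one cluster carries no information about the reference labels and has an NMI of 0.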

4.4. Quantitative Evaluation

The quantitative evaluation focuses on four aspects of the proposed framework: identity coherence, structural clustering quality, role classification accuracy, and temporal consistency of learned embeddings. All experiments were conducted on both the Reddit and AMiner datasets using GCN, GAT, and GraphSAGE models. The findings affirm the usefulness of network-based identity modeling and highlight the relative merits of the GNN designs. Table 2 reports identity coherence, defined as the average cosine similarity between node embeddings in successive time slices. On Reddit, GAT achieves the highest average coherence (0.714), ahead of GCN (0.682) and GraphSAGE (0.691). Coherence is higher on AMiner, where GAT reaches 0.854, reflecting the more enduring character of long-term academic collaboration. Standard deviations are also lower on AMiner, suggesting more uniform identity evolution.
Unsupervised community detection was performed using K-means clustering on the learned identity embeddings, and the resulting clusters were assessed in terms of modularity, purity, normalized mutual information (NMI), and adjusted Rand index (ARI) with respect to the known community labels. Table 3 shows the comparative results. GAT again performs best on the AMiner data, particularly in modularity (0.604) and purity (0.748), indicating that it maintains stable and semantically consistent identity structures in long-term collaboration networks. Performance on Reddit is lower overall because the network is dynamic and less structured, with user roles and interactions changing quickly. On the additional GitHub Interaction Network, GAT is similarly competitive in modularity (0.582) and purity (0.729), which indicates that the framework generalizes well across domains with different interaction semantics and temporal densities.
Supervised classification experiments, using community or research field labels as ground truth, are summarized in Table 4. GAT reaches the highest accuracy of 80.2% on AMiner and 71.5% on Reddit. Macro F1-scores are also higher across the board, suggesting that GAT preserves role-level differentiation more effectively than GCNs or GraphSAGE.
Temporal embedding consistency, computed as the cosine similarity between each time step and the mean trajectory vector, is detailed in Table 5. Consistency is significantly higher in AMiner due to its long-term interaction structure, with GAT again achieving the most stable trajectories (0.814 cosine similarity). Reddit shows more fluctuation, reflecting diverse interaction contexts across threads.
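The consistency measure described above differs from identity coherence in that each embedding is compared with the mean trajectory vector rather than with its temporal neighbor. A minimal sketch, assuming non-zero embeddings and a non-zero mean vector, with a hypothetical function name:

```python
import math


def consistency_to_mean(trajectory):
    """Average cosine similarity between each embedding and the mean vector."""
    T = len(trajectory)
    dim = len(trajectory[0])
    mean_vec = [sum(z[d] for z in trajectory) / T for d in range(dim)]
    mean_norm = math.sqrt(sum(m * m for m in mean_vec))
    total = 0.0
    for z in trajectory:
        dot = sum(a * b for a, b in zip(z, mean_vec))
        z_norm = math.sqrt(sum(a * a for a in z))
        total += dot / (z_norm * mean_norm)
    return total / T
```

A trajectory that alternates between two orthogonal directions scores about 0.707 here even though its adjacent-step coherence is 0, which shows that the two metrics capture different aspects of stability.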
Model convergence behavior is visualized in Figure 2, which shows loss curves across epochs for all three models. GAT converges faster and to a lower loss value than GCNs and GraphSAGE on both datasets, indicating better optimization dynamics.
Model performance is further summarized in Figure 3, which displays role classification accuracy as grouped bar charts. GAT achieves the highest accuracy in both datasets, confirming its robustness across domains.
Finally, identity evolution over time is illustrated in Figure 4 using a heatmap of role assignments for the top 100 authors in AMiner. Stable horizontal patterns suggest consistent roles, while fragmented or sloped bands indicate identity transitions, often aligned with entry into new research fields.
This evaluation confirms that the proposed framework, particularly when paired with GAT, captures coherent, structured, and temporally stable representations of agent identity. Subsequent sections further test the robustness of these results via ablation and sensitivity analyses.

4.5. Qualitative Analysis of Identity Evolution

To complement the quantitative results, several qualitative examples were analyzed to illustrate how the proposed framework captures meaningful patterns of identity transformation over time. Representative cases from the Reddit, AMiner, and GitHub datasets were selected based on stable participation across multiple time slices.
On Reddit, users commonly exhibit transitions such as question initiator → bridge participant → moderator-like role. These trajectories reflect gradual increases in interaction centrality and motif participation, as shown by higher attention weights assigned to discussion-bridging neighbors. In the AMiner co-authorship network, the model successfully traces early-career researchers who evolve into core collaborators and later domain experts. Their identity embeddings show increasing coherence and decreasing temporal variance, aligned with long-term professional specialization. Within the GitHub Interaction Network, the framework identifies developers transitioning from code contributor → reviewer → project maintainer. The attention distributions emphasize cross-project collaborations, revealing how relational roles stabilize as contributors gain authority.
These qualitative findings, visualized in Figure 5, highlight that the model not only achieves strong quantitative accuracy but also learns interpretable and context-aware representations of evolving identities across heterogeneous domains.

4.6. Ablation and Sensitivity Studies

To evaluate the resilience and interpretability of the identity modeling framework, a series of ablation and sensitivity studies was conducted. These experiments assess the impact of removing key architectural components, including temporal and motif layers, as well as the system’s stability under perturbations such as random edge dropout, multiplex flattening, and input noise injection.
The role of temporal modeling and motif encoding is examined first. As shown in Table 6, removing either layer degrades identity coherence, clustering quality (NMI), and classification accuracy. The full model achieves an identity coherence of 0.854 and 80.2% classification accuracy, whereas ablating the temporal layer reduces performance to 0.789 and 75.6%, respectively. The motif layer ablation yields intermediate results.
To test structural robustness, random edge dropout is introduced at varying levels (10%, 20%, 30%). Table 7 shows that identity variance increases with edge loss, especially for Reddit, which is structurally noisier. GAT shows the lowest increase in variance, indicating stronger structural generalization than GCNs and GraphSAGE.
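The perturbation itself can be sketched as uniform random removal of a fixed fraction of edges. The function name `drop_edges`, the seeded generator, and the rounding of the number of removed edges are assumptions for this example; the paper does not describe how the dropout was sampled.

```python
import random


def drop_edges(edges, rate, seed=0):
    """Return a copy of the edge list with a fraction `rate` removed at random.

    The relative order of the surviving edges is preserved.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)
    keep = len(edges) - round(rate * len(edges))
    kept_idx = sorted(rng.sample(range(len(edges)), keep))
    return [edges[i] for i in kept_idx]
```

Re-running the embedding and variance computation on `drop_edges(edges, 0.1)`, `drop_edges(edges, 0.2)`, and `drop_edges(edges, 0.3)` would give the kind of comparison summarized in Table 7.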
The temporal parameter-sharing approach reduced overall training time by approximately 70–76% and memory consumption by 30–40% across datasets, as presented in Table 8. Additionally, the temporal smoothness regularizer slightly improved identity coherence (+0.006–0.009) without increasing computational cost. The reduction in total epochs stems from reusing model weights across snapshots, eliminating redundant parameter initialization.
The third ablation targets the removal of multiplexity. Reddit’s structure includes multi-subreddit links between users. As seen in Table 9, flattening the network (i.e., removing inter-subreddit links) reduces modularity and role accuracy. Modularity drops from 0.468 to 0.393, and NMI from 0.637 to 0.574, highlighting that multiplex modeling is crucial for identifying users’ context-sensitive roles.
To summarize performance changes across all ablation types, a radar chart is provided in Figure 6. The full model dominates across all axes (modularity, accuracy, coherence, and NMI), while other configurations show radial shrinkage, especially when both motif and temporal layers are removed.
Sensitivity to input noise is further evaluated by injecting Gaussian noise ($\mu = 0$, $\sigma = 0.1$) into node features and monitoring entropy fluctuations in the learned embeddings. As illustrated in Figure 7, GAT with motifs yields the lowest median entropy and tightest spread, while GCN without motifs shows the widest variance.
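The noise injection step can be sketched as adding independent Gaussian perturbations to every entry of the node feature matrix. The function name `add_gaussian_noise` and the seeded generator are assumptions; the entropy measurement on the resulting embeddings is not shown because its exact definition is not given in the text.

```python
import random


def add_gaussian_noise(X, sigma=0.1, mu=0.0, seed=0):
    """Return a noisy copy of the feature matrix X (list of rows)."""
    rng = random.Random(seed)
    return [[x + rng.gauss(mu, sigma) for x in row] for row in X]
```

Because the input features are z-score normalized, $\sigma = 0.1$ corresponds to a perturbation of roughly one tenth of a standard deviation per feature.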
A comparison with state-of-the-art models is shown in Table 10, highlighting that while traditional embedding or rule-based methods lack temporal and structural depth, recent models like DySAT and T-GCN introduce some dynamics. However, only our proposed framework integrates structural motifs, temporal role shifts, and emergent identity dynamics in a unified model, yielding superior interpretability and accuracy on real-world datasets.
The experimental results confirm the effectiveness of the proposed framework in capturing dynamic, emergent identities within complex interaction networks. Across both Reddit and AMiner datasets, the model demonstrates superior identity coherence, role classification accuracy, and temporal consistency compared to baseline methods. The ablation studies highlight the critical role of temporal and motif-based components, while emergent identity analysis reveals meaningful patterns of stability, transition, and influence. These findings validate the central hypothesis that AI identity is not static but arises through structured and evolving interactions.

5. Conclusions and Future Work

This study presented the Identity Evolution Modeling Framework (IEMF), an unsupervised graph-based approach for analyzing and predicting the evolution of artificial intelligence identities in dynamic interaction networks. The framework combines motif-aware message passing, temporal parameter sharing, and identity coherence metrics to capture effectively how structural and relational patterns change over time. Experiments conducted on three distinct datasets, namely the Reddit Interaction Network, the AMiner Co-authorship Network, and the GitHub Interaction Network, demonstrated the adaptability of the proposed model across social, academic, and professional collaboration domains. Among the evaluated architectures, the Graph Attention Network achieved the best performance due to its ability to assign context-sensitive weights to neighboring nodes, resulting in stable and interpretable identity embeddings.
Temporal parameter sharing, combined with temporal regularization, significantly reduced computational costs by lowering both training time and memory usage while preserving identity coherence and clustering accuracy. Qualitative analyses further illustrated interpretable identity trajectories, such as the transition from an initiator to a moderator on Reddit, from an early-career researcher to an expert in AMiner, and from a contributor to a maintainer on GitHub. These results confirm that the proposed framework effectively captures meaningful patterns of behavioral transformation over time.
Future research will focus on enhancing the framework by integrating content-level semantic features through advanced language representation models in order to complement structural identity modeling and improve interpretability. Ethical and social aspects will also receive greater attention, including privacy preservation, fairness, and responsible deployment of identity modeling systems. The planned incorporation of privacy-preserving learning methods, fairness-aware optimization, and explainable artificial intelligence techniques will help ensure that the framework adheres to the principles of trustworthy and ethical artificial intelligence. Collectively, these future directions aim to establish a comprehensive and responsible foundation for modeling identity evolution across dynamic and diverse environments.

Author Contributions

Conceptualization, Y.L., R.M.M. and R.V.; methodology, Y.L. and R.M.M.; validation, Y.L. and R.V.; writing—original draft preparation, Y.L., R.M.M. and R.V.; writing—review and editing, R.V.; supervision, R.M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

To ensure full replicability and transparency, all code, random seeds, and detailed preprocessing instructions are publicly available at https://github.com/p120626-design/Modeling-the-Evolution-of-AI-Identity-Using-Structural-Features-and-Temporal.git (accessed on 9 October 2025).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Borivoje, B. Decoding identity and representation in the age of AI. Megatrend Rev. 2023, 20, 141.
  2. Lynn, C.W.; Bassett, D.S. The physics of brain network structure, function and control. Nat. Rev. Phys. 2019, 1, 318–332.
  3. De Domenico, M. Multilayer modeling and analysis of human brain networks. Giga Sci. 2017, 6, gix004.
  4. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1025–1035.
  5. Jurafsky, D.; Chai, J.; Schluter, N.; Tetreault, J. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online, 5–10 July 2020.
  6. Min, C.; Zhao, Y.; Bu, Y.; Ding, Y.; Wagner, C.S. Has China caught up to the US in AI research? An exploration of mimetic isomorphism as a model for late industrializers. arXiv 2023, arXiv:2307.10198.
  7. Giouroukelis, M.; Papagianni, S.; Tzivellou, N.; Vlahogianni, E.I.; Golias, J.C. Modeling the effects of the governmental responses to COVID-19 on transit demand: The case of Athens, Greece. Case Stud. Transp. Policy 2022, 10, 1069–1077.
  8. Smaldino, P.; Pickett, C.; Sherman, J.; Schank, J. An agent-based model of social identity dynamics. J. Artif. Soc. Soc. Simul. 2012, 15, 127–143.
  9. Miritello, G. Temporal Patterns of Communication in Social Networks; Springer Science & Business Media: Berlin, Germany, 2013.
  10. Ross, S.; Munoz, D.; Hebert, M.; Bagnell, J.A. Learning message-passing inference machines for structured prediction. In Proceedings of the CVPR 2011, Colorado Springs, CO, USA, 21–23 June 2011; pp. 2737–2744.
  11. Wang, Y.; Chen, S.; Chen, G.; Shurberg, E.; Liu, H.; Hong, P. Motif-based graph representation learning with application to chemical molecules. Informatics 2023, 10, 8.
  12. Ghosh, A.; Dhebar, Y.; Guha, R.; Deb, K.; Nageshrao, S.; Zhu, L.; Tseng, E.; Filev, D. Interpretable AI agent through nonlinear decision trees for lane change problem. In Proceedings of the 2021 IEEE Symposium Series on Computational Intelligence (SSCI), Orlando, FL, USA, 20 August 2021; pp. 1–8.
  13. Quaranta, G.; Lacarbonara, W.; Masri, S.F. A review on computational intelligence for the identification of nonlinear dynamical systems. Nonlinear Dyn. 2020, 99, 1709–1761.
  14. Xu, Y. Learning with Conversational Agents; University of California Irvine: Irvine, CA, USA, 2020.
  15. Castelfranchi, C. Modelling social action for AI agents. Artif. Intell. 1998, 103, 157–182.
  16. Boccaletti, S.; Bianconi, G.; Criado, R.; Del Genio, C.I.; Gómez-Gardenes, J.; Romance, M.; Sendina-Nadal, I.; Wang, Z.; Zanin, M. The structure and dynamics of multilayer networks. Phys. Rep. 2014, 544, 1–122.
  17. Ucer, S.; Ozyer, T.; Alhajj, R. Explainable artificial intelligence through graph theory by generalized social network analysis-based classifier. Sci. Rep. 2022, 12, 15210.
  18. Mitra, A.; Paul, S. Analyzing social networks with dynamic graphs: Unravelling the ever-evolving connections. In Applied Graph Data Science; Elsevier: Amsterdam, The Netherlands, 2025; pp. 195–214.
  19. Xu, Y.; Wang, L.; Wang, Y.; Fu, Y. Adaptive trajectory prediction via transferable GNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 6520–6531.
  20. Jiang, S.; Huang, Z.; Luo, X.; Sun, Y. CF-GODE: Continuous-time causal inference for multi-agent dynamical systems. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 997–1009.
  21. Sankar, A.; Wang, J.; Krishnan, A.; Sundaram, H. Self-supervised role learning for graph neural networks. Knowl. Inf. Syst. 2022, 64, 2091–2121.
  22. Deng, L.; Zhao, Y.; Chen, J.; Liu, S.; Xia, Y.; Zheng, K. Learning to hash for trajectory similarity computation and search. In Proceedings of the 2024 IEEE 40th International Conference on Data Engineering (ICDE), Utrecht, The Netherlands, 13–16 May 2024; pp. 4491–4503.
  23. Polat, C.; Tuncel, M.; Kurban, M.; Serpedin, E.; Kurban, H. xchemagents: Agentic AI for explainable quantum chemistry. arXiv 2025, arXiv:2505.20574.
  24. Custode, L.L.; Iacca, G. Social interpretable reinforcement learning. In Proceedings of the International Conference on the Applications of Evolutionary Computation (Part of EvoStar), Trieste, Italy, 23–25 April 2025; pp. 3–19.
  25. Kazemi, S.M.; Goel, R.; Jain, K.; Kobyzev, I.; Sethi, A.; Forsyth, P.; Poupart, P. Representation learning for dynamic graphs: A survey. J. Mach. Learn. Res. 2020, 21, 2648–2720.
  26. Kaggle. The Reddit Dataset; Kaggle: San Francisco, CA, USA, 2022.
  27. Mader, K.S. AMiner Academic Citation Dataset; Kaggle: San Francisco, CA, USA, 2018.
  28. Fu, D.; He, J. GitHub Interaction Network Dataset, GitHub. 2022. Available online: https://github.com/DongqiFu/DPPIN (accessed on 23 March 2025).
  29. Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Deng, M.; Li, H. T-GCN: A temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. 2019, 21, 3848–3858.
Figure 1. Methodological Pipeline—From Raw Data to Identity Graph Inference.
Figure 2. Model Convergence—Loss vs. Epoch.
Figure 3. Role Accuracy Bar Chart (GCNs, GAT, GraphSAGE across Reddit and AMiner).
Figure 4. Heatmap of Identity Roles Over Time (AMiner).
Figure 5. Qualitative examples of identity evolution across datasets: (a) Reddit—transition from initiator to moderator-like user, (b) AMiner—evolution from early-career researcher to senior collaborator, and (c) GitHub—progression from code contributor to project maintainer.
Figure 6. Radar Chart—Metric Shifts from Ablations.
Figure 7. Box Plot of Entropy Fluctuation Due to Parameter Noise.
Table 1. Dataset Characteristics and Statistics.

| Property | Reddit Interaction Network | AMiner Co-Authorship Network | GitHub Interaction Network |
|---|---|---|---|
| Domain | Social media discussions | Academic collaboration | Software development and collaboration |
| Time Range | 6 months, 2022 | 30 years, 1990–2020 | 6 months, 2023 |
| Time Granularity | 7 days | 1 year | 14 days |
| Number of Time Slices | 24 | 31 | 24 |
| Number of Nodes (V) | 12,460 | 21,780 | 8200 |
| Number of Edges (E) | 1,634,200 | 2,957,842 | 67,000 |
| Average Degree | 13.1 | 9.3 | 8.1 |
| Node Attributes | Activity metrics, subreddit ID | Paper count, field diversity | Repository participation, contribution count |
| Edge Attributes | Reply frequency, thread depth | Publication type, venue tier | Pull requests, issue comments, and co-commit frequency |
| Interaction Types | Replies, mentions | Co-authorships | Code reviews, issue discussions, and collaborations |
| Graph Type | Temporal directed multigraph | Temporal undirected graph | Temporal directed multigraph |
| Avg. Motif Participation Rate | High, triadic interactions | Medium, cliques, and star structures | Moderate, collaboration loops and review triads |

Table 2. Identity Coherence Scores (Cosine Similarity of Adjacent Embeddings).

| Dataset | Model | Avg. Coherence | Std. Dev. | Max | Min |
|---|---|---|---|---|---|
| Reddit | GCN | 0.682 | 0.103 | 0.917 | 0.456 |
| Reddit | GAT | 0.714 | 0.097 | 0.935 | 0.488 |
| Reddit | GraphSAGE | 0.691 | 0.099 | 0.912 | 0.467 |
| AMiner | GCN | 0.832 | 0.076 | 0.961 | 0.638 |
| AMiner | GAT | 0.854 | 0.071 | 0.973 | 0.653 |
| AMiner | GraphSAGE | 0.841 | 0.073 | 0.964 | 0.629 |
| GitHub | GCN | 0.826 | 0.081 | 0.947 | 0.619 |
| GitHub | GAT | 0.842 | 0.078 | 0.958 | 0.631 |
| GitHub | GraphSAGE | 0.833 | 0.080 | 0.951 | 0.624 |

Table 3. Cluster Modularity and Purity Comparison.
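
| Dataset | Model | Modularity | Purity | NMI | ARI |
|---|---|---|---|---|---|
| Reddit | GCN | 0.412 | 0.652 | 0.603 | 0.541 |
| Reddit | GAT | 0.468 | 0.681 | 0.637 | 0.572 |
| Reddit | GraphSAGE | 0.435 | 0.664 | 0.615 | 0.558 |
| AMiner | GCN | 0.578 | 0.721 | 0.689 | 0.663 |
| AMiner | GAT | 0.604 | 0.748 | 0.713 | 0.685 |
| AMiner | GraphSAGE | 0.591 | 0.736 | 0.701 | 0.674 |
| GitHub | GCN | 0.563 | 0.712 | 0.676 | 0.648 |
| GitHub | GAT | 0.582 | 0.729 | 0.692 | 0.661 |
| GitHub | GraphSAGE | 0.571 | 0.718 | 0.684 | 0.654 |
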
Table 4. Role Classification Accuracy (%).
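
| Dataset | Model | Accuracy | Macro F1 | Precision | Recall |
|---|---|---|---|---|---|
| Reddit | GCN | 68.1 | 66.2 | 67.5 | 65.1 |
| Reddit | GAT | 71.5 | 69.6 | 70.9 | 68.3 |
| Reddit | GraphSAGE | 69.4 | 67.1 | 68.1 | 66.2 |
| AMiner | GCN | 76.8 | 74.2 | 75.1 | 73.5 |
| AMiner | GAT | 80.2 | 77.9 | 78.8 | 77.2 |
| AMiner | GraphSAGE | 78.5 | 75.6 | 76.7 | 74.9 |
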
Table 5. Embedding Consistency Across Time (with respect to Mean Embedding).

| Dataset | Model | Avg. Cosine Sim | Std. Dev. | Max | Min |
|---|---|---|---|---|---|
| Reddit | GCN | 0.619 | 0.114 | 0.902 | 0.391 |
| Reddit | GAT | 0.642 | 0.108 | 0.921 | 0.418 |
| Reddit | GraphSAGE | 0.634 | 0.111 | 0.913 | 0.405 |
| AMiner | GCN | 0.791 | 0.084 | 0.950 | 0.588 |
| AMiner | GAT | 0.814 | 0.078 | 0.963 | 0.612 |
| AMiner | GraphSAGE | 0.802 | 0.081 | 0.957 | 0.601 |

Table 6. Performance After Removing Temporal/Motif Layers.

| Configuration | Identity Coherence | Role Accuracy (%) | NMI | Avg. Variance |
|---|---|---|---|---|
| Full Model | 0.854 | 80.2 | 0.713 | 0.072 |
| w/o Temporal Layer | 0.789 | 75.6 | 0.654 | 0.105 |
| w/o Motif Layer | 0.801 | 77.1 | 0.667 | 0.089 |
| w/o Both (Baseline GCNs only) | 0.742 | 72.8 | 0.611 | 0.118 |

Table 7. Identity Variance Under Random Edge Dropout.
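
| Dataset | Model | Dropout (%) | Avg. Variance | Std. Dev. |
|---|---|---|---|---|
| Reddit | GCN | 10 | 0.091 | 0.015 |
| Reddit | GCN | 30 | 0.138 | 0.024 |
| Reddit | GAT | 10 | 0.072 | 0.013 |
| Reddit | GAT | 30 | 0.106 | 0.019 |
| AMiner | GCN | 10 | 0.056 | 0.008 |
| AMiner | GAT | 10 | 0.045 | 0.007 |
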
Table 8. Efficiency Evaluation: Runtime and Memory Analysis Before and After Temporal Parameter Sharing.
| Dataset | Training Strategy | Epochs Per Slice | Total Epochs | Avg. Training Time (Per Epoch) | Total Runtime (h) | Peak GPU Memory (GB) | Temporal Consistency (ΔCoherence) | Storage Usage (GB) |
|---|---|---|---|---|---|---|---|---|
| Reddit | Independent per-slice training | 50 | 1200 | 38 s | 12.7 | 6.4 | | 9.8 |
| Reddit | Shared-weights + temporal regularization | 50 | 300 | 36 s | 3.1 | 4.2 | +0.007 | 3.4 |
| AMiner | Independent per-slice training | 80 | 2480 | 54 s | 37.2 | 8.9 | | 15.6 |
| AMiner | Shared-weights + temporal regularization | 80 | 620 | 50 s | 9.1 | 6.3 | +0.009 | 6.8 |
| GitHub | Independent per-slice training | 60 | 1440 | 41 s | 16.4 | 7.2 | | 11.2 |
| GitHub | Shared-weights + temporal regularization | 60 | 360 | 39 s | 4.1 | 5.0 | +0.006 | 4.5 |
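The runtime column of Table 8 can be cross-checked against the epoch counts and per-epoch times in the same rows. The short sketch below multiplies total epochs by the average time per epoch, converts to hours, and derives the speedup of the shared-weights strategy from the reported runtimes. All numbers are copied from Table 8; the variable and function names are illustrative, and attributing any remaining gap to overhead outside the training epochs would be an assumption, since the table does not explain it.

```python
def estimated_hours(total_epochs, seconds_per_epoch):
    """Total runtime implied by epoch count and average per-epoch time."""
    return total_epochs * seconds_per_epoch / 3600.0


# (dataset, strategy, total epochs, seconds per epoch, reported runtime in hours)
table8_rows = [
    ("Reddit", "independent", 1200, 38, 12.7),
    ("Reddit", "shared", 300, 36, 3.1),
    ("AMiner", "independent", 2480, 54, 37.2),
    ("AMiner", "shared", 620, 50, 9.1),
    ("GitHub", "independent", 1440, 41, 16.4),
    ("GitHub", "shared", 360, 39, 4.1),
]

# Runtime implied by the epoch columns, keyed by (dataset, strategy).
implied = {
    (dataset, strategy): estimated_hours(epochs, secs)
    for dataset, strategy, epochs, secs, _ in table8_rows
}

# Reported runtimes, keyed the same way, and the resulting speedup per dataset.
reported = {(d, s): hours for d, s, _, _, hours in table8_rows}
speedup = {
    dataset: reported[(dataset, "independent")] / reported[(dataset, "shared")]
    for dataset in ("Reddit", "AMiner", "GitHub")
}

for dataset, factor in speedup.items():
    print(f"{dataset}: shared-weights training is about {factor:.1f}x faster")
```

For the independent per-slice rows, the implied runtime agrees with the reported value to one decimal place. For the shared-weights rows the implied runtime is somewhat lower than reported (roughly 3.0, 8.6, and 3.9 h), and the speedup from the reported runtimes is about fourfold on all three datasets.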
Table 9. Impact of Multiplex Removal on Reddit.
| Configuration | Modularity | Role Accuracy (%) | Identity Coherence | NMI |
|---|---|---|---|---|
| Full Multiplex | 0.468 | 71.5 | 0.714 | 0.637 |
| No Inter-subreddit Edges | 0.393 | 66.9 | 0.663 | 0.574 |
Table 10. Comparison with State-of-the-Art Identity Modeling Approaches.
| Model/Method | Identity Type Modeled | Temporal Modeling | Structural Awareness | Interpretability | Dataset Used | Accuracy/Metric |
|---|---|---|---|---|---|---|
| Rule-Based Agents [13] | Symbolic, fixed roles | ☒ No | ☒ None | ☑ High | Simulated dialogues | N/A |
| User2Vec [15] | Latent vector embeddings | Implicit (via logs) | ☒ None | ☒ Low | E-commerce, forums | Hit@10: 34.5% |
| DeepWalk/Node2Vec [21] | Structural identity (static) | ☒ No | ☑ Local neighborhoods | Medium | Citation networks | Role Clustering F1: 62% |
| Role2Vec [22] | Structural roles (static) | ☒ No | ☑ Motif-aware | ☑ Moderate | Social networks | Modularity: 0.38 |
| Temporal GCN (T-GCN) [29] | Temporal embeddings | ☑ Yes | ☑ Moderate | Low | Traffic, social data | MAE ↓: 2.31 |
| DySAT [25] | Dynamic graph embeddings | ☑ Yes | ☑ Structural + Temporal | Medium | Social & co-author data | Accuracy: 77.1% |
| Proposed Model (GNN+Motif+Temporal) | Emergent identity roles | ☑ Full | ☑ High (motifs, communities) | ☑ Moderate–High | Reddit, AMiner | Role Accuracy: 80.2%, Coherence: 0.854 |
☑ indicates the presence of the feature; ☒ indicates its absence; and ↓ denotes a lower or reduced value.