Toward Verification of DAG-Based Distributed Ledger Technologies through Discrete-Event Simulation

As the potential of directed acyclic graph (DAG)-based distributed ledgers in IoT systems unfolds, a need arises to understand their intricate dynamics in real-world scenarios. It is well known that discrete event simulations can provide high-fidelity evaluations of protocols. However, there is a lack of public discrete event simulators capable of assessing DAG-based distributed ledgers. In this paper, a discrete-event-based distributed ledger simulator is introduced, with which we investigate a custom Python-based implementation of IOTA’s Tangle DAG protocol. The study reveals the dynamics of Tangle (particularly Poisson processes in transaction dynamics), the efficiency and intricacies of the random walk in Tangle, and the quantitative assessment of node convergence. Furthermore, the research underscores the significance of weight updates without depth limitations and provides insights into the role, challenges, and implications of the coordinator/validator in DAG architectures. The results are striking, and although the findings are reported only for Tangle, they demonstrate the need for adaptable and versatile discrete event simulators for DAG architectures and tip selection methodologies in general.


Introduction
The Internet of Things (IoT) is ushering in a new era characterized by the interconnectivity of billions of devices. With this vast network comes the critical challenge of ensuring data integrity and decentralization. While blockchain technology has been lauded for its integrity, decentralization, and auditability, its linear data structure has shown potential limitations in meeting the high transactional demands of IoT systems.
DAG-based distributed ledgers represent a significant evolution from traditional blockchain technology. Unlike blockchains, which operate on a linear sequence of blocks starting from a genesis block, DAGs expand in a non-linear, multi-dimensional manner. This structure allows multiple blocks to be added to the network simultaneously, leveraging the DAG data structure, where each node can represent a block or a single transaction, depending on the specific architecture of the ledger in question. This capability for parallel transaction processing markedly enhances scalability and makes DAG-based systems particularly well suited for high-volume transaction environments, such as those found in the Internet of Things.
DAG-based ledgers like Phantom, IOTA [1], Nano [2], and Hashgraph [3] have showcased the potential of this technology by facilitating faster transaction speeds and lower operational costs, thereby addressing some of the scalability and efficiency challenges faced by conventional blockchains. However, it is crucial to acknowledge the limitations and challenges that DAG-based systems encounter, including network security and consensus. When it comes to DAG-based blockchain systems, such efforts are notably scarce in the literature, highlighting a gap in research and development for these types of distributed ledgers.
The article [6] presents a simulation framework for the study of DAG-based cryptocurrencies, specifically focusing on IOTA. This framework models how transactions occur and are accepted in such systems by simulating the behavior of both honest and semi-honest actors. The study finds that agents (or nodes) in the network with low latency and high connectivity have a better chance of having their transactions accepted. The framework has been designed with extensibility in mind, allowing for the inclusion of other DAG-based protocols and the potential addition of malicious agents in the future. However, while this is a commendable effort, the approach relies solely on a thread-based implementation. As outlined in Section 3.1, it is limited to utilizing eight threads due to system constraints. In contrast, the methodology presented in our work encompasses not only a thread-based implementation but also asynchronous tasks. Therefore, if researchers have access to advanced supercomputers like NCI Gadi, our approach can be expanded to multi-CPU configurations, enabling the generation of more realistic and comprehensive results.
Another study [14] investigates the performance and scalability of IOTA. Using an extended version of the DAGsim simulator [6], the study delves into factors such as the transaction arrival rate, tip selection algorithms, and network delay to provide insights into IOTA's performance. Further, another DAG-based simulator, TangleSimulator, has been proposed [15], focusing on the stability of tip counts against various tip selection methods. An additional study [16] builds upon the TangleSimulator, offering enhanced configuration options and a larger number of transactions. This work illustrates and examines the fundamental components of Tangle. However, similar to the study in [6], it also demonstrates smaller configuration parameters. Another study [7] introduces an agent-based simulator focusing on the performance of the Tangle 2.0 protocol under various network environments and attack scenarios. This study provides a comprehensive understanding of Tangle, yet it focuses exclusively on IOTA. Furthermore, MAIOTASim [17] also proposes a multi-agent IOTA simulator, providing the security verification of consensus under double spending attack scenarios. However, this work also only focuses on the IOTA protocol, unlike our proposed simulator, designed with generic components for overall DAG-based distributed ledgers.
A notable work by [18,19] provides a comprehensive empirical analysis of IOTA's Tangle using real transaction data officially released by the IOTA Foundation. Their study showcases Tangle's topological features and observed performance, contrasting it with the prevailing literature's conclusions. Specifically, they shed light on the actual transaction confirmation time and the topological characteristics of real IOTA tangles, which differ from commonly held beliefs about IOTA's efficiency compared to traditional blockchains. Guo et al.'s [18] analysis illuminates certain latency issues with Tangle, highlighting that the actual transaction confirmation time may not be as efficient as once assumed. This work is instrumental in presenting an empirical perspective on Tangle, emphasizing the importance of real-world data in evaluating decentralized platforms. However, while their work offers insights into Tangle's empirical behavior, our study examines the practicality of the IOTA protocol. Our research focuses on the finest discrete events occurring in Tangle, providing a more holistic view of the IOTA protocol's behavior in high-fidelity simulation environments, which can serve as a baseline for future investigations into DAG-based distributed ledgers.
In summary, the research landscape on IOTA's Tangle and DAG-based distributed ledgers has seen a spectrum of investigations, ranging from empirical analyses to simulation-based studies. While these studies have provided invaluable insights into the workings and challenges of the IOTA protocol, there remains substantial scope for enhanced methodologies that transcend traditional approaches. This research endeavors to fill this gap, offering a novel perspective and approach that further elevates the understanding of the Tangle protocol. A comparison of the DAG-based simulators is presented in Table 1.

Simulator Implementation with Tangle Protocol
The following section delineates the step-by-step process followed to develop the simulator curated with the Tangle protocol, as presented in its white paper [1]. This hands-on approach aimed to replicate the theoretical underpinnings in a real-world environment to ensure the practical applicability and operability of Tangle.

Environmental Setup
The implementation of the simulator and Tangle protocol was constructed using Python, chosen for its ease of use, clear syntax, and comprehensive libraries that align with the specific needs of distributed ledger technology. By leveraging Python's asyncio [20] and threading [21] libraries, asynchronous network communication between nodes was effectively handled, ensuring smooth and concurrent node connections.
Python's (3.10) hashlib and RSA [22] libraries, which provide cryptographic hash functions and signatures, were utilized to create unique and secure transaction identifiers. For managing and operating on the DAG structure, data handling libraries like pandas and NumPy [23] were employed, simplifying the tasks of data manipulation and processing.
For the visual analysis and debugging of the DAG, NetworkX [24], a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks, was used. In conjunction, matplotlib [25] and Graphviz [26] provided a flexible method for the visualization of the data, offering insights into the state of the network at any point and aiding in the debugging and optimization processes.

Tangle's Core Structure
Tangle is the foundational data structure for IOTA. Unlike traditional blockchains that operate in linear chains of blocks, Tangle is built on a DAG data structure. Each vertex in this DAG represents a transaction, as presented in Figure 1. The structure allows multiple transactions to be added simultaneously; there is no need for miners as the transactions are initially confirmed by the new incoming transaction, ensuring zero transaction fees. This setup is vital for the scalability and micro-transactions on the IOTA network, especially pertinent for IoT applications. Implementing Tangle required the creation of a representation of a DAG in the system. Each transaction or vertex in Tangle contained essential data such as its ID, the IDs of the two transactions that it confirmed (previous vertices), and other transaction-related information. To represent Tangle in this study, a model was developed with the following components.

DAG Representation
The DAG configuration of the network was developed utilizing the NetworkX library. Individual graph representations were generated for each node in the network, offering a localized view of Tangle from different nodes' perspectives.

Transaction Management
The transaction class handled transactions within the network, encapsulating the characteristics and behaviors of individual transactions. Each transaction had unique identifiers, parent references, an associated node, an accumulated weight, and a timestamp. The class ensured the authenticity and integrity of transaction data through signature mechanisms using the RSA algorithm. An important part of the process was checking that each transaction had performed a certain amount of computational work, known as proof of work. Furthermore, every transaction was assigned its own weight (OW) and accumulative weight. Moreover, a confirmation status flag was also initiated to indicate the transaction's confirmation status, which would later be used in coordinator validation.
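As a minimal sketch of such a transaction record, the following Python fragment models the fields listed above and a hash-based proof-of-work check. The field names, the `solve_pow` and `has_valid_pow` helpers, and the leading-zero difficulty rule are illustrative assumptions rather than the simulator's actual code; RSA signing is omitted to keep the example within the standard library.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Illustrative transaction record; all field names are assumptions."""
    txid: str
    parents: tuple                 # txids of the two referenced transactions
    node_id: str                   # issuing node
    data: str = ""
    nonce: int = 0
    own_weight: int = 1            # OW, fixed to 1 for simplicity
    accumulative_weight: int = 1   # starts at OW and grows with approvers
    confirmed: bool = False        # later set during coordinator validation
    timestamp: float = field(default_factory=time.time)

    def pow_digest(self) -> str:
        # Hash the fields covered by the proof of work together with the nonce.
        payload = f"{self.txid}|{','.join(self.parents)}|{self.data}|{self.nonce}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def solve_pow(tx: Transaction, difficulty: int = 2) -> Transaction:
    """Increment the nonce until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    while not tx.pow_digest().startswith(prefix):
        tx.nonce += 1
    return tx


def has_valid_pow(tx: Transaction, difficulty: int = 2) -> bool:
    """Check that the stored nonce satisfies the assumed difficulty rule."""
    return tx.pow_digest().startswith("0" * difficulty)
```

A receiving node would call `has_valid_pow` before accepting the transaction; a real implementation would additionally verify the RSA signature.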

Accumulative Weight
In Tangle's architecture, the accumulative weight is a very important metric. This weight is the sum of the transaction's OW (which is set to 1 for simplicity) and the OW of all transactions that directly or indirectly reference it. The system adopted a topological sequence for transaction processing by prioritizing parent transactions over their respective descendants.
The accumulative weight of a transaction was dynamically updated during its processing. Let $W_x$ be the own weight of transaction x, $C_x$ be its set of child transactions, and $AW_x$ be its accumulative weight. The relationship between these parameters is given by
$$AW_x = W_x + \sum_{y \in C_x} AW_y \quad (1)$$
This equation signifies that the accumulative weight of a transaction is the sum of its own weight and the accumulative weights of its child transactions [1] (Section 2). Transactions without children have their AW equal to their OW, while, for those with children, their children's accumulative weights contribute to the parent's overall weight. This approach to computing the accumulative weight ensures that the metric reflects a transaction's relative importance and influence within Tangle.
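The weight computation can be sketched in a few lines of Python. Because a transaction in a DAG can be reached through more than one child, this sketch counts each direct or indirect approver once by means of a visited set, following the verbal definition (own weight plus the own weights of all referencing transactions). This is one reading of the definition; the function name and the `parents` mapping are illustrative and not taken from the simulator.

```python
from collections import defaultdict


def accumulative_weights(parents, own_weight=None):
    """Return {txid: accumulative weight}.

    parents: dict mapping each txid to the txids it approves (its parents).
    own_weight: optional dict of own weights; missing entries default to 1.
    """
    own_weight = own_weight or {}
    # children[x] holds the transactions that directly approve x.
    children = defaultdict(set)
    for tx, approved in parents.items():
        for parent in approved:
            children[parent].add(tx)

    weights = {}
    for tx in parents:
        # Depth-first search over approvers, visiting each one only once.
        seen, stack = set(), [tx]
        while stack:
            current = stack.pop()
            for child in children[current]:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        weights[tx] = own_weight.get(tx, 1) + sum(own_weight.get(a, 1) for a in seen)
    return weights


# Genesis g is approved by a and b; c approves both a and b.
example = {"g": [], "a": ["g"], "b": ["g"], "c": ["a", "b"]}
print(accumulative_weights(example))
```

In the example, c is reachable from g through both a and b but contributes to the weight of g only once.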

Asynchronous Simulation
In the simulation, we mimicked how a node works in the network, especially how it continuously adds transactions following a Poisson distribution. We used an asynchronous model for this, which captured the network's nature of handling many operations at once.
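A minimal sketch of this asynchronous model, assuming each node is an asyncio task that draws exponentially distributed gaps between transactions (which yields a Poisson process), is shown below. The names `issue_transactions` and `run_nodes`, the `time_scale` parameter, and the shared event log are hypothetical and serve only to illustrate the idea.

```python
import asyncio
import random


async def issue_transactions(node_id, rate, horizon, log, rng, time_scale=0.0):
    """Issue transactions as a Poisson process with `rate` events per simulated second.

    Gaps between events are exponential with mean 1/rate. `time_scale` converts
    simulated seconds into real sleep time; 0 only yields to the event loop.
    """
    now = 0.0
    while True:
        gap = rng.expovariate(rate)
        if now + gap > horizon:
            break
        now += gap
        await asyncio.sleep(gap * time_scale)
        log.append((now, node_id))


async def run_nodes(rates, horizon, seed=0):
    """Run one issuing task per node concurrently and return the merged event log."""
    log = []
    tasks = [
        issue_transactions(node_id, rate, horizon, log, random.Random(seed + i))
        for i, (node_id, rate) in enumerate(rates.items())
    ]
    await asyncio.gather(*tasks)
    return sorted(log)


events = asyncio.run(run_nodes({"n0": 0.5, "n1": 1.0}, horizon=50.0, seed=1))
print(len(events), "transactions issued")
```

Each node receives its own seeded random generator so that runs are reproducible regardless of how the event loop interleaves the tasks.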

Coordinator
In Tangle, the coordinator emerges as a central figure, ensuring transactional integrity. In the simulation, the coordinator was initiated with specific parameters, such as the 'milestones_interval' and ID, which was configured with all nodes so that they could verify that the milestone was issued by a legitimate coordinator. The coordinator's primary responsibility was to create the foundational 'genesis_milestone', a unique transaction without predecessors that acts as a root transaction of the DAG. Moreover, the coordinator broadcast this milestone across the network. The coordinator guaranteed that milestones were consistently generated at regular intervals.

Random Walk
In Tangle, every new transaction was tasked with referencing two previous transactions. This reference protocol was directed by the tip selection algorithm (TSA). The TSA employed the random walk technique to ascertain consensus on transaction confirmations. Given this algorithm, the most recent transactions, labeled as 'tips', inherently had an augmented likelihood of attaining approval from forthcoming transactions. This implementation aligns with the principles outlined in Section 4 of the Tangle white paper [1]. The random walk serves as the central mechanism for the tip selection process in Tangle. It comprises two primary walks: the unbiased random walk (URW) and the biased random walk (BRW). The choice between URW and BRW relied on the parameter α, with the URW being chosen when α = 0; otherwise, the BRW was initiated. This α value determined the degree of bias toward transactions with a higher AW.
The BRW used the Markov Chain Monte Carlo (MCMC) approach, and the transition probability from one transaction to the subsequent one, progressing toward the tips, was given by
$$P_{xy} = \frac{\exp\left(-\alpha (H_x - H_y)\right)}{\sum_{z:\, z \rightsquigarrow x} \exp\left(-\alpha (H_x - H_z)\right)} \quad (2)$$
where
• $P_{xy}$ represents the transition probability;
• $H_x$ stands for the cumulative weight of the current transaction;
• $H_y$ stands for the cumulative weight of the transaction toward which the random walker intends to move;
• α is a bias parameter that determines the degree to which the random walk is biased toward transactions with a particular cumulative weight;
• the sum in the denominator runs over all transactions z that directly approve x, as in the white paper [1].
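To make the transition rule concrete, the sketch below evaluates the white-paper form of the transition probability for the direct approvers of a transaction. The function name and the dictionary-based interface are assumptions made for illustration.

```python
import math


def transition_probabilities(h_x, approver_weights, alpha):
    """Return {y: P_xy} for the direct approvers y of the current transaction x.

    h_x: cumulative weight H_x of the current transaction.
    approver_weights: dict mapping each direct approver y to its weight H_y.
    alpha: bias parameter; alpha = 0 gives a uniform (unbiased) choice.
    """
    if not approver_weights:
        return {}  # x is a tip, so the walk stops here
    scores = {
        y: math.exp(-alpha * (h_x - h_y)) for y, h_y in approver_weights.items()
    }
    total = sum(scores.values())
    return {y: score / total for y, score in scores.items()}


# With alpha > 0 the heavier approver "a" is favored over "b".
print(transition_probabilities(10, {"a": 6, "b": 3}, alpha=0.5))
```

Setting `alpha=0` makes every score equal to 1, which reduces the rule to the URW.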

Randomwalker Configuration
The 'RandomWalker' class was responsible for performing the random walk. Its configuration parameters included the following.
• 'W': defines the time interval, emphasizing the importance of recent transactions;
• 'N': determines the number of walkers to be deployed for the consensus process;
• α: a bias parameter to influence the walker's decision in favoring transactions with specific cumulative weights;
• 'node': identifies the specific node in the network initiating the random walk.

Interval Determination
The system compared the age of the DAG with a set time interval. If the DAG was younger than the interval, the period was adjusted. Then, transactions that fell within a specific window (from W to 2W s) were collected and used in the random walk.

1. Transaction Selection: A random subset of transactions, sized 'N' (random walkers), from the chosen interval, initiated the random walk.
2. Dispatching Walkers: Each transaction underwent an independent walk toward the tips. This process was based on asynchronous programming and allowed all tasks (random walkers) to happen at the same time.
3. Walk Type Determination: The URW was chosen if 'alpha_low' equaled 0, signifying a uniform transaction transition. Otherwise, the BRW was employed, and the transition probability was calculated based on Equation (2).
4. Tip Selection: Post-walk, if fewer than two unique tips were reached, additional ones were randomly picked. The tips were subsequently sorted, and the first two were selected.
This implementation presented a robust random walk mechanism for Tangle, based on firm theoretical underpinnings.
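The four steps can be sketched as follows, assuming the local Tangle is available as an `approvers` mapping (transaction to its direct approvers) and a `weight` mapping of cumulative weights. All names are hypothetical, and the walkers are plain asyncio tasks rather than the simulator's 'RandomWalker' class.

```python
import asyncio
import math
import random


def next_step(current, approvers, weight, alpha, rng):
    """Choose the next transaction among the direct approvers of `current`."""
    candidates = approvers.get(current, [])
    if not candidates:
        return None  # `current` has no approvers, so it is a tip
    if alpha == 0:
        return rng.choice(candidates)  # unbiased random walk (URW)
    # Biased random walk (BRW): weights proportional to exp(-alpha * (H_x - H_y)).
    scores = [math.exp(-alpha * (weight[current] - weight[y])) for y in candidates]
    return rng.choices(candidates, weights=scores, k=1)[0]


async def walk_to_tip(start, approvers, weight, alpha, rng):
    """Walk from `start` toward the tips, yielding so walkers can interleave."""
    current = start
    while True:
        following = next_step(current, approvers, weight, alpha, rng)
        if following is None:
            return current
        current = following
        await asyncio.sleep(0)


async def select_tips(starts, approvers, weight, alpha, tips, rng):
    """Dispatch one walker per start transaction and return up to two tips."""
    reached = await asyncio.gather(
        *(walk_to_tip(s, approvers, weight, alpha, rng) for s in starts)
    )
    unique = set(reached)
    # Top up with randomly chosen tips if fewer than two distinct tips were reached.
    remaining = sorted(set(tips) - unique)
    while len(unique) < 2 and remaining:
        unique.add(remaining.pop(rng.randrange(len(remaining))))
    return sorted(unique)[:2]


approvers = {"g": ["a", "b"], "a": ["c"], "b": ["d"]}
weight = {"g": 5, "a": 2, "b": 2, "c": 1, "d": 1}
chosen = asyncio.run(
    select_tips(["g", "g", "g"], approvers, weight, 0.5, ["c", "d"], random.Random(0))
)
print(chosen)
```

Sorting before taking the first two tips mirrors the final step of the procedure; a different selection rule would only change the last line of `select_tips`.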

Network Formation and Dynamics
In our designed simulator, we developed a network with a versatile framework that integrated functions such as create_peers, latency management, transaction spread, solidification, and methods to prevent redundancy. These functionalities were wide-ranging and adaptable, suitable for a variety of DAG-based distributed ledgers. Although this adaptability was a significant attribute, the current configuration was specifically optimized for IOTA's Tangle network. Our system mirrored the key features of Tangle, including its decentralized structure, ability to scale, and distinctive transaction validation process.

Node Connectivity
The create_peers method was responsible for crafting the network's structure. In this setting, the nodes were not fully connected. Instead, each node was connected to each of its counterparts with an adjustable probability, balancing efficiency and resilience. However, to guarantee the integrity of the network and provide reference milestones, the coordinator was connected to all nodes (adjustable).

Latency Modeling
Real-world network communication involves unpredictable and varied delays. This behavior was replicated in our framework through the generate_delay_matrix method. It assigned probabilistic delay values to potential interactions between nodes, based on a predefined range.
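One way to picture such a delay matrix is the sketch below, which draws an independent delay for every ordered pair of distinct nodes. The signature, the uniform distribution, and the dictionary keyed by (source, destination) are assumptions made for illustration; the simulator's generate_delay_matrix may differ in all three respects.

```python
import random


def generate_delay_matrix(node_ids, min_delay, max_delay, seed=None):
    """Return {(src, dst): delay} with delays drawn uniformly from the given range.

    Delays are directional, so (a, b) and (b, a) may differ. The unit is whatever
    the caller uses for min_delay and max_delay (for example, milliseconds).
    """
    rng = random.Random(seed)
    matrix = {}
    for src in node_ids:
        for dst in node_ids:
            if src != dst:  # a node does not send to itself
                matrix[(src, dst)] = rng.uniform(min_delay, max_delay)
    return matrix


delays = generate_delay_matrix(["n0", "n1", "n2"], min_delay=0.2, max_delay=90.0, seed=7)
print(delays[("n0", "n1")])
```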

Gossip Transaction Mechanism
Transactions were propagated through a gossip mechanism, as illustrated by the gossip_transaction method. A node initiated a transaction, sending it to a subset of its peers. This broadcast strategy, dictated by the subset_factor, ensured strategic and controlled propagation.

Latency Consideration
Transaction propagation delays were controlled by the latency values derived from the delay matrix. Each transaction endured a unique delay, based on its source and destination nodes defined in the delay_matrix.

Recursion and Network Penetration
The gossip transaction mechanism employed a recursive strategy. When a node received a transaction, it became a sender, propagating the transaction to its peers. This repeated forwarding carried the transaction throughout the network.

Avoidance of Redundancy
To prevent redundant transactions for a single node, we implemented conditional checks that were activated when the node received a transaction. This ensured that nodes did not re-process transactions that they had already encountered.
In summary, this approach, integrating node connectivity, latency modeling, and a gossip-based transaction mechanism, lays a foundation for future research in DAG-based distributed ledgers.
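The gossip and redundancy checks described above can be sketched as follows. For brevity the sketch uses a queue instead of recursion, which visits nodes in the same receive-then-forward fashion, and it ignores latency. The function signature, the `seen` bookkeeping, and the rounding rule for the subset size are assumptions rather than the simulator's gossip_transaction implementation.

```python
import random


def gossip_transaction(txid, origin, peers, subset_factor, seen, rng):
    """Spread `txid` from `origin` and return the nodes in order of first receipt.

    peers: dict mapping each node to a list of its peers.
    subset_factor: fraction of peers each sender forwards to (0 < factor <= 1).
    seen: dict mapping each node to the set of txids it has already processed.
    """
    order = []
    queue = [origin]
    seen.setdefault(origin, set()).add(txid)
    while queue:
        sender = queue.pop(0)
        neighbours = peers.get(sender, [])
        if not neighbours:
            continue
        # Forward to a random subset of peers, at least one and at most all.
        k = min(len(neighbours), max(1, round(subset_factor * len(neighbours))))
        for receiver in rng.sample(neighbours, k):
            known = seen.setdefault(receiver, set())
            if txid in known:
                continue  # redundancy check: already processed by this node
            known.add(txid)
            order.append(receiver)
            queue.append(receiver)  # the receiver becomes a sender in turn
    return order


peers = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
seen = {}
print(gossip_transaction("tx1", "a", peers, 1.0, seen, random.Random(0)))
```

With a `subset_factor` below 1, some nodes may not be reached in a single pass, which is the trade-off the factor controls.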

Node Functionality
In DAG-based distributed ledgers, nodes are crucial in creating, validating, and broadcasting transactions. These nodes support the network's expansion, maintaining the network's continuity and integrity. Their functions are defined by the details of their algorithmic processes. The system described in this study, although based on the Tangle protocol, also shares similarities with other DAG-based distributed ledgers. In this study, we developed a node class encompassing the essential operations of a node, including create_and_sign_transaction, receive_transaction, and broadcast_transaction. These functions leverage the network class for transaction propagation mechanisms such as create_peers and gossip_transaction. The methods outlined are not only applicable to the Tangle system under study but also extendable to other DAG-based systems.

Transaction Creation and Signing
In our implementation, a node initiated a transaction by generating unique data. This process involved creating a random data string of N bytes, which was then decoded using the 'latin1' encoding. The transaction, represented by a 'Transaction' object, was identified by its transaction ID ('txid') and included references to two parent transactions, forming the DAG structure. Furthermore, security was enforced through a cryptographic proof of work (PoW), and the node signed the transaction with its private key using RSA and SHA-256 hashing, ensuring data integrity. The transaction was then added to the node's list of transactions and unconfirmed transactions and was also considered a new 'tip' in the network. Moreover, the node performed a check to remove any referenced parent transactions from its list of tips. This action ensured that only new tips were retained in the network.
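The payload generation and tip bookkeeping described above can be illustrated with a short sketch. The helper names and the use of plain sets and dictionaries are assumptions; proof of work and RSA signing are left out here.

```python
import secrets


def random_payload(n_bytes):
    """Create N random bytes and decode them as text.

    'latin1' maps every byte value 0-255 to exactly one character, so the
    decoding step cannot fail for arbitrary random bytes.
    """
    return secrets.token_bytes(n_bytes).decode("latin1")


def register_transaction(txid, parents, transactions, unconfirmed, tips):
    """Record a newly created transaction and update the node's tip set."""
    transactions[txid] = tuple(parents)
    unconfirmed.add(txid)
    tips.add(txid)                   # the new transaction is itself a tip
    tips.difference_update(parents)  # referenced parents are no longer tips
    return tips


transactions, unconfirmed, tips = {"a": (), "b": ()}, set(), {"a", "b"}
register_transaction("c", ("a", "b"), transactions, unconfirmed, tips)
print(tips)
```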

Transaction Broadcasting and Reception
Once a transaction was created and signed, the node broadcast it to its peers in the network. This propagation ensured the decentralization and redundancy of transactional data across the network. While broadcasting, a delay might have been imposed to simulate network latency, as established by the network's delay_matrix. When a node received a new transaction, it performed a series of checks to ascertain its validity. These checks included examining whether the received transaction or its parent transactions had been previously seen or were part of the node's transaction list. Furthermore, if the parent transactions were missing, the node requested them, ensuring a complete view of the transaction's ancestry in the network. It is worth noting that these processes represented more of an operational or implementation detail rather than a core theoretical concept, as presented in the Tangle white paper. This approach of retrieving missing pieces is also seen in other peer-to-peer systems, showcasing its utility and pragmatic nature [27].

Milestone Receipt and Validation
Nodes in the implemented Tangle occasionally received milestones (critical markers signifying the consensus state of the network). These milestones could be genesis_milestones or issued after a particular interval. When a node received a milestone, it underwent validation. The node checked the milestone's signature using the coordinator's public key to confirm its legitimacy. If the milestone passed this cryptographic check, the node integrated it into its local view of Tangle.

Transaction Confirmation
Tangle's consensus mechanism revolved around the confirmation status of a transaction. Transactions encapsulated within milestones and passing the network's consensus rules were deemed 'confirmed', ensuring their permanent inclusion within Tangle and confirming their validity.

Double Spending Detection
To ensure the security of the network, nodes implemented specific mechanisms to identify and prevent double spending attempts. The is_double_spent function played a crucial role in this process. It operated by examining each incoming transaction against the node's existing list of transactions. The function checked if the transaction attempted to use funds from an address that had already been spent or if the transaction ID matched any existing transaction in the node's transaction list. Furthermore, the function extended its verification to the order of the transaction. It inspected not only the parents of the transaction but also their preceding transactions. By iterating through this chain of parent transactions, the function determined if the transaction's ID appeared at any point in this order. If a match was found, indicating the reuse of the same transaction ID, the function identified a double spending attempt. This check covered both direct and ancestral transaction comparisons and safeguarded the integrity of the ledger state, ensuring that each transaction was unique and funds were not illicitly reused.
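A minimal sketch of these checks is given below, assuming transactions are dictionaries with 'txid', 'parents', and 'address' keys and that spent addresses are tracked in a set. The data layout is hypothetical; only the three checks (known ID, spent address, ID reappearing among ancestors) follow the description above.

```python
def is_double_spent(tx, transactions, spent_addresses):
    """Return True if `tx` looks like a double spending attempt.

    tx: dict with keys 'txid', 'parents', and optionally 'address' (assumed layout).
    transactions: dict mapping known txids to transaction dicts of the same shape.
    spent_addresses: set of addresses whose funds have already been spent.
    """
    # Direct checks: the ID is already known, or the address was already spent.
    if tx["txid"] in transactions:
        return True
    if tx.get("address") is not None and tx["address"] in spent_addresses:
        return True

    # Ancestral check: walk parents, then their parents, looking for the same ID.
    stack = list(tx["parents"])
    visited = set()
    while stack:
        parent_id = stack.pop()
        if parent_id in visited:
            continue
        visited.add(parent_id)
        if parent_id == tx["txid"]:
            return True
        parent = transactions.get(parent_id)
        if parent is not None:
            stack.extend(parent["parents"])
    return False


known = {
    "g": {"txid": "g", "parents": [], "address": None},
    "a": {"txid": "a", "parents": ["g", "g"], "address": "addr1"},
}
fresh = {"txid": "b", "parents": ["a", "g"], "address": "addr2"}
print(is_double_spent(fresh, known, spent_addresses={"addr1"}))
```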

Pending Transaction Management
To handle missing transactions and the dynamic nature of the network, nodes maintained a queue of pending transactions. The process_pending_transactions method allowed nodes to continuously assess and process these transactions. Transactions were routinely checked, and those successfully processed were removed from the queue, ensuring that the system remained responsive and efficient. In addition, nodes were equipped with a method, request_parent_transaction, to retrieve missing parent transactions, essential in maintaining Tangle's continuity. If a node encountered a transaction with a missing parent, it requested the missing transaction from its sender or peers. The corresponding method, provide_missing_transaction, enabled a node to locate a requested transaction in its list of transactions and forward it to the requester, ensuring the network's integrity and flow of information.
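The pending queue can be sketched as a loop that keeps attaching transactions whose parents are known and then requests whatever is still missing. The function below reuses the name process_pending_transactions for readability, but its signature, the `known` set, and the `request_parent` callback are assumptions made for this illustration.

```python
from collections import deque


def process_pending_transactions(pending, known, request_parent):
    """Attach pending transactions whose parents are known; request the rest.

    pending: deque of (txid, parents) tuples waiting for missing parents.
    known: set of txids already in the node's local Tangle (updated in place).
    request_parent: callback invoked with each parent txid that is still missing.
    """
    progress = True
    while progress:
        progress = False
        for _ in range(len(pending)):
            txid, parents = pending.popleft()
            if all(p in known for p in parents):
                known.add(txid)  # processed, so it leaves the queue
                progress = True
            else:
                pending.append((txid, parents))  # keep waiting
    # Ask the sender or peers for parents that are still unknown.
    for _, parents in pending:
        for parent in parents:
            if parent not in known:
                request_parent(parent)


known = {"g"}
pending = deque([("c", ("b", "a")), ("b", ("a", "g")), ("a", ("g", "g")), ("e", ("x", "g"))])
requested = []
process_pending_transactions(pending, known, requested.append)
print(sorted(known), list(pending), requested)
```

Repeating the pass until no transaction makes progress lets a chain of dependent transactions resolve in one call, whatever order they arrived in.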

Tangle Evaluation Results and Analysis
This section presents the outcomes of the simulations conducted on the specified components, emphasizing the significance of Poisson processes in the transaction dynamics, the analysis of random walks, node convergence, and the development of accumulative weights over time. These findings demonstrate the effectiveness of the simulator in analyzing the Tangle protocol, which was the primary protocol evaluated using our proposed simulator.

Network Analysis of Poisson Processes in Transaction Dynamics
In a decentralized network based on a DAG, nodes independently create and process transactions, primarily influenced by Poisson processes [14]. It is important to note that this analysis underscores the significance of discrete event simulators in such environments. While the studies referenced in Section 2 contribute to a collective understanding, they typically presume the network's adherence to a Poisson distribution without empirical analysis. Our proposed simulator, however, allows for a quantitative examination of this distribution, moving beyond assumptions to an empirical check.
For a given node i, the transaction generation rate is denoted as $\lambda_i$, so node i generates transactions at a rate of $\lambda_i$ transactions per second. The number of transactions $X_i$ generated by node i in time t conforms to $X_i \sim \mathrm{Poisson}(\lambda_i t)$. The aggregate number of transactions generated by all N nodes in the network during time t is $X = \sum_{i=1}^{N} X_i$. Because a sum of independent Poisson variables is itself Poisson, X is Poisson-distributed with the rate parameter $\lambda = \sum_{i=1}^{N} \lambda_i$. The reception dynamics encompass network delays and peer-to-peer structures, yielding the effective reception rate at node j as
$$\lambda_j^{\mathrm{effective}} = \sum_{i=1}^{N} \lambda_i \, P(i \to j) \quad (3)$$
where $P(i \to j)$ signifies the likelihood of a transaction from node i reaching node j. With $Y_j$ denoting the transaction count for node j, it is described as $Y_j \sim \mathrm{Poisson}(\lambda_j^{\mathrm{effective}} t)$.
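As a small worked example of these relations, the sketch below aggregates per-node rates into an effective reception rate and evaluates the Poisson probability of observing k transactions. The reach probabilities are invented for illustration, and the function names are not part of the simulator.

```python
import math


def effective_reception_rate(rates, reach_prob, j):
    """Sum lambda_i * P(i -> j) over all source nodes i.

    rates: dict mapping node i to its generation rate lambda_i.
    reach_prob: dict mapping (i, j) to the probability that a transaction
        from i reaches j; missing pairs are treated as probability 0.
    """
    return sum(rate * reach_prob.get((i, j), 0.0) for i, rate in rates.items())


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


rates = {"a": 0.01, "b": 0.02, "c": 0.05}              # transactions per second
reach_prob = {("a", "c"): 1.0, ("b", "c"): 0.5, ("c", "c"): 1.0}
rate_c = effective_reception_rate(rates, reach_prob, "c")
mean_count = rate_c * 100.0                            # expected count over t = 100 s
print(rate_c, poisson_pmf(7, mean_count))
```

Comparing `poisson_pmf` against a histogram of the simulated counts is one way to check the Poisson assumption empirically.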
Network delay complexities might warrant a representation that additionally accounts for D(j), the mean delay for node j. The simulation approach relies on progressive state updates across nodes and involves iterating over each node to refresh the transaction state, symbolized by
$$S(t + 1) = f(S(t), \lambda, N).$$
After the simulation, the mean transaction count per node can be computed as $\mu = \frac{1}{N} \sum_{j=1}^{N} Y_j$ and the observed count compared against a standard Poisson distribution using
$$P(k) = \frac{\mu^{k} e^{-\mu}}{k!}$$
for k = 0, 1, 2, ... up to the maximum observed transaction count. This provides a comprehensive picture of the simulation approach. It not only models the dynamic behavior of the network but also statistically evaluates the observed results against the expected Poisson distribution and underscores its utility in understanding decentralized transaction dynamics.
The parameter settings utilized in this analysis are summarized in Table 2. The interplay between these parameters and their influence on the network dynamics offer a richer insight into the decentralized transaction processes. Specifically, the range of delay values was chosen based on empirical evidence from IoT systems. According to [28], the delays in such systems predominantly range from 0.20 ms to <50 ms under typical circumstances. This analysis also accounted for extreme scenarios, pushing the upper limit to 90 ms to ensure a thorough evaluation of the system under potential edge conditions. Such granularity in the parameter choices ensured that the simulation captured a wide spectrum of real-world scenarios, enhancing the robustness and validity of our findings. In prior studies [29], λ has been traditionally considered as an upper bound, with values spanning between 100 and 10,000 transactions per unit of time, thereby deriving the transactions per node based on this aggregate upper limit. This methodology predominantly emphasizes a more holistic and macroscopic analysis. In contrast, our implementation pivots toward a granular and discrete-level examination of individual entities within the network. Consequently, we tailored our analytical framework to scrutinize λ on an individual node level, with the transactions per second being categorized into two distinct ranges: 0.001-0.01 and 0.01-0.1. This approach offers a more microscopic perspective, shedding light on the dynamics at play on a per-node basis.
The second aspect of our analysis using the proposed simulator focused on the random walk strategy within the Tangle protocol. It is important to clarify that the Tangle version examined is not the latest iteration released by IOTA [4]. The primary objective of this study is to establish the foundational elements of the simulator, which are designed to be adaptable to any DAG-based system. While the simulator can be extended to newer protocol versions, our choice to analyze this particular version is strategic. The aim is not merely to scrutinize the current Tangle iteration but to explore the fundamental factors that led IOTA to transition from this version to newer ones. We focus on the 'what' and 'why' of these changes, rather than the 'how', as the latter extends beyond the scope of this study. With these considerations in mind, we present an in-depth analysis of the random walk strategy employed in the Tangle protocol.
Based on the graphical representations in Figures 3 and 4, the behavior of the random walk mechanism over different configurations and intervals can be observed. The graphs illustrate the 'duration for one node' against the 'transaction count' under various settings of N, α, and W.
Across both graphs, as the number of transactions increases, the duration for one node also rises linearly, suggesting a directly proportional relationship between the two. Furthermore, as N, the number of walkers, increases (from 2 to 6), the duration required for a single node also rises. This might be indicative of the overhead introduced by managing more walkers, even though more walkers would ideally mean a faster consensus. The variations in α, the bias parameter, showcase different trends in duration. A higher α seems to lead to a shorter duration, especially for larger transaction counts. This implies that a higher degree of bias expedites the random walk process by driving it toward transactions with higher cumulative weights more quickly. Comparing the graphs, the duration for one node is generally lower for intervals of W = 120-240 s than for W = 60-120 s under similar conditions, suggesting that longer intervals lead to a quicker random walk process. The influence of the number of walkers becomes more prominent as the bias parameter α decreases. In the graph with W = 60-120 s, the duration difference between N = 2 and N = 6 is more discernible for α = 0.0 compared to α = 0.01. With the longer interval of W = 120-240 s, the increase in duration for one node is more gradual across the transaction count, indicating that the system may have better efficiency or a smaller overhead over this extended timeframe. As outlined in Section 3, the asynchronous nature of dispatching walkers and determining the walk type likely contributes to the linear increase in duration. As more transactions are initiated, more walkers are dispatched simultaneously, leading to a consistent rise in the time taken to select tips.
The random walk mechanism, integral to Tangle's operation, exhibits predictable behavior across varied configurations. While the linear relationship between the transaction count and duration underscores its scalability, the influences of N, α, and W highlight the intricacies of its operation. By understanding these nuances, optimizations can be made to further enhance the efficiency and responsiveness of the system in real-world applications. The foundational challenges highlighted in this study, through the use of the proposed simulator, offer a clear rationale for and understanding of the shift to the newer version of the Tangle protocol.
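The roles of N and α can be made concrete with a minimal sketch of a biased random walk for tip selection. It is an illustration under stated assumptions rather than the simulator's actual code: the DAG is a plain dictionary mapping each transaction to its approvers, the names `approvers`, `cum_weight`, `biased_walk`, and `select_tips` are hypothetical, and the transition probability proportional to exp(-α(H_x - H_y)) follows the form given in the IOTA white paper.

```python
import math
import random


def biased_walk(start, approvers, cum_weight, alpha, rng):
    """Walk from `start` toward the tips and return the tip that is reached."""
    current = start
    while approvers.get(current):  # a tip has no approvers
        candidates = approvers[current]
        h_x = cum_weight[current]
        # Weight each approver y by exp(-alpha * (H_x - H_y)).
        weights = [math.exp(-alpha * (h_x - cum_weight[y])) for y in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
    return current


def select_tips(start, approvers, cum_weight, alpha, n_walkers, rng):
    """Dispatch `n_walkers` independent walkers and collect the tips they reach."""
    return [biased_walk(start, approvers, cum_weight, alpha, rng)
            for _ in range(n_walkers)]


# Toy DAG (hypothetical data): genesis "g" is approved by "a" and "b";
# "a" is approved by the tips "c" and "d"; "b" is itself still a tip.
approvers = {"g": ["a", "b"], "a": ["c", "d"], "b": [], "c": [], "d": []}
cum_weight = {"g": 5, "a": 3, "b": 1, "c": 1, "d": 1}

rng = random.Random(0)
tips = select_tips("g", approvers, cum_weight, alpha=0.01, n_walkers=2, rng=rng)
```

With α = 0 every approver is equally likely at each step, whereas a larger α concentrates the walkers on the branch with the higher cumulative weight. Each additional walker repeats the full walk, which is consistent with the duration rising with N.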

Quantitative Assessment of Node Convergence
In heterogeneous DAG networks, comprehending the dynamics of node convergence and synchronization and attaining a consistent state across the network are crucial in maintaining optimal network performance. This section clarifies the methodologies utilized in our proposed simulator to assess this synchronization. The focus is particularly on two key aspects: 'tips', which are the latest transactions that are yet to be approved, and 'all_transactions', representing the sum of all transactions received and generated by a node.
A critical initial step in this process was to clearly define the parameters for convergence. Determining whether nodes have reached convergence involves evaluating whether they have attained synchronization and a consistent state in terms of their transactional data. This was essential in ensuring that all nodes in the network possessed a coherent view of the ledger.
To quantitatively measure this synchronization and convergence, we introduced the pairwise overlap and convergence metric. This metric was designed to quantify the extent of overlap in the 'tips' and the convergence of 'all_transactions' across every pair of nodes within the network. The degree of overlap and convergence was expressed through specific equations, where each cell in the matrix, denoted as (i, j), represents the degree of overlap for 'tips' and convergence for 'all_transactions' between node i and node j. The formulas are given by

$$\text{TipsOverlap}(i, j) = |\text{Tips of Node } i \cap \text{Tips of Node } j|$$

$$\text{NodeConvergence}(i, j) = \frac{\text{Overlap}(i, j)}{\text{TotalTransactions}} \tag{8}$$

To evaluate the average overlap and convergence, we calculated the mean overlap and convergence that each node had with the rest of the network. This provides insights into the mean 'tips' overlap and 'all_transactions' convergence for individual nodes with their peers:

$$\text{AvgNodeConvergence}(i) = \frac{1}{n - 1} \sum_{j \neq i} \text{NodeConvergence}(i, j)$$

where n is the number of nodes in the network. For a normalized measure of overlap and convergence, we used the Jaccard Similarity, which evaluated the shared elements against the total unique elements between node pairs. This provided a more standardized approach to understanding overlap and convergence:

$$\text{JaccardNodeConvergence}(i, j) = \frac{\text{Overlap}(i, j)}{|\text{all\_transactions of Node } i \cup \text{all\_transactions of Node } j|}$$

where the denominator counts the total unique elements held by the two nodes.
Utilizing these methodological approaches in the proposed simulator provides a comprehensive understanding of node overlaps and convergence, essential in evaluating inter-node relationships in a DAG network. The precise application of these formulas is critical for accurate and standardized analysis, forming a fundamental aspect of any study in decentralized networks.
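The pairwise overlap, convergence, and Jaccard measures described above can be sketched in a few lines of Python. This is a minimal illustration rather than the simulator's implementation: the function names and the sample data are hypothetical, the total used to normalize the convergence is assumed to be the number of distinct transactions known anywhere in the network, and two empty sets are assumed to have a Jaccard similarity of 1.

```python
from itertools import combinations


def pairwise_metrics(node_sets):
    """Return overlap, convergence, and Jaccard similarity for every node pair."""
    # Assumption: the normalizing total is the number of distinct
    # transactions known to at least one node.
    total = len(set().union(*node_sets.values()))
    metrics = {}
    for i, j in combinations(sorted(node_sets), 2):
        a, b = node_sets[i], node_sets[j]
        overlap = len(a & b)
        union = len(a | b)
        metrics[(i, j)] = {
            "overlap": overlap,
            "convergence": overlap / total if total else 0.0,
            "jaccard": overlap / union if union else 1.0,
        }
    return metrics


def average_jaccard(node_sets):
    """Mean Jaccard similarity of each node with the rest of the network."""
    metrics = pairwise_metrics(node_sets)
    averages = {}
    for node in node_sets:
        scores = [m["jaccard"] for pair, m in metrics.items() if node in pair]
        averages[node] = sum(scores) / len(scores) if scores else 0.0
    return averages


# Hypothetical 'all_transactions' sets for three nodes.
all_transactions = {
    "n1": {"t1", "t2", "t3"},
    "n2": {"t2", "t3", "t4"},
    "n3": {"t3", "t5"},
}
metrics = pairwise_metrics(all_transactions)
averages = average_jaccard(all_transactions)
```

In this toy network, nodes n1 and n2 share two of four unique transactions, giving a Jaccard similarity of 0.5, while n3 shares only one transaction with each peer. The same functions apply unchanged to the 'tips' sets.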
Table 3 presents an in-depth analysis of the convergence tendencies within a DAG, focusing on nodes' behavior. The heterogeneous nature of DAGs poses a significant challenge in achieving convergence, as each node might have a distinct view of the network. Key parameters for this study included a gossip factor of 0.7, a peer probability of 0.4, λ values varying from 0.01 to 0.1, and a total simulation duration of 3600 s (1 h). The results indicated that the 'average overlap' and 'Jaccard similarity' for all transactions were generally around 23% and 21%, respectively. While this uniformity seems promising, it raises concerns in the context of heterogeneous DAGs, questioning the network's ability to reach a fully converged state. This finding highlights the inherent challenges in achieving a synchronized perspective on the DAG among nodes, exacerbated by the natural tendency of DAGs to support diverse node viewpoints.
In addressing these convergence challenges, IOTA's implementation of the coordinator milestone is noteworthy. This mechanism serves as a reference point for nodes, aiding in harmonizing their views of the network, as discussed in [30]. Moreover, the new version of IOTA, which does not rely on milestones and coordinators [4], uses voting and a validator committee. However, this structure leads to thought-provoking questions: if convergence and a consistent network state are achieved through overarching mechanisms like the coordinator or validator committees, why is there a need for each node to independently engage in tip selection and perform PoW? Additionally, why is it essential for each node to maintain its own version of the DAG, especially in a network where a synchronized state is the objective?

Accumulative Weight Growth
The final metric that we tested using our proposed simulator was the accumulative weight, a critical aspect of the Tangle protocol. As Tangle evolves, efficient weight calculations become increasingly important. An evident observation from Figure 5 is the exponentially increasing duration of weight updates with the increasing size of Tangle. This progression indicates that as more transactions are added, the computational time to update the weights correspondingly increases. The colors in the figure map to parameters W, α, and N from Figure 3, highlighting that the weight calculation times are influenced by these configurations. Each node updates the weights locally, tied directly to the incoming tips, or λ. This underscores the direct proportionality of the weight computation time to λ. The increasing duration signifies the computational challenges. Therefore, future iterations and optimizations of Tangle might require strategies to streamline this process and ensure rapid weight updates despite an escalating number of transactions.
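To illustrate why the update time grows with the size of Tangle, the following sketch propagates the weight of a newly attached transaction to all of its ancestors. It is a simplified illustration under stated assumptions, not the simulator's code: every transaction has an own weight of 1, the DAG is a dictionary from each transaction to the transactions it references, and the names `add_transaction`, `dag`, and `cum_weight` are hypothetical.

```python
from collections import deque


def add_transaction(tx_id, parents, dag, cum_weight):
    """Attach `tx_id`, add its own weight (1) to every ancestor, and
    return the number of weight updates that were performed."""
    dag[tx_id] = list(parents)
    cum_weight[tx_id] = 1
    visited = set()
    queue = deque(parents)
    while queue:
        current = queue.popleft()
        if current in visited:
            continue  # reachable through both references; count it once
        visited.add(current)
        cum_weight[current] += 1
        queue.extend(dag[current])  # keep walking back toward genesis
    return len(visited)


# Toy Tangle (hypothetical data): genesis "g", then "a", "b", and "c".
dag = {"g": []}
cum_weight = {"g": 1}
arrivals = [("a", ["g", "g"]), ("b", ["g", "a"]), ("c", ["a", "b"])]
updates = [add_transaction(tx, parents, dag, cum_weight) for tx, parents in arrivals]
```

In this toy run, each arrival touches one more ancestor than the previous one, so the work per arrival grows with the number of transactions already attached. No depth limit is applied here, which corresponds to the worst case discussed in the following section.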

Discussion and Future Work
This section analyzes the complexities of updating weights and consensus in IOTA's Tangle, focusing on the implications of the queue depth, memory needs, and network latency. We discuss the challenges of decentralizing consensus, including the system overhead, node heterogeneity, and validator diversity. Additionally, we highlight the study limitations and suggest future research directions, such as investigating security threats, exploring alternative tip selection algorithms, and considering a multi-validator approach.

Weight Update without Depth Limitation
Based on Sections 3.2.3 and 4.4, the accumulative weight plays an important role in the confirmation of a transaction. As outlined in the IOTA white paper, the accumulative weight is designed to incrementally increase as new transactions are added to the DAG. To further investigate the impact of accumulative weight increments, we consider T as the total number of transactions in Tangle up to a given time and t to indicate that a new transaction is added. Each transaction confirms two previous transactions. If we assume the worst-case scenario without depth limitation, the weight of each transaction would need updating until genesis. Thus, for a transaction t, the total number of weight updates, U_t, is the sum of the contributions λ_i, where λ_i represents the weight contribution of the ith transaction to t.
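As a rough worked bound, and only under the simplifying assumption that a newly attached transaction reaches every earlier transaction through its two references, the t-th transaction triggers t - 1 weight updates. Summed over a Tangle of T transactions, the total update work is then

$$\sum_{t=1}^{T} (t - 1) = \frac{T(T - 1)}{2}$$

so, in this worst case, the cost per arrival grows linearly with the size of Tangle and the cumulative cost grows quadratically. The dense-reachability assumption is introduced here purely for illustration; it gives an upper bound rather than a measured result.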

Time Complexity with Queue Depth
Transactions are not processed immediately but are placed in a queue. Let us denote the queue depth (the number of transactions waiting to be processed) as Q.
The time to update the weights then depends on δ, the average queue processing time per transaction, and on k, the average time taken to update a single transaction's weight. Since Q can vary dynamically based on the network load and other factors, the time complexity becomes a function of both T and Q. This implies higher variability in transaction confirmation times as Tangle grows.

Memory Requirements with Shard Overhead
Considering the presumption that each node computes the accumulative weight locally, the node must maintain a complete copy of the DAG up to the genesis block, thereby raising potential memory concerns. Nonetheless, several techniques, including sharding [31,32], have been proposed to address these challenges. With the prospect of sharding in distributed ledgers, let us consider the overhead that it introduces. Let s be the base size (in memory) of a single transaction and ω be the overhead factor for each shard.
If we assume S shards, the memory requirement depends on s, ω, and S. This indicates that, while sharding may distribute the load, it introduces an overhead that could affect the memory requirements.

Latency Incorporating Network Effects
Network latency plays a pivotal role in the time taken to calculate and update the accumulative weights of transactions. Let us define ϵ as the average network latency and ρ as the proportion of nodes that need to update their local Tangle copy for consensus, specifically to incorporate the changes in accumulative weight.
Given the importance of accumulative weight calculations, the latency L due to network effects for a transaction t can be expressed in terms of ϵ and ρ together with two further constants: α, a proportionality constant denoting the time taken for the internal computation associated with accumulative weight calculations, and β, the time required for I/O operations, specifically for the reading and updating of the accumulative weights in the system.
The accumulative weight mechanism, while fundamental in Tangle's design, introduces complexities in time, memory, and network latency. Tangle's scalability, while promising, encounters challenges both in time complexity and memory requirements as the system grows. Factors such as queue depths and sharding introduce variability that needs to be managed to maintain consistent performance. Network effects, especially in globally distributed systems, add another layer of complexity to the overall performance of the system. Proper protocols and optimizations are essential to ensure that Tangle remains viable for large-scale applications.

Challenges and Implications of Consensus in DAG-Based Systems
As Section 3.2.5 discusses, the role of the coordinator/validators in Tangle is pivotal. It serves as an interim solution to safeguard against double spending and potential vulnerabilities and ensure a single confirmed state of the ledger. Despite the coordinator's utility in ensuring the network's security, the overarching vision for IOTA is a coordinator-free environment, such as in 'The Coordicide' [33] and 'Tangle 2.0 Leaderless' [4]. This aspiration is not merely aimed at embracing decentralization in its purest form, but also at optimizing the system's efficiency, eliminating potential bottlenecks, and ensuring a homogenized validation process. While several approaches integrating voting and permitted validator committees exist and warrant deeper investigation, this study's focus remains on the coordinator. The appeal of DAG-based systems lies in their scalability and potential for genuine decentralization. However, the introduction of coordinators or permitted validator committees introduces significant complexities, posing challenges to their seamless implementation.

Potential System Overhead
If the coordinator/validator is seen as the final authority, the cumulative security efforts at each node, as previously described (Section 5.1), could be perceived as redundant. The original intent behind local weight calculations and validations was to democratize security, making every participant equally responsible for the network's security. If the coordinator/validator's decisions ultimately supersede all others, then the intrinsic value of these decentralized efforts might be negated, leading to system inefficiencies.

Heterogeneity in the DAG
By design, Tangle does not impose strict uniformity across its nodes. As the node convergence analysis in Section 4.3 suggests, each node in the IOTA network possesses a distinct local view of Tangle [34]. While this design promotes decentralization and scalability, it presents challenges for a centralized entity like the coordinator.

Multiplicity of Validator and Consistency Challenges
The idea of employing multiple validators could potentially exacerbate the issue of network consistency. A solitary coordinator, while presenting a single point of failure or control, ensures that milestones are consistently recognized across the network. Introducing more validators, given the intrinsic heterogeneity of Tangle, means that each validator would have a disparate view of Tangle due to the asynchronous nature of transactions and validations. As a consequence, different validators might generate varying milestones or votes, possibly referencing diverse sets of transactions. This can lead to inconsistencies regarding which transactions are deemed 'confirmed', creating potential disparities in the perceived state of the ledger across nodes.
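The consistency problem can be illustrated with a small sketch in which two validators hold different local views and each issues a milestone. It assumes, for illustration only, that a milestone confirms every transaction it references directly or indirectly; the function `past_cone`, the two views, and the transaction names are hypothetical.

```python
def past_cone(tx_id, parents_of):
    """Return `tx_id` together with every transaction that it references,
    directly or indirectly, in the given local view."""
    cone, stack = set(), [tx_id]
    while stack:
        current = stack.pop()
        if current not in cone:
            cone.add(current)
            stack.extend(parents_of.get(current, ()))
    return cone


# Two validators with different local views of the same Tangle.
view_a = {"g": [], "a": ["g"], "b": ["g"], "c": ["a", "b"]}
view_b = {"g": [], "a": ["g"], "b": ["g"], "d": ["b"]}  # has not yet seen "c"

confirmed_a = past_cone("c", view_a)  # validator A places its milestone on "c"
confirmed_b = past_cone("d", view_b)  # validator B places its milestone on "d"

agreed = confirmed_a & confirmed_b    # confirmed by both validators
disputed = confirmed_a ^ confirmed_b  # confirmed by exactly one validator
```

Here both validators confirm the genesis and "b", but they disagree on "a", "c", and "d", so a node would see a different confirmed state depending on which milestone it follows.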

Future Work and Limitations
While the current study robustly covers the foundational protocols of Tangle, there remains substantial scope to extend this work. One primary avenue is the incorporation of malicious node activities, enabling a deeper exploration of potential vulnerabilities, the analysis of newer versions of Tangle, and the system's resilience against various attacks. Additionally, given our intention to establish a benchmark for discrete event simulation, it would be pertinent to integrate alternative tip selection algorithms. This would allow for a comprehensive understanding of DAG-based distributed ledger dynamics, transcending the confines of IOTA and providing insights into a broader spectrum of distributed ledgers. Furthermore, in our current implementation, we primarily employ a single coordinator for Tangle's operation. Looking ahead, we anticipate transitioning to a model with multiple validators. This transition will inherently introduce added complexities. Moreover, the potential for network latency becomes more pronounced with multiple active validators, which could impact Tangle's optimal performance. Future studies will also focus on conducting a comprehensive quantitative analysis of the cumulative weight growth and network effects, employing sophisticated mathematical and computational models to substantiate the observed phenomena. Navigating these complexities will be essential, but, by addressing them, we aim to further our understanding and optimization of DAG-based distributed ledger systems.

Conclusions
Through our detailed evaluation of the IOTA Tangle protocol, guided by its white paper, we delved into the complexities of DAG-based distributed ledgers. Utilizing a purpose-built discrete-event-based simulator, our analysis revealed that subtle intricacies within DAG architectures can be uncovered.
This comprehensive analysis encompasses the observation of the scalability of the random walk mechanism, its sensitivity to the walker count N, the impact of bias α on responsiveness, and the efficiency gains achieved with shorter time intervals W. These findings illuminate the foundational challenges and optimization opportunities within the protocol, providing a solid rationale for its evolution. Additionally, we have studied the autonomous node operations, guided by Poisson processes. This analysis underscores the crucial role of discrete event simulators in understanding the dynamics of such decentralized environments. Unlike previous studies that often assume a network's adherence to a Poisson distribution without empirical validation, our proposed simulator enables a quantitative examination of this distribution. We have demonstrated how individual node transaction generation rates λ_i contribute to the aggregate transaction rate λ for the entire network, offering a more granular perspective. The comparisons suggest that while the results may not perfectly align with a pure Poisson distribution, they closely resemble a thinned version of it.
These mechanisms, while innovative in transaction confirmations, present challenges, especially as Tangle grows, introducing higher time complexity and memory demands. Potential solutions such as sharding, though promising, introduce their own overheads, which might impact system latency. As Tangle evolves, addressing these issues becomes crucial in ensuring seamless and consistent performance.
Moreover, our evaluation of the coordinator/validator highlights its indispensable role while underscoring potential challenges inherent to its function. Its protective role is undeniable, but its centralized nature poses a dichotomy in a decentralized system.
Looking ahead, we see numerous avenues for deeper exploration. Understanding malicious node behaviors, assessing system vulnerabilities, and weighing the benefits and disadvantages of different tip selection algorithms will be crucial.
In summary, our research underscores that DAG-based distributed ledgers hold significant potential due to their scalability, especially for IoT applications. However, addressing the challenges outlined in this study is crucial for the widespread adoption of this technology. Our study serves as a directional guide, elucidating the complexities of DAG-based systems and indicating potential possibilities for subsequent research and development. The extensive nature of our proposed simulator, with its comprehensive analysis capabilities, is a valuable asset in further investigating these challenges. Its application can provide deeper insights into the dynamics of DAG networks, aiding in the optimization of these systems for broader, practical deployment.

Figure 1. A visual representation of Tangle's directed acyclic graph. Each transaction is denoted by a rectangle containing its unique ID and its accumulative weight (AW). Transactions reference two preceding transactions, as indicated by the connecting edges. Green rectangles represent transactions that have been approved by milestones. Blue rectangles highlight transactions that, while not approved by a milestone, are referenced by other transactions. Lastly, the grey rectangles symbolize tips, which are transactions that have not yet been referenced by any subsequent transaction.

Figure 2 illustrates this comparison, showcasing histograms of inter-arrival times against fitted exponential distributions for three different nodes. While the results may not align perfectly with a pure Poisson distribution, they closely resemble a thinned version of the Poisson distribution, as described by Equation (6).

Figure 2. Inter-arrival times for different nodes over a specified simulation interval, demonstrating adherence to the Poisson distribution.

Figure 5. Weight update duration vs. transaction count: an analysis of computational overhead in accumulative weight calculations as a function of increasing transaction count.

Table 1. Comparison of DAG-based simulators in related work.
Our Work (2024): Generic DAG; the proposed generic implementation of the DAG data structure can easily be extended to multiple DAG-based distributed ledgers; analyzes two tip selection methods, accumulative weight, and Poisson distribution.

Table 2. Parameter settings for Poisson distribution analysis.

Table 3. Observations of overlap and Jaccard similarity over time.