An Adaptive Delay-Tolerant Routing Algorithm for Data Transmission in Opportunistic Social Networks

Abstract: In opportunistic networks, the requirement of QoS (quality of service) poses several major challenges to wireless mobile devices with limited cache and energy. This implies that energy and cache space are two significant cornerstones for the structure of a routing algorithm. However, most routing algorithms tackle the issue of limited network resources from a deterministic perspective, which lacks an adaptive data transmission mechanism. Meanwhile, these methods show relatively low scalability because they are typically built on special scenarios rather than general ones. To alleviate these problems, this paper proposes an adaptive delay-tolerant routing algorithm (DTCM) utilizing a curve-trapezoid Mamdani fuzzy inference system (CMFI) for opportunistic social networks. DTCM evaluates both the remaining energy level and the remaining cache level of relay nodes (two-factor) in opportunistic networks and makes reasonable decisions on data transmission through CMFI. Different from the traditional fuzzy inference system, CMFI determines three levels of membership functions through the trichotomy law and evaluates the fuzzy mapping from the two-factor fuzzy input to data transmission by curve-trapezoid membership functions. Our experimental results show that, within the error interval of 0.05~0.1, DTCM improves the delivery ratio by about 20% and decreases the end-to-end delay by approximately 25% compared with Epidemic, while its network overhead remains at a medium level.


Introduction
Opportunistic social networks [1] are a special combination of opportunistic networks and wireless multi-hop networks, in which clients carrying wireless mobile devices communicate with their peers within a communication area [1]. Without a stable and reliable communication link in remote areas, data transmission in opportunistic social networks undergoes intermittent connection and delay; therefore, the "store-carry-forward" strategy has also been used in this type of wireless network [2]. As a special type of wireless multi-hop network, opportunistic networks (OPPNET) are derived from wireless Ad-Hoc networking and delay-tolerant networking [2]. At the dawn of 5G and big data, opportunistic networks are not only a special topological structure, but also a specific form of data routing and forwarding. The hot research areas in opportunistic networks include mobility models, routing algorithms, security mechanisms, and application scenarios [3]. Due to the intermittent connections and variable network structure of opportunistic social networks, a source node in an unpredictable location attempts to communicate with the destination without an end-to-end path [4]. Consequently, the end-to-end communication connection between sources and destinations can be implemented by the opportunistic communication created by node mobility [5]. Table 1 compares several representative wireless opportunistic network routing protocols.

Table 1. The comparison of wireless opportunistic network routing protocols.

Routing Protocol    Implementation Details
Prophet             Frequency of encountering, cache space, and contact duration
BubbleRap           Node community and centrality, limiting data duplicates
SCORP               Node sharing interest, number of encounters and contact duration, storage
DABBER              Energy consumption, storage, and other routing metrics
ICMD                Cache management and node cooperation

To this end, this paper proposes an adaptive two-factor routing algorithm using the curve-trapezoid CMFI [26], namely DTCM. DTCM evaluates both the remaining energy level and the remaining cache level of relay nodes (two-factor), and then makes reasonable decisions on data transmission through CMFI. For the sake of comprehensively evaluating the impact of the two-factor input on data transmission, this paper proposes the curve-trapezoid CMFI [27][28][29][30], which determines the fuzzy uncertain relationship between the two-factor input and data transmission and evaluates the fuzzy control result by the de-blurring component. DTCM is an extensible algorithm because of its high adaptability. Data packets will be transmitted from message carriers to their next two hops based on the proposed two-hop feedback mechanism. Overall, the main contributions of this paper are summarized as follows: 1.
To analyze the fuzzy uncertain impact of the remaining energy and cache on data transmission, CMFI has been proposed. As an adaptive and extensible method for energy and cache, CMFI optimizes the selection of the next hop from neighbors via three fuzzy components.
The authors of [37] presented a position dynamic energy algorithm, known as MDOR, which develops an efficient communication link between the source node and the destination node in WSN; energy evaluation is analyzed over the communication links. Kang and Chung [38] introduced a PRoPHET-based routing algorithm, which proposed that the end-to-end data transmission between source and destination could be developed based on the remaining battery and delivery predictability. They provide a comparison between PRoPHET [39] and PRoPHET with periodic sleep. The PRoPHET-based routing algorithm leverages the remaining battery to determine the delivery predictability and select suitable next hops.
For the application scenario of wireless social networks, a routing algorithm can solve the problem of limited energy from the aspects of node similarity, community division, node centrality, and node trust. Lai et al. [40] developed a rotated-algebraic path-time codes (RA-PTC) routing algorithm to face the technological challenges in networks with energy-harvesting mobile devices, in which a complete path from source to destination can harvest enough energy for data transmission. Kulkarni et al. [41] recommended a privacy protection mechanism constructed based on the trust value of mobile vehicles, in which the energy consumption of data routing and forwarding is employed to evaluate the incentive scheme and trust value of nodes. Wei He [42] proposed an energy-saving routing approach based on K-means and FAH, namely the KAK routing algorithm. The objective of this algorithm is to handle the issues of node energy constraints, low delivery ratio, and short network cycle, with the support of an energy-consumption-reducing mechanism. Loreti et al. [43] introduced a novel analytic model optimizing the number of discovered peers for a given energy budget, which aims to minimize the network detection energy consumption by dynamically tracking the optimum even in non-stationary states. Bracciale et al. [44] constructed a neighbor discovery algorithm that adapts the duty cycle length to the time-varying context and contact duration. The goal of this approach is mainly to make a tradeoff between energy consumption and network performance.

System Model Design
The routing algorithm is a primary issue in the research field of opportunistic networks because of its particularity. The improvement of the transmission environment is reflected in increasing the delivery ratio and decreasing the end-to-end delay in opportunistic networks. Recently, many routing algorithms have been proposed based on different application scenarios. However, most routing algorithms tackle the issue of limited network resources using a deterministic approach, which lacks an adaptive data routing mechanism [45]. Moreover, these methods show relatively low scalability because they are typically built on the basis of special scenarios rather than general ones. This will likely cause a low quality of service from networks to clients. On the other hand, when the cache of nodes is insufficient, they cannot store and process the data information obtained from message carriers [46]. Consequently, this paper comprehensively considers the two significant factors of cache space and energy consumption through the CMFI fuzzy inference system.

The Frame of Adaptive Curve-Trapezoid Mamdani Fuzzy Inference and Decision-Making System
The limited energy and cache are two severe challenges for opportunistic networks. However, few routing algorithms consider the combined impact of the two factors on data transmission. Besides, most routing protocols tackle this issue from a deterministic perspective, which can be formalized as a linear or nonlinear mathematical expression. On the contrary, the relationship between data transmission and the two-factor input (energy and cache) is in actuality a type of fuzzy uncertain relationship, and should be defined as a level of membership in a fuzzy system. Therefore, we propose an adaptive two-factor data transmission algorithm based on CMFI. As shown in Figure 1, a numerical comparison is an intuitive way for a message carrier or source node to make decisions on data transmission, but it neither discovers the impact of variable change on data transmission, nor distinguishes the degree of influence on data transmission between different factors. Therefore, the CMFI system has been considered as a significant approach to achieve the above two goals. In this section, we will introduce the overall structure of CMFI step by step, as follows:

Figure 1. The overall structure of the curve-trapezoid Mamdani fuzzy inference system.
Step 1: to accurately determine the pattern recognition of data transmission capability of nodes, CMFI system obeys the tripartite method rule from Mamdani fuzzy inference system [37,38], thereby developing three levels of membership functions (low membership function, medium membership function, and high membership function) and the corresponding three levels of fuzzy subsets for each fuzzy input.
Step 2: the nodes in opportunistic social networks are defined as message users who carry several mobile devices, and two-factor fuzzy input could be regarded as fuzzy factor set from CMFI. The message carriers collect two-factor fuzzy input from their neighbor nodes and compute membership degrees for each of fuzzy inputs based on three levels of membership functions (low, medium, and high). In addition, three levels of fuzzy subsets (low, medium, and high) are determined by the three levels of membership functions.
Step 3: with the support of 'If-Then' fuzzy rules [36], there are nine different combinations that can be generated from the fuzzy subsets between the first-fuzzy and second-fuzzy inputs. These nine combinations reflect nine levels of data processing from nodes, respectively. Besides, neighbor nodes of message carriers are divided into three categories: active next-hop, potential next-hop, and under-resourced next-hop.
Step 4: Due to the specificity of two-factor fuzzy input, CMFI is employed to evaluate the impact of two-factor fuzzy input on the data transmission in opportunistic social networks. The abscissa and ordinate in each of the six membership degrees could produce a closed area with the corresponding level of membership function. The final control result can be acquired from the overlap of these six areas based on the centroid method [38].
Step 5: Through systematic and artificial parameter adjustment, CMFI can optimize the fuzzy control result, which provides support to node classification and to make data transmission more suitable and reliable in opportunistic social networks.
To this end, after the above five interconnected steps, message carriers can evaluate the ability of data storage and routing of neighbor nodes by calculating and comparing their fuzzy control result, thereby continuously optimizing node classification and the selection of the next-hop in the routing table of message carriers.

The First-Factor Fuzzy Input: Computation of the Remaining Energy Level
In opportunistic social networks, the purpose of a routing algorithm is to construct end-to-end communication connectivity from the source to the destination. However, wireless mobile devices with limited energy are hardly able to implement long-distance communication connectivity [47]. To this end, it is a challenge for a routing algorithm to use the limited energy of nodes to route data packets. DTCM focuses on the relationship between the residual energy and data transmission, thereby constructing a novel routing strategy for opportunistic networks [37,38].
As shown in Figure 2, when node i moves into the communication area of node j and communicates with node j, energy consumption is related to the mobility distance and the number of data packets. To quantify the whole process, we first define e(i,j) as the energy consumption of the data communication between node i and node j. Since the communication between two nodes can only occur in the same communication area, node i needs to move to a location that is covered by the communication area of node j. Accordingly, the energy consumption e(i,j) can be divided into two parts: communication energy consumption and mobility energy consumption. Therefore, with e_com(i,j) and e_mob(i,j) denoting the communication energy consumption and the mobility energy consumption, respectively, the energy consumption e(i,j) between node i and node j can be computed by e(i,j) = e_com(i,j) + e_mob(i,j).
Since the mobility energy consumption between a pair of nodes is related to the distance between them, we can calculate it from the perspective of movement and unit loss rate. On the other hand, the communication energy consumption is related to the amount of data that needs to be transmitted. We denote x and y as the horizontal and vertical coordinates of a relay node in the two-dimensional geographical plane, and define e_mob^unit as the energy consumption per unit distance, so e_mob(i,j) can be calculated by:

e_mob(i,j) = e_mob^unit × √((x_i − x_j)² + (y_i − y_j)²)

where x_i, x_j, y_i, and y_j are the horizontal and vertical coordinates of node i and node j, respectively. As shown in Figure 1, re(j,i) and te(j,i) are the energy consumption rate of node j receiving data from node i and the energy consumption rate of node j transmitting data to node i, respectively; e_com(i,j) is obtained from these two rates and the amount of data exchanged. Message carriers select a next hop based on the energy consumption of the communication between neighbor nodes and the destination node.
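As a rough sketch of this decomposition (in Python, with illustrative unit costs that are assumptions, not values from the paper), the total energy consumption e(i,j) = e_com(i,j) + e_mob(i,j) could be computed as:

```python
import math

# Illustrative unit costs; these values are assumptions, not from the paper.
E_MOB_UNIT = 0.02   # energy consumed per unit distance moved (e_mob^unit)
E_TX = 0.05         # energy to transmit one data packet
E_RX = 0.03         # energy to receive one data packet

def e_mob(pos_i, pos_j):
    """Mobility energy: unit loss rate times the Euclidean distance."""
    (xi, yi), (xj, yj) = pos_i, pos_j
    return E_MOB_UNIT * math.hypot(xi - xj, yi - yj)

def e_com(n_packets):
    """Communication energy: assumed proportional to the packets exchanged."""
    return n_packets * (E_TX + E_RX)

def e_total(pos_i, pos_j, n_packets):
    """e(i,j) = e_com(i,j) + e_mob(i,j), as defined in the text."""
    return e_com(n_packets) + e_mob(pos_i, pos_j)
```

Here the communication term is simplified to a per-packet cost; the paper's rates re(j,i) and te(j,i) could replace E_RX and E_TX per node pair.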
Since the energy consumption between each neighbor node and the destination node is different, the probability P(j,v) of each neighbor node being selected as a relay node is also different, where node j is a neighbor node of node i and R is the set of relay nodes that have been selected by node j to forward data packets to the destination node v. Only when P(j,v) of node j exceeds that of the other neighbor nodes will node j be regarded as a reliable relay node, and node i will forward data packets to it. Through the evaluation of energy consumption and selection probability, the energy consumption level of a mobile node can be defined as the ratio of the energy consumption of the current node to the total energy consumption of all neighboring nodes. To this end, L_e(node_j) can be expressed as:

L_e(node_j) = e(node_j, v) / Σ_{k∈R} e(node_k, v)

in which R is the set of neighbor nodes of node j.
To maintain the stability of the data transmission environment in the network, we set a lower bound on the remaining energy level of mobile nodes. When the remaining energy level of a mobile node is equal to or below this lower bound, the node needs to replenish energy and cannot participate in the routing and transmission of messages. Therefore, the remaining energy level L_re(node_j, ∆t) of node j can be updated as:

L_re(node_j, ∆t) = (e_rem(node_j, ∆t) − L_e^min) / (L_e^ini − L_e^min)

where e_rem(node_j, ∆t) is the remaining energy of node j at time stamp ∆t, L_e^min is the lower bound of the remaining energy level of mobile nodes in the network, and L_e^ini is the initial energy level of mobile nodes. Since the nodes move around randomly in the same communication area and constantly communicate with each other, they update their remaining energy level L_re(node_j, ∆t) at different time stamps ∆t. Because L_e^ini > L_e^min, the above equation ensures that L_re(node_j, ∆t) ∈ (0, 1].
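A minimal sketch of this normalization, assuming the remaining energy is measured on the same scale as L_e^ini and L_e^min (the bound values below are illustrative assumptions):

```python
# Illustrative bounds; actual values are simulation parameters, not fixed here.
L_E_INI = 1.0   # initial energy level of a node (L_e^ini)
L_E_MIN = 0.2   # lower bound below which a node must recharge (L_e^min)

def remaining_energy_level(energy_now):
    """Map the node's current energy into (0, 1]; a node at or below
    L_E_MIN is excluded from routing, matching the update rule above."""
    if energy_now <= L_E_MIN:
        return 0.0   # node must replenish energy; not a candidate relay
    return (energy_now - L_E_MIN) / (L_E_INI - L_E_MIN)
```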

The Second-Factor Fuzzy Input: Computation of the Remaining Cache Level
Congestion control is one of the most important issues in opportunistic networks because of the limited cache space of mobile devices. In general, there are two mainstream ways to effectively implement congestion control: limiting the number of message copies to avoid generating unnecessary packets, and deleting unnecessary packets in a timely manner. In DTCM, message carriers decide whether to forward data information to their neighbor nodes by analyzing the neighbors' storage capacity. To this end, the selection of the next-hop node is determined by evaluating the membership level of the cache space of neighbor nodes. In other words, congestion can be effectively avoided once the relationship between the data information and the cache space is determined.
To avoid data congestion in the data transmission process, the DTCM algorithm takes into account the average size of the data packets that need to be transmitted, the number of these data packets, and the remaining cache spaces of nodes. We first define CR(node_i) as the residual cache ratio, i.e., the ratio between the remaining cache space of node i and the size of the data packets transmitted to node i, which is formalized as:

CR(node_i) = CS_rem(node_i) / Σ_{s=1}^{u} (nu_s × si_s)

where CS_rem(node_i) is the remaining cache space of node i and u represents the total number of messages received by node i. Moreover, nu_s denotes the number of data packets in message s, and si_s is the average size of each data packet in message s. When CR(node_i) ≥ 1, node i has enough cache space to store the data packets and data congestion can be effectively avoided, so node i can be considered as a candidate for the next hop; otherwise, data congestion may occur in the cache space of node i, and message copies will not be forwarded to node i.
With the aim of comparing the storage capacity of neighbor nodes, we convert the residual cache ratio CR(node_i) to the remaining cache level L_c(node_i), which can be expressed as:

L_c(node_i) = CR(node_i) / ((1/|R|) Σ_{k∈R} CR(node_k))

where R is the set of neighbor nodes of node i. The larger the value of L_c(node_i), the greater the storage capacity of node i compared with the other nodes in the same communication area. Also, to maintain the stability and sustainability of the data transmission process, we set a lower bound L_c^min on the remaining cache level of mobile nodes in the network, and the remaining cache level is updated at each time stamp ∆t with respect to this lower bound. The updated remaining cache level falls in (0, 1] when L_c(node_i) > L_c^min, which indicates that the cache capacity of node i at ∆t is greater than the average cache space of the network; otherwise, node i is not a reliable next hop in the network because of its insufficient caching capacity. In other words, data congestion can be effectively controlled in the DTCM algorithm by two conditions: CR(node_i) ≥ 1 and L_c(node_i) > L_c^min.
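The residual cache ratio and the neighborhood-relative cache level described above can be sketched as follows (averaging over the neighbors is one plausible reading of the relative comparison, not a definitive implementation):

```python
def residual_cache_ratio(cs_rem, messages):
    """CR(node_i): remaining cache space over the total size of data to store.
    `messages` is a list of (nu_s, si_s) pairs: the packet count and average
    packet size of each message s."""
    demand = sum(nu * si for nu, si in messages)
    return cs_rem / demand

def remaining_cache_level(cr_i, neighbor_crs):
    """L_c(node_i): node i's residual cache ratio relative to the neighborhood
    average, so values above 1 indicate above-average storage capacity."""
    avg = sum(neighbor_crs) / len(neighbor_crs)
    return cr_i / avg
```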

Computation of Transmission Evaluation Value Based on CMFI
It is well known that most routing algorithms in opportunistic networks tackle the issue of data transmission from the perspective of deterministic problems, which can be formalized as linear or nonlinear mathematical expressions. However, the relationship between data transmission and the two-factor input (energy and cache) is actually a type of fuzzy uncertain mapping, and should be defined as a level of membership in a fuzzy system. In this section, we consider the issue as the membership relationship between the data-transmission capability of nodes and their two-factor fuzzy input (the remaining cache and remaining energy). The Mamdani fuzzy inference system [13,23] is adopted as the model in the proposed algorithm due to its high extensibility and applicability, and it consists of three components: the fuzzy component, the fuzzy inference component, and the De-Blurring component [13,24,25]. We introduce the rough process of each component in this section; the implementation details of CMFI are described in Appendix A.

The Fuzzy Component
The objective of the fuzzy component is to determine the membership degrees and fuzzy subsets of nodes. With the support of the method of tripartition, we develop the fuzzy component with three levels of membership functions (low, medium, and high). In general, a fuzzy input generates three levels of membership degrees, each of which corresponds to one level of fuzzy subset (low, medium, and high). There are many distribution models that can be adopted in various application scenarios in control engineering theory, such as the normal distribution, the F distribution, and the parabolic distribution [13,21]. As the remaining energy or cache falls below a certain lower limit, the node may be dead or unable to move; also, these two values must be below a network-adaptable upper limit, so the two-factor fuzzy inputs follow a half-trapezoid or trapezoid distribution. To this end, curve-trapezoid membership functions are utilized to compute the membership degrees of the two-factor fuzzy inputs [23,25] in the proposed improved fuzzy system, namely the curve-trapezoid Mamdani fuzzy inference system (CMFI). After the processing of the fuzzy component, each of the two-factor fuzzy inputs (energy or cache) generates three fuzzy degrees and fuzzy subsets.
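A possible sketch of such three-level curve-trapezoid fuzzification (the breakpoints and the quadratic shoulder shape are illustrative assumptions, not the paper's exact functions):

```python
def mu_low(x, a=0.25, b=0.45):
    """Falling half-trapezoid with a curved (quadratic) shoulder."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    t = (b - x) / (b - a)
    return t * t          # curved edge instead of a straight line

def mu_high(x, a=0.55, b=0.75):
    """Rising half-trapezoid with a curved shoulder."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    t = (x - a) / (b - a)
    return t * t

def mu_medium(x, a=0.25, b=0.45, c=0.55, d=0.75):
    """Curve-trapezoid: curved rising edge, flat top, curved falling edge."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        t = (x - a) / (b - a)
    else:
        t = (d - x) / (d - c)
    return t * t

def fuzzify(x):
    """Three-level fuzzification of one normalized fuzzy input in [0, 1]."""
    return {"low": mu_low(x), "medium": mu_medium(x), "high": mu_high(x)}
```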

The Fuzzy Inference Component
According to the 'If-Then' fuzzy rules [36], there are nine different combinations that can be generated from the fuzzy subsets of the first and second fuzzy inputs (the remaining cache and remaining energy). These nine combinations represent nine different levels of processing data packets. Moreover, neighbor nodes can be classified into three categories based on the two-condition nine-rule fuzzy inference logic [37,38], where active, potential, and under-resourced relay nodes correspond to high (levels 1, 2, and 3), medium (levels 4, 5, and 6), and low (levels 7, 8, and 9) membership levels, respectively. The active relay nodes are users that move frequently and have high information processing capacity, with high energy and cache priority [20,23]. On the contrary, under-resourced nodes can be considered dead users due to their low remaining energy and cache space. Potential relay nodes may be regarded as reliable next-hop users in the future. Accordingly, the source node selects active relay nodes as its next hop to route and forward data information through the dynamic fuzzy evaluation of data processing capability. This is only a preliminary fuzzy assessment, and the final decision-making requires a digital metric, which is determined by the De-Blurring component.
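The two-condition nine-rule logic can be sketched as a lookup table; the grouping of levels 1-3, 4-6, and 7-9 into categories follows the text, while the exact placement of each (energy, cache) pair within the nine levels is an illustrative guess:

```python
# Levels 1-9 from the nine 'If-Then' rule combinations; the grouping into
# categories follows the text (1-3 active, 4-6 potential, 7-9 under-resourced),
# but the placement of each (energy, cache) pair is an illustrative guess.
RULES = {
    ("high", "high"): 1, ("high", "medium"): 2, ("medium", "high"): 3,
    ("high", "low"): 4, ("medium", "medium"): 5, ("low", "high"): 6,
    ("medium", "low"): 7, ("low", "medium"): 8, ("low", "low"): 9,
}

def classify(energy_subset, cache_subset):
    """Map a pair of dominant fuzzy subsets to a relay-node category."""
    level = RULES[(energy_subset, cache_subset)]
    if level <= 3:
        return "active"
    if level <= 6:
        return "potential"
    return "under-resourced"
```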

The De-Blurring Component
The proposed curve-trapezoid Mamdani fuzzy inference system uses the maximum membership principle [13] to compute the corresponding membership degree, which can also be expressed as the method of "first OR then AND" operation [22]. The abscissa and ordinate of each membership degree can produce a closed area with the corresponding level of membership function. Obeying the mathematical expression of De-Blurring theory [38], we adopt the centroid of the overlap of these six areas to represent the fuzzy control result for the two-factor fuzzy inputs. The fuzzy control result is applied to optimize the classification of neighbor nodes, by which the source node or message carrier is able to make a suitable decision on data routing or forwarding in the data transmission process in opportunistic social networks.
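A minimal numerical sketch of this aggregation and centroid computation (the discretization grid and the min/max clipping convention are implementation choices):

```python
def centroid_defuzzify(clipped, n=1000):
    """Compute the crisp control value as the centroid of the aggregated area.
    `clipped` is a list of (membership_function, firing_strength) pairs: each
    fired rule's output membership function is clipped at its firing strength
    (min), and the clipped curves are combined pointwise with max."""
    num = den = 0.0
    for k in range(n):
        x = k / (n - 1)                             # grid over [0, 1]
        mu = max(min(f(x), w) for f, w in clipped)  # aggregate clipped curves
        num += x * mu
        den += mu
    return num / den if den else 0.0
```

For instance, a single rule fully firing over a uniform output function yields a centroid of 0.5, the middle of the output universe.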

Data Routing Based on Two-Hop Feedback Mechanism
In opportunistic networks, most traditional routing algorithms only consider the selection of the next hop in the data transmission process. On the contrary, the algorithm proposed in this paper pays attention not only to the next hop but also to the second hop [36,37], which improves the continuity of data transmission. We call this the "two-hop feedback mechanism" in this paper.
In general, message carriers in routing algorithms for opportunistic networks find a suitable relay node only among their neighbor nodes, while the two-hop feedback mechanism proposes that message carriers seek nodes with a higher transmission evaluation value than their own as the next hop [43]. As illustrated in Figure 3, the implementation of the two-hop feedback mechanism is based on two significant assumptions: Assumption 1. Node 4 is located in two intersecting communication areas simultaneously and will not leave the two communication areas within a very short time. Mobile devices in opportunistic networks communicate with each other by wireless short-range signals, so each of them has a different size of communication coverage. This makes the premise possible, and node 4 can acquire information from both communication areas.

Assumption 2.
Node 4 has enough processing and storage capacity; it could be, for example, a laptop or a vehicle. It acts as an ambassador between the two communication areas and is responsible for collecting and distributing the data associated with the final control result of each node. In practice, the amount of data that needs to be transmitted is small, because the message carrier node 1 only needs the value of the final control evaluation from each node.
From Figure 3, ∆t_1 and ∆t_2 are two very close time stamps. The source node 1 and its neighbor nodes node 2, node 3, node 4, and node 5 are located in the same communication area at time stamp ∆t_1, while node 4, node 6, node 7, node 8, node 9, and node 10 stay in another communication area at time stamp ∆t_2. The green arrows represent the gathering of information related to the transmission evaluation value, while the red arrows represent the process of routing and forwarding messages. According to the above assumptions, the two-hop feedback transmission can be implemented as follows: Step 1: Each node determines its own final control evaluation value FC and analyzes the membership level of its fuzzy inputs on data transmission.
Step 2: Node 4 collects the final transmission evaluation values FC from its neighbor nodes node 6, node 7, node 8, node 9, and node 10, and then forwards this information to the message carrier node 1.
Step 3: The message carrier or source node 1 collects the final transmission control results from its neighbor nodes node 2, node 3, node 4, and node 5, as well as from the second-hop nodes node 6, node 7, node 8, node 9, and node 10.
Step 4: Node 1 compares the collected final transmission results with its own value and optimizes the classification of nodes (active, potential, and under-resourced relay nodes).
Step 5: Because FC node 4 > FC node 1, FC node 7 > FC node 1, and FC node 9 > FC node 1, the source node 1 transmits data only to its neighbor node 4, which then sends the messages to node 7 and node 9; in this way, the messages can eventually be forwarded to the destination through the two-hop feedback mechanism.
Theoretically, the node with a larger final transmission control value has a greater capacity to process and store data information. The network performance can be further improved if the data packets are processed by these nodes. To this end, if data information is lost or cannot be collected, or if no node has a transmission control value greater than that of the message carrier, node 1 will keep the message duplicates in its own cache space and find active relay nodes by randomly moving to another communication area. Since the amount of data associated with the transmission control value that needs to be transmitted is extremely small, the relevant data can be collected promptly, even in the dynamic topology structure of opportunistic networks.
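Under the stated assumptions, the selection logic of the two-hop feedback mechanism can be sketched as follows (all FC values and node ids below are hypothetical, mirroring the Figure 3 example):

```python
def select_two_hop_route(fc_carrier, fc_neighbors, fc_second_hops):
    """Return (next_hop, [second hops]) whose final control values FC exceed
    the carrier's own, or None if no such neighbor exists (in which case the
    carrier keeps the message and moves to another communication area).
    `fc_neighbors` maps neighbor id -> FC; `fc_second_hops` maps neighbor id
    -> {second-hop id: FC}."""
    candidates = {n: fc for n, fc in fc_neighbors.items() if fc > fc_carrier}
    if not candidates:
        return None
    next_hop = max(candidates, key=candidates.get)
    second = [m for m, fc in fc_second_hops.get(next_hop, {}).items()
              if fc > fc_carrier]
    return next_hop, second

# Mirroring Figure 3: carrier node 1 (FC = 0.5) forwards via node 4,
# which relays to nodes 7 and 9 (all values hypothetical).
route = select_two_hop_route(
    0.5,
    {2: 0.3, 3: 0.4, 4: 0.8, 5: 0.2},
    {4: {6: 0.4, 7: 0.7, 9: 0.6}},
)
```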

Algorithm Complexity Analysis
For the analysis of expandability and applicability, Algorithm 1 is developed to analyze the complexity of the proposed DTCM algorithm. From Algorithm 1, during the process of information collecting and updating, nodes transmit their state information queues to the peers they encounter, so the time complexity of this process is O(n). Additionally, each node adopts the two-factor fuzzy inputs to determine its transmission evaluation value in the second loop, so the time complexity of this process is also O(n). Finally, the time complexity of the two-hop feedback mechanism is O(log2 n). Hence the overall time complexity of the DTCM algorithm is O(n + n + log2 n) = O(n). For comparison, the time complexity of the FCNS algorithm is O(n), and those of the Spray and Wait [47] and Epidemic algorithms [48] are O(n log n) and O(n), respectively.
Algorithm 1. Data routing and forwarding in the delay-tolerant routing (DTCM) algorithm based on the curve-trapezoid Mamdani fuzzy inference (CMFI) system.
Input: the source node node0, neighbor nodes node1, ..., noden, and the destination node noded
Output: FC(node1), ..., FC(noden)
Begin
  For each encounter:
    node0 collects the state queue information SQ(nodei) from nodei;
    nodei collects the state queue information SQ(nodei+1) from nodei+1;
  End for
  For each node nodei:
    compute its two-factor fuzzy input Lc(nodei) and Lre(nodei);
    determine the fuzzy evaluation subset FM(xj of nodei) and membership degree µ;
    assess the transmission evaluation value FC(nodei);
    output FC(node1), ..., FC(noden);
  End for
  If FC(nodei) > FC(node0) then
    the source node0 forwards messages to its neighbor nodei;
    If FC(nodej) > FC(nodei) then
      the relay node nodei transmits messages to its neighbor nodej;
    End if
  End if
End
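As a rough illustration of why the two loops of Algorithm 1 are each O(n), they can be sketched in Python; the function names and the placeholder fuzzy evaluator below are assumptions for illustration, not part of the paper:

```python
def dtcm_round(nodes, l_e, l_c, fuzzy_eval):
    """One DTCM round (sketch of Algorithm 1's two loops).

    nodes      : node identifiers
    l_e, l_c   : dicts of remaining-energy / remaining-cache levels in (0, 1]
    fuzzy_eval : callable (le, lc) -> transmission evaluation value FC
    Returns the FC value of every node; two O(n) passes, O(n) overall.
    """
    # Loop 1: collect state queue information (energy, cache) per node -- O(n)
    state = {n: (l_e[n], l_c[n]) for n in nodes}
    # Loop 2: map each node's two-factor fuzzy input to its FC value -- O(n)
    fc = {n: fuzzy_eval(*state[n]) for n in nodes}
    return fc

# Toy evaluator: a real CMFI would apply the curve-trapezoid membership
# functions; the arithmetic mean here is only a stand-in for illustration.
fc = dtcm_round(["a", "b"],
                {"a": 0.9, "b": 0.2},
                {"a": 0.8, "b": 0.3},
                lambda le, lc: (le + lc) / 2)
```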

Simulation
In this section, we conduct experimental studies in a realistic network mobility model where the theoretical assumptions of the proposed algorithm do not all hold, which provides a fair comparison between it and the compared protocols. The simulation not only illustrates the improved performance of the DTCM algorithm, but also provides insight into the suitable selection of network parameters such as the remaining level of network energy L_e min and the remaining level of network cache L_c min. In this simulation, we first evaluate the energy and cache consumption of the DTCM algorithm against a benchmark protocol, and then give a fair comparison between five routing protocols under the same experimental environment. The results demonstrate that the proposed protocol outperforms the others in terms of delivery ratio and end-to-end delay within a reasonable fault-tolerant interval.

Introduction of Benchmark and the Compared Protocols
In opportunistic networks, several mobility-model paradigms can be adopted to reproduce the real movement characteristics of people. To evaluate the resource consumption of the DTCM algorithm, a suitable candidate, namely Epidemic [48], which heavily relies on resource supply, is chosen as the benchmark protocol in this simulation. Moreover, to fully reflect the performance of the DTCM algorithm, it is compared against four other protocols: FCNS (a fuzzy routing algorithm using comprehensive node similarity) [13], P&F (Predict and Forward routing algorithm) [18], Spray and Wait [47], and Epidemic [48]. The performance comparison among the five algorithms mainly focuses on three aspects: delivery ratio, end-to-end delay, and network overhead. To comprehensively evaluate the implementation details and performance of each approach, Table 2 summarizes the significant differences among the five routing algorithms from the perspective of theoretical analysis. This comparison provides an effective benchmark for the five routing algorithms when data congestion has been controlled. Table 2. The details of implementation of the compared five algorithms.

Routing Algorithm | Details of Implementation
FCNS | Mamdani fuzzy inference system for node similarity; assessment of node attributes; two-hop transmission.
P&F | Two phases: prediction of meeting probabilities and forwarding of data packets; optimal stopping theory.
Spray and Wait | Nodes constantly exchange message copies when they meet; the number of data packets is limited.
Epidemic | Flooding model: message groups are transmitted to all neighbor nodes in the same communication area.
DTCM | Curve-trapezoid fuzzy inference system for cache and energy; node classification; two-hop transmission.

Experimental Parameter Setting
To provide a fair comparison, we give the five routing algorithms the same experimental setup; the only difference is the method of selecting the next suitable hop. HCMM (Home-cell Community-based Mobility Model), which is well known as a realistic community-based human mobility scenario, is used to test the performance of those algorithms. We allow Spray and Wait and Epidemic to implement data transmission across node communities, because they do not rely on the node community. The Opportunistic Network Environment (ONE) is used as the simulation platform. Furthermore, the other common experimental parameters are listed in Table 3.
It is worth noting that this experiment consists of 16 groups, and each group conducts the tests 5 times. In the HCMM model, each node is assigned a community, and nodes tend to move toward the community they are interested in. The nodes in the same community are randomly distributed, and communication between them is more frequent than between nodes from two different communities. Overall, HCMM simulates a real social scene in which nodes move randomly between communities. Additionally, on the basis of the general performance indexes of the ONE platform, this experiment mainly focuses on the following five indicators:

1. Energy consumption: the total energy consumption of the network, including node mobility, data transmission, task processing, etc.
2. The remaining cache: the size of the remaining cache space of a node after it stores data groupings.
3. Delivery ratio: the ratio of the number of data packets accepted by destinations to the total number of packets sent by message carriers, calculated as DR = PA/PS, where PA is the number of messages from all successful data transmissions and PS is the number of messages sent by all mobile nodes.
4. Average end-to-end delay: the delay from data routing, node waiting, and data transmission in a successful data transmission between a pair of mobile nodes, formalized as AD = TD/m, where TD is the total network delay and m is the number of data deliveries.
5. Network overhead: the total overhead of the whole data transmission process among all nodes in the communication area, including routing and forwarding delay, energy consumption, and memory usage.

Figure 4 plots the total network energy consumption of the five routing algorithms for various numbers of nodes. As seen in Figure 4, the total energy consumption is proportional to the number of nodes, because the energy assessment mechanism weeds out nodes whose energy levels are below the average. Since an increasing number of nodes means more relay nodes involved in the data transmission process, the total energy consumption rises sharply. As the benchmark protocol, Epidemic shows the highest energy consumption (35,000) as the number of nodes increases from 150 to 400. The flooding model adopted in Epidemic produces frequent data exchange among nodes and maximizes data grouping, which leads to relatively high energy consumption. DTCM adopts the fuzzy inference system to evaluate the residual energy of nodes and to ensure that the next hop has enough energy to process data information, so its energy consumption is the lowest among the five routing algorithms. The implementations of P&F and FCNS do not consider energy control and are based on complex structures, so their total energy consumption stays at a mid-horizon as the number of nodes increases from 150 to 400. The energy consumption of Spray and Wait is slightly lower than that of Epidemic, because its flooding strategy limits the number of data packets.
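The delivery-ratio and delay indicators above each reduce to a single ratio; a minimal sketch, with illustrative numbers rather than simulation output:

```python
def delivery_ratio(pa, ps):
    """DR = PA / PS: messages accepted by destinations over messages sent."""
    return pa / ps

def average_delay(td, m):
    """AD = TD / m: total network delay over the number of data deliveries."""
    return td / m

# Illustrative values only (not taken from the experiments)
dr = delivery_ratio(750, 1000)   # a delivery ratio of 0.75
ad = average_delay(2800.0, 20)   # an average delay of 140.0 minutes
```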
Figure 5 shows the comparison of the average remaining cache per node among the five routing protocols when the initial cache space is 35 Mb. As shown in Figure 5, the benchmark protocol (Epidemic), FCNS, and P&F are inferior to DTCM in terms of this evaluation parameter as the number of nodes increases from 50 to 400. This is because Epidemic, FCNS, and P&F fail to judge whether a node has enough cache space to store data groupings and even allow all message carriers to transmit data groupings to any node they encounter. On the contrary, DTCM adopts a cache assessment method to analyze the remaining cache space of each node and then selects a node with enough cache space as the next hop. Moreover, the remaining cache level L_c min sets a lower bound for all nodes in the network, due to which DTCM shows a higher average remaining cache than Spray and Wait when the number of nodes increases from 150 to 400. However, when the number of nodes is between 50 and 150, the sparse distribution of nodes results in a decline in the frequency of information exchange, and Spray and Wait also limits the number of data packets; therefore, its remaining cache consumption is lower than that of DTCM during this period.

Performance Comparison between the Five Algorithms
In this section, we provide a fair comparison of the performance of the five algorithms based on the same simulation environment, and analyze the distribution of errors, standard deviations, etc. In this way, the accuracy of the experimental results can be ensured.
As shown in Figure 6, the delivery ratio of the Epidemic algorithm is lower than that of the other algorithms all the time. The reason is that Epidemic is based on the flooding model, under which massive data groups lead to network congestion and performance degradation. The delivery ratios of P&F and Spray and Wait stay at a medium horizon all the time, because they utilize a two-phase mechanism to control the number of message duplicates and thereby avoid data congestion.

As seen in Figure 6, the delivery ratio of DTCM is approximately 0.75 as the cache increases from 10 to 35 Mb. DTCM evaluates the ability of each node to process messages through cache and energy and adopts a comprehensive fuzzy evaluation method to determine the relationship between the two-factor input and data transmission. More reliable relay nodes participate in the data transmission process, so the total number of hops from source to destination decreases and the delivery ratio improves. When the cache space is 15 Mb, the delivery ratio of FCNS is higher than that of DTCM, because FCNS improves the selection of relay nodes through their similarity. However, FCNS cannot handle the data congestion when the size of the cache space increases from 15 to 30 Mb, due to its simple structure.

Figure 6. Delivery ratio comparison from five algorithms with various cache spaces.

Figure 7 plots the distribution of errors from the experiments on delivery ratio. From Figure 7, the errors obey an approximately normal distribution and converge to 0.078 as the duration of the experiment increases, which lies within an acceptable fault-tolerance interval of 0.05~0.1.

Then, the performance comparison between the five algorithms in terms of average end-to-end delay is exhibited in Figure 8. As the cache space increases from 10 to 35 Mb, the average end-to-end delay of the Epidemic, FCNS, and P&F algorithms is higher than that of DTCM and Spray and Wait. Due to its small amount of data transmission, the delay of Spray and Wait is closest to the optimal.
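The claim that the errors converge within the fault-tolerance interval can be checked with a short sketch; the per-run error values below are hypothetical, chosen only so that the mean lands near the reported 0.078:

```python
def within_tolerance(errors, lo=0.05, hi=0.1):
    """Check whether the mean experimental error falls inside the
    fault-tolerance interval [lo, hi] (0.05~0.1 in this paper)."""
    mean = sum(errors) / len(errors)
    return lo <= mean <= hi, mean

# Hypothetical per-run delivery-ratio errors converging near 0.078
ok, mean = within_tolerance([0.09, 0.07, 0.08, 0.075, 0.075])
```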
As a type of two-phase scheme, P&F involves a special period known as the "wait stage", in which a long wait for the destination increases the end-to-end delay. Epidemic demonstrates an average end-to-end delay of about 140 min because of the large number of message duplicates in the network.

As shown in Figure 8, the fuzzy inference system in DTCM helps reduce the influence of independent variable changes on decision uncertainty. To this end, the average end-to-end delay of DTCM remains at a relatively low level all the time. FCNS is significantly inferior to DTCM in terms of average end-to-end delay. On the one hand, energy and cache space are two of the most significant metrics to be considered in the data transmission process. On the other hand, the curve-trapezoidal distribution adopted by CMFI is structurally simpler than the normal distribution used in FCNS, which meets the requirements of low computation and low energy consumption in most mobile devices. Overall, the average end-to-end delay of DTCM is nearly 25% lower than that of FCNS.

In Figure 9, the dots represent the distribution of standard deviations from multiple experiments, and the horizontal line represents the mean of these standard deviations. The means of the standard deviations of DTCM and Spray and Wait are about 0.8 and 0.9, respectively. The distribution of standard deviations of the other three algorithms is sparser, which means the accuracy of the experiments on DTCM and Spray and Wait is higher than that on the other three algorithms.

Figure 10 illustrates the performance comparison in terms of network overhead. The network overheads of FCNS and P&F are higher than those of the other three methods, because they consume more processing and storage resources with a higher complexity. The network overhead of DTCM always stays at a medium horizon among the five routing methods, with an average value of 400. As the cache space increases from 10 to 35 Mb, each node under DTCM can store, process, and forward more information, and network resources are constantly being consumed. To this end, the network overhead of DTCM rises from 325 to 450 and reaches its maximum when the cache space is 35 Mb. As seen in Figure 10, the network overhead of FCNS rises slowly. When the cache space is 20 Mb, a node can obtain more information about the network state, and the network overhead is at its lowest value of 320, because DTCM uses an adaptive method to reduce energy consumption and avoid data congestion. With a simple structure and the flooding model, Spray and Wait and Epidemic need fewer network resources to transmit messages, so they demonstrate a relatively low horizon of network overhead. Figure 11 shows the distributions of the mean and error of the network overhead of DTCM.
As we can see, the errors from the experiments with cache parameters of 10, 15, and 20 Mb are higher than those with cache parameters of 25, 30, and 35 Mb. In other words, with a larger cache setting, the experimental results have greater reference value.

To sum up, through several experiments and different parameter settings, we comprehensively compare the performance of the five routing algorithms on three indicators. Moreover, we analyze the error, mean, and standard deviation of these experiments. Our experimental results demonstrate that, within the error interval of 0.05~0.1, DTCM improves the delivery ratio by nearly 20% and decreases the end-to-end delay by approximately 25% as compared with the Epidemic algorithm, and the network overhead of DTCM is at the middle horizon.

Discussion
One might ask what the performance of DTCM would be without the fuzzy inference system, and why the fuzzy inference system plays such an important role in DTCM. From the perspective of theoretical analysis, the complex relationship between the influencing factors and data transmission is not a simple mathematical mapping and cannot be formalized as a deterministic or stochastic function. Fuzzy inference, however, can compute the membership degrees of the different influencing factors and determine their membership levels, thereby judging whether a given data routing choice is reasonable. From the application level, without a fuzzy inference system, DTCM would make routing decisions directly from the amounts of remaining energy and cache space, but it would not be clear how much impact each factor has on data transmission or which one has the greater influence. Dynamic weight adjustment and the optimization of node classification would be difficult to achieve. As the number of nodes increases, the routing decisions of such an algorithm become more and more unreliable, so this type of algorithm lacks extensibility and self-adaptability.
Another issue that needs to be discussed is the energy level of mobile devices in the real world. DTCM is a linear optimization algorithm that can be processed in polynomial time, which indicates that it can also be implemented on mobile devices (mobile phones, laptops, etc.) with limited energy. Meanwhile, the mobility model, data storage, and energy consumption set in this experiment all mimic real application scenarios. To this end, DTCM shows a certain degree of extensibility and applicability in real application scenarios.

Conclusions
This paper proposes an adaptive delay-tolerant routing algorithm using a curve-trapezoid Mamdani fuzzy inference system in opportunistic social networks (DTCM). Different from the latest algorithms, DTCM brings the mechanism of fuzzy inference into the process of data routing and transmission. DTCM investigates the combined impact of remaining energy and cache space on data transmission from the perspective of a fuzzy, uncertainty-aware method. The objective of DTCM is to determine the fuzzy mapping from the two-factor input to data transmission and to optimize the results of node classification with manual and systematic parameter adjustment. Another vital innovation of this paper is an improved two-hop feedback mechanism, in which data can be transmitted from message carriers to their next two hops based on the membership level of the fuzzy input. The experimental results demonstrate that, within a tolerable error interval of 0.05~0.1, DTCM improves the delivery ratio by nearly 20% and decreases the end-to-end delay by about 25% as compared with Epidemic, and the network overhead of DTCM is at the middle horizon.
Author Contributions: S.C., Z.C., J.W. and K.L. conceived the idea of the paper. S.C. and K.L. designed and performed the experiments; S.C. and K.L. analyzed the data; Z.C. and J.W. contributed reagents/materials/analysis tools; S.C. and K.L. wrote the paper; S.C., Z.C., J.W. and K.L. revised the paper. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. The Details of Curve-Trapezoid Mamdani Fuzzy Inference System
The remaining cache and energy evaluate the capacity of a node to store and process information shared by message carriers. However, these two factors are also influenced by other elements, such as the total number of tasks in the node, the way energy is supplied, or the caching protocol of the node. To this end, the levels of remaining cache and remaining energy may change over time. If the transmission evaluation value between nodes were computed from these constantly varying indexes, the message carrier might make an inaccurate data transmission decision. Accordingly, we construct the transmission evaluation value by adopting a strategy of fuzzy inference, which assesses the relationship between the two factors and data transmission from the perspective of levels of ambiguity rather than determinate values, and, more importantly, greatly reduces the interference of other uncertain factors on the final data transmission decision. A fuzzy inference system can thus be developed to evaluate the impact of the two-factor input on data transmission and to compute the transmission evaluation value. The Mamdani fuzzy inference system [13,23] is adopted as the system model in the proposed algorithm due to its high extensibility and applicability; it includes three components: a fuzzification component, a fuzzy inference component, and a defuzzification (De-Blurring) component [13,24,25]. Next, we introduce the construction of each component in detail.

The Fuzzy Component
As a regular component of the Mamdani fuzzy inference system, the fuzzifier determines the specific membership degree of the two-factor fuzzy input by means of different membership functions. According to the trichotomy law of the Mamdani fuzzy inference system [24], three levels of membership degree for each fuzzy input can be determined by three levels of membership function: a low, a medium, and a high membership function. The different levels of membership function determine, to different degrees of importance, the influence of each fuzzy input on the data transfer decision. Many reference models can be adopted in different application scenes in control engineering theory, such as the normal distribution, the F distribution, and the parabolic distribution [13,21]. When the remaining energy or cache falls below a certain lower limit, the node may die or be unable to move; and both values must stay below a network-adaptable upper limit, so the two-factor fuzzy inputs follow a half-trapezoid or trapezoid distribution. To this end, curve-trapezoid membership functions are utilized to compute the membership degree of the two-factor fuzzy inputs [23,25] in the proposed improved fuzzy system, namely the Curve-trapezoid Mamdani Fuzzy Inference system (CMFI). The traditional trapezoidal membership function is linear, but the one proposed in this paper is curvilinear, which is favorable for expressing the smooth relationship between the two-factor input and data transmission.
As demonstrated in Figure 1, the remaining cache and energy levels are defined as the two-factor fuzzy input of the CMFI system, the domain of which can be regarded as U_f = {L_re(node_i, ∆t), L_c(node_i, ∆t)} with U_f ⊆ (0, 1]. To express the fuzzy input concisely, the two-factor fuzzy input values (the remaining cache level and energy level) can be formalized as in Equation (A1), in which x1 of node_i and x2 of node_i represent the remaining energy level and the remaining cache level of node_i, respectively.
Then the domain of the factor subset (also called the factor subset) can be transformed accordingly. For a fuzzy subset A of a given domain U_f, a fuzzy mapping from U_f to the interval (0, 1] can be represented by µ_A : U_f → (0, 1], where µ_A is the membership function corresponding to the fuzzy set A. Therefore, a fuzzy inference mapping from the domain U_f to the interval (0, 1] can be represented accordingly. Obeying the trichotomy law of the Mamdani fuzzy inference system, the fuzzy subsets of the given domain U_f can be defined at three levels, low, medium, and high, whose combination F(U_f), namely the fuzzy power set or evaluation subset, can be denoted such that A_low, A_medium, and A_high represent the low, medium, and high membership subsets of the two-factor fuzzy input, respectively. In fuzzy set theory, except when the membership degree is 1 or 0, belonging and not belonging have no sharp meaning, so the fuzzy power set can be computed accordingly. From the perspective of mathematical theory, F(U_f) is a mapping from the domain U_f to the interval (0, 1]. However, because belonging and not belonging are fuzzy and indeterminate, the fuzzy power set F(U_f) is the general form of the ordinary power set P(U_f), which can be represented as F(U_f) ⊃ P(U_f). Consequently, the fuzzy relationship between the two-factor fuzzy input and the data transmission process can be evaluated by a classical Mamdani fuzzy inference system.
After establishing the fuzzy mapping from the factor subset to the evaluation subset over the domain (0, 1], CMFI needs to define a valid membership function for each fuzzy input. Referring to the membership function of the traditional trapezoidal distribution and to the variation trend of node information in the simulation experiment, we define the low, medium, and high membership functions of the proposed curve-trapezoid Mamdani fuzzy inference system [21] for the two-factor fuzzy input of node_i as Equations (A5)-(A7), respectively.
For the low membership function, when the energy and cache level lie between 0 and p_1, the node can effectively process data information; when the remaining energy or cache level is higher than the highest level of the network, p_2, the node will be assigned heavier tasks by the network and consume a lot of network resources. For the medium membership function, when a node's energy or cache level is higher than the network's highest level p_3 or lower than the network's lowest level p_2, the node's membership level continues to decline due to heavier network tasks and fewer network resources. For the high membership function, when the energy and cache level is lower than p_1, the node will die; moreover, when it is higher than the maximum energy level of the network p_2, the node's membership level is always 1. For each of the two-factor fuzzy inputs, three levels of membership degree are generated from the three levels of membership functions, and the three levels of membership subsets for the two-factor fuzzy input of node_i can be represented with Zadeh's representation notation. The coordinate graphs of the three levels of membership functions (the low, medium, and high curve-trapezoid membership functions) are demonstrated in Figure A1, where (a), (b), and (c) show the low, medium, and high curve-trapezoid membership functions, respectively. As shown in these subfigures and in Equations (A5)-(A7), p_1, p_2, p_3, and p_4 represent four different syncopation points and k is the exponent of the different levels of the trapezoid function. Besides, three different levels of membership degree, µ_low, µ_medium, and µ_high, corresponding to the three levels of membership functions, can be determined for each fuzzy input (the remaining cache level or the remaining energy level).
Due to the requirement of low computation and cache on mobile devices, the three levels of membership functions keep a simple piecewise distribution between the two-factor fuzzy input and the membership degree.
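Equations (A5)-(A7) are not reproduced in this excerpt. The following Python sketch shows one plausible family of curve-trapezoid membership functions consistent with the description above; the concrete syncopation points p1-p4, the default exponent k, and the exact ramp shapes are assumptions for illustration, not the paper's equations.

```python
def mu_low(x, p1, p2, k=2.0):
    """Low level: full membership up to p1, curved descent on (p1, p2), zero above."""
    if x <= p1:
        return 1.0
    if x >= p2:
        return 0.0
    return ((p2 - x) / (p2 - p1)) ** k

def mu_medium(x, p1, p2, p3, p4, k=2.0):
    """Medium level: curved ascent on (p1, p2), plateau on [p2, p3],
    curved descent on (p3, p4), zero outside (p1, p4)."""
    if x <= p1 or x >= p4:
        return 0.0
    if p2 <= x <= p3:
        return 1.0
    if x < p2:
        return ((x - p1) / (p2 - p1)) ** k
    return ((p4 - x) / (p4 - p3)) ** k

def mu_high(x, p3, p4, k=2.0):
    """High level: zero up to p3, curved ascent on (p3, p4), full membership above p4."""
    if x <= p3:
        return 0.0
    if x >= p4:
        return 1.0
    return ((x - p3) / (p4 - p3)) ** k

def fuzzify(x, p=(0.2, 0.4, 0.6, 0.8), k=2.0):
    """Map one crisp input (a remaining energy or cache level in (0, 1])
    to its three membership degrees. The syncopation points are placeholders."""
    p1, p2, p3, p4 = p
    return {"low": mu_low(x, p1, p2, k),
            "medium": mu_medium(x, p1, p2, p3, p4, k),
            "high": mu_high(x, p3, p4, k)}
```

Setting k = 1 recovers the classic linear trapezoid, so the exponent k is precisely what bends the ramps into the curvilinear shape the paper argues for.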

Figure A1. Three different levels of membership functions for the CMFI system.

Appendix A.2. The Fuzzy Inference Component
To be specific, the fuzzy inference component is the mapping from the fuzzy factor subset to the fuzzy evaluation subset, where the fuzzy factor subset is defined as the membership level corresponding to the membership degree, and the fuzzy evaluation subset is the membership level corresponding to the transmission evaluation value. Following the mathematical expression of control engineering theory, the fuzzy factor subset of the proposed fuzzy inference system is formalized in Equation (A10), in which FI_{node_i}(x_j^{node_i}) represents the fuzzy factor subset of the different levels of membership degree of node_i, and A_low^{x_j^{node_i}}, A_medium^{x_j^{node_i}}, and A_high^{x_j^{node_i}} are the low, medium, and high fuzzy subsets corresponding to the low, medium, and high membership degrees of the two-factor fuzzy input of node_i, respectively. For each fuzzy input, the proposed fuzzy system constructs three levels of fuzzy factor subsets, so for the two fuzzy inputs there are six different fuzzy factor subsets, from which nine different fuzzy rules and nine different fuzzy evaluation subsets are generated in the system. A mapping relationship R from factor subsets to evaluation subsets can therefore be defined. Based on the maximum membership principle of the Mamdani fuzzy inference system [13], the final fuzzy subset of each fuzzy input µ_x^{node_i} for the two-factor fuzzy input of node_i is the union before intersection of its three levels of fuzzy factor subsets. Table A1 demonstrates the nine fuzzy rules from the two-factor fuzzy subsets to the evaluation fuzzy subsets, based on the "If-Then" fuzzy rules of the curve-trapezoid Mamdani fuzzy inference system. From Table A1, it is evident that the impact of the remaining energy level of a node on data transmission is deeper than that of its remaining cache level.
This is because nodes die automatically when their energy is exhausted or falls below the minimum network energy limit, whereas nodes with insufficient cache simply cannot store the data information from the message carrier but remain alive. As shown in Table A1, different combinations of fuzzy subset inputs correspond to different fuzzy subset outputs, each of which indicates the membership degree of the corresponding fuzzy rule in the data transmission process, representing the fuzzy, uncertain relationship between the two-factor fuzzy input and data transmission in opportunistic networks [36-38]. According to the two-condition, nine-rule fuzzy inference logic of CMFI, Figure A2 illustrates the classification of the source node's neighbor nodes using the trichotomy law, in which active, potential, and under-resourced relay nodes correspond to high (levels 1-3), medium (levels 4-6), and low (levels 7-9) membership levels, respectively. Active relay nodes are users who move frequently and have a high information processing capacity, with high energy and cache priority. On the contrary, under-resourced nodes can be considered dead users due to their low remaining energy and cache space. Potential relay nodes may be regarded as reliable next-hop users in the future. Accordingly, the source node selects active relay nodes as its next hop to route and forward data information, through a dynamic fuzzy evaluation of the data processing capability of the relay nodes.
However, this is only a preliminary fuzzy assessment; the final decision-making requires digital metrics, which are determined by the De-Blurring component.
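The two-condition, nine-rule classification can be sketched as follows. The rule table is a hypothetical stand-in for Table A1 (not reproduced in this excerpt), built only on the stated principle that the remaining energy level dominates the remaining cache level; the composition shown is the classic Mamdani max-min scheme (minimum within a rule, maximum across rules), which may differ in detail from the paper's "first OR then AND" formulation.

```python
# Hypothetical rule table (energy level, cache level) -> node class.
# Low energy alone disqualifies a node, reflecting that energy
# dominates cache in the paper's Table A1.
RULES = {
    ("high", "high"): "active",
    ("high", "medium"): "active",
    ("high", "low"): "potential",
    ("medium", "high"): "potential",
    ("medium", "medium"): "potential",
    ("medium", "low"): "potential",
    ("low", "high"): "under-resourced",
    ("low", "medium"): "under-resourced",
    ("low", "low"): "under-resourced",
}

def classify(mu_energy, mu_cache):
    """Classic Mamdani max-min inference over the nine rules.
    mu_energy, mu_cache: dicts mapping level -> membership degree for one node.
    Returns the winning node class and all aggregated rule strengths."""
    strength = {"active": 0.0, "potential": 0.0, "under-resourced": 0.0}
    for (e_lvl, c_lvl), out in RULES.items():
        fire = min(mu_energy[e_lvl], mu_cache[c_lvl])  # AND within a rule
        strength[out] = max(strength[out], fire)       # OR across rules
    return max(strength, key=strength.get), strength
```

For example, a node whose energy memberships are mostly "high" and whose cache memberships are mostly "medium" would be classified as an active relay and preferred as the next hop.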

Figure A2. Fuzzy inference and node classification based on "If-Then" rules from the CMFI system.


Appendix A.3. The De-Blurring Component
The proposed curve-trapezoid Mamdani fuzzy inference system uses the maximum membership principle to compute the corresponding membership degree, which can also be expressed as the method of the first-"OR"-then-"AND" operation [13,22]. However, the fuzzy evaluation subsets can only be regarded as a primary classification of nodes in terms of reliability of data transmission, rather than the final decision assessment value. To this end, the De-Blurring component is constructed to determine a precise classification value for node classification, judging whether the current node node_i is a reliable next hop or not. In a classical fuzzy inference system, the De-Blurring component can adopt the centroid method, the average weighting approach, or the maximum membership method. Because small changes in the independent variables produce large differences in the dependent variable under the centroid method, which contributes to relatively high classification accuracy, it is adopted in the proposed fuzzy system.
To provide a way to adjust parameters manually, we define W(w_1, w_2) as the weight matrix for the two-factor fuzzy input U_f = {L_re(node_i, ∆t), L_c(node_i, ∆t)}, consisting of two weight values w_1 and w_2 for L_re(node_i, ∆t) and L_c(node_i, ∆t), respectively, from which the fuzzy relationship matrix Mr is defined. Combined with the weight adjustment matrix W(w_1, w_2), the fuzzy transformation from the factor set U_f to the evaluation set F(U_f) yields three single fuzzy control results of the two-factor fuzzy input for the low, medium, and high fuzzy subsets, respectively, which are then normalized. According to the maximum membership principle, the final fuzzy control result FC_{node_i} is the maximum of the three single membership control values after the first-"OR"-then-"AND" operation. To effectively distinguish the impact of the two-factor fuzzy input on data transmission, we adopt the centroid method to evaluate the final fuzzy control result. Following the centroid method, the proposed fuzzy system uses the first-"OR"-then-"AND" operation to determine the final fuzzy control result, where the control result value of each single fuzzy inference rule is calculated by the "OR" operation and the minimum control result value of the nine fuzzy inference rules is computed by the "AND" operation. To be specific, as shown in Figure A3, if the high membership degree of one fuzzy input is the coordinate point P(p_2, 1), then the blue shaded region is the control result of this type of fuzzy rule under the "OR" operation.
Moreover, for the two-factor fuzzy input, this fuzzy system generates six different fuzzy evaluation subsets, nine different fuzzy rules, and six different blue shaded areas by the "OR" operation. On the basis of the "AND" operation, the overlap of these six blue shaded areas, shown as the blue shaded region in Figure A4, can be regarded as the final control result for the coordinate point P(p_2, 1).
Following the formal expression of mathematics, we adopt the centroid of the overlap of these six blue shaded areas to represent the final fuzzy control result for the two-factor fuzzy input L_re(p_2, 1) and L_c(p_2, 1), which can be written as FC_{node_i} = (Σ_{b=1}^{n} x_b^{node_i} · y_b^{node_i}) / (Σ_{b=1}^{n} y_b^{node_i}), where FC_{node_i} represents the final control result for node_i with its six single blue shaded areas, x_b^{node_i} and y_b^{node_i} respectively represent the abscissa and ordinate values of the boundary coordinates of the blue shaded area in Figure A4, and n is the number of boundary coordinates of the blue shaded area in Figure A4. In fact, x_b^{node_i} and y_b^{node_i} are, respectively, the fuzzy input and the corresponding membership degree derived from the remaining cache and energy level of node_i.
During the data transmission process in opportunistic social networks, the source node or message carrier makes a reliable data routing or forwarding decision by computing and comparing the transmission evaluation values (the final control values) of its neighbor nodes.
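The discrete centroid computation described above can be sketched as follows, assuming the aggregated (clipped) output region is available as a list of n boundary samples (x_b, y_b); the sample coordinates used in the test are illustrative, not taken from the paper.

```python
def centroid_defuzzify(boundary):
    """Discrete centroid of the aggregated fuzzy output region.
    boundary: list of n (x_b, y_b) samples on the region's boundary,
    where x_b is a fuzzy-input value and y_b its membership degree.
    Returns FC = sum(x_b * y_b) / sum(y_b), i.e. the final control value."""
    num = sum(x * y for x, y in boundary)
    den = sum(y for _, y in boundary)
    return num / den if den > 0 else 0.0

def pick_next_hop(neighbors):
    """Choose the neighbor with the highest transmission evaluation value.
    neighbors: dict mapping node id -> its boundary sample list."""
    return max(neighbors, key=lambda n: centroid_defuzzify(neighbors[n]))
```

A source node would call `centroid_defuzzify` once per neighbor and forward to the neighbor with the largest final control value, matching the comparison step described above.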