Comparative Agent-based Simulations on Levels of Multiplicity Using a Network Regression: A Mobile Dating Use-case

Abstract: We report an agent-based model to compare the effectiveness of simple and complex mobile dating application interfaces in generating matches for virtual users. We define the relative complexity of dating applications as the number of available features and dub this variable the multiplicity. We replicate some of the most popular mobile dating applications through the generation of a synthetic population endowed with attributes, preferences, and behaviors drawn from the literature. We treat our data as a network dataset and use a robust statistical procedure (MRQAP) to issue a valid and reliable comparison between simulated applications. We show how the quadratic assignment procedure can be used to compare network simulations rigorously. As a result, we observe a direct relationship between multiplicity and agent-level experiences and expectations in match generation. We also observe the emergence of divergent matching systems with minor rule changes as well as several expected properties of online dating systems. This work serves as a proof-of-concept in the integration of classical social network analysis methods with agent-based modeling to compare virtual designs and to enhance the policy-generation process of online social networks.


Mobile dating applications have experienced dramatic growth in user adoption, with the 15 most popular commercial instances acquiring 247 million downloads in 2018 alone [1]. Currently, roughly 15% of all U.S. adults report using a dating app to find a mate. By 2021 the market focused on virtual matching is expected to reach $3-$5 billion with year-over-year growth of 25% [2]. The surge in mobile dating revenues is driven by several factors, including more accepting social norms and better accessibility. In a 2005 study conducted by Pew Research Center, 44% of Americans indicated that "online dating is a good way to meet people" [3]. By 2013, 59% agreed with this statement and 21% indicated that "online dating has lost much of its stigma."

Today's mainstream mobile dating applications offer common features and a market-driven (emerged) universal user interface, but they continue to vary in levels of multiplicity, that is, the number of features available to users in order to achieve some desired result. The desired result is often, but not always, finding a suitable mate [4]. Although previous research has explored several aspects of online dating, e.g. recommender systems [5,6], online disinhibition [7,8], mate preferences [9][10][11], and self-presentation [12][13][14], few works have tended towards a system-level agent-based modeling approach (see [15,16]).

Tinder is a popular mobile dating application, partly due to its simple and user-friendly interface. This single application generates an estimated 15 million matches daily [2]. Once it is installed on a mobile device, new users are prompted to create a profile to showcase themselves to potential matches. Profiles consist of images uploaded by the user and a single text-based input field that can be used as a profile summary. More recently, a field in the user interface for "Interests" was also incorporated. Upon completion of the user profile, users can choose matching preferences based on age, gender, distance, location and other attributes. Users sort sequentially through profiles of potential matches shown to them by the application's recommendation algorithm. Users can then perform an action to either accept or reject any given profile presented to them by the embedded algorithm. After each decision, a user is presented with a new profile and the acceptance/rejection process is repeated. When two users 'like' each other, a notification is displayed. This is deemed a 'match' and users are now able to communicate directly in a private chat room.

As Tinder's popularity grew, other mobile dating applications emerged with their own features and interfaces. Hinge, another popular mobile dating application, allows users multiple pathways to match, extending the Tinder-like accept/reject functionality while providing more powerful search features. Hinge allows users the option to create more detailed profiles that specify ethnicity, political views, desire to have children, and other common attributes and preferences. Users can also enrich their profiles with clever prompts that enhance their appeal to visitors; profile visitors can 'like' these or comment on them. As on Tinder, when mutual likes occur users are then 'matched'.
A key difference between Tinder-like applications and Hinge-like applications is the requirement that users mutually 'like' one another prior to activating communication features. Hinge-type applications allow users to send messages concurrently with 'like' actions if they desire, while Tinder-type applications require a match before communication can occur. This enables communication as an embedded feature in the 'like' request for applications similar to Hinge and allows for more specialized matching strategies. The ability to send a message with a like request differentiates Hinge-type from Tinder-type applications in important ways and creates more decision pathways to reach a match, which we dub multiplicity. We argue that the multiplicity of dating applications plays a principal role in aggregate system outcomes and the overall experiences of users. Moreover, we argue that users can navigate the range of available applications to maximize their dating potential and match utility based on the advantages or disadvantages of the system-level multiplicity of considered applications. To that end, users who have not generated much interest when using Tinder-type applications can opt to utilize a user interface that allows for a more specialized matching strategy, perhaps one that places more emphasis on tailored personal messages rather than strict personal attributes (e.g. age, ethnicity, education). A more specialized strategy can only be undertaken on dating systems that allow for multiple pathways to matching, that is, those with a higher multiplicity. Consequently, multiplicity could be directly correlated with match success for non-majority groups who must select a specialized mate-selection strategy often divergent from the mainstream.
In this paper, we construct principal models of two dating application agent-based simulations, one with a low multiplicity representing applications such as Tinder, and another with a higher level representing applications such as Hinge. We dub them Multiplicity Level 1 (M1) and Multiplicity Level 2 (M2), respectively. Both models were assigned rule-sets with the purpose of replicating their respective real-world counterparts. Our approach combines agent-based modeling and classical social network analysis as our go-to analytical frameworks. As outputs from our simulation models we construct networks containing various link types reflecting agent interactions through the simulated mobile dating application. We define a 'like' interaction as a one-sided, directed tie representing interest of a sender agent in some receiving agent, a 'match' as dyadic interest of two agents (two-directional 'like' links), a 'dislike' as a directed rejection, and a 'message' as a combination interaction (mechanistically, we model this as a 'like' with a higher probability of a response). Through implementation of numerous simulations we investigate the contribution of various mechanisms in generating matches.
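The interaction types defined above can be illustrated with a small directed graph. This is a minimal sketch rather than the simulation code itself: the helper names `record_interaction` and `matches`, the `kind` edge attribute, and the node labels are hypothetical, and treating a 'message' as a positive tie follows the description of a message as a 'like' with a higher probability of a response.

```python
import networkx as nx

def record_interaction(graph, sender, receiver, kind):
    """Store one directed interaction; kind is 'like', 'dislike', or 'message'."""
    graph.add_edge(sender, receiver, kind=kind)

def matches(graph):
    """Return unordered pairs whose positive ties ('like' or 'message') are mutual."""
    positive = {"like", "message"}
    found = set()
    for u, v, data in graph.edges(data=True):
        if (data["kind"] in positive
                and graph.has_edge(v, u)
                and graph[v][u]["kind"] in positive):
            found.add(frozenset((u, v)))
    return found

g = nx.DiGraph()
record_interaction(g, "m1", "f1", "like")      # one-sided interest
record_interaction(g, "f1", "m1", "message")   # reciprocated, with a message
record_interaction(g, "m2", "f1", "like")
record_interaction(g, "f1", "m2", "dislike")   # directed rejection, no match
match_pairs = matches(g)
```

Only the pair with mutual positive ties is returned as a match; the rejected 'like' remains a one-directional edge in the graph.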

This paper advances the state of the art in social network simulations in several ways. Primarily, the paper relies on social science theory to simulate an application's interface using well-studied social effects drawn from peer-reviewed literature in the absence of relevant data, which were often unobtainable. Although this approach is not common in the social sciences, it is necessary for our specific inquiry. Public datasets do exist, but in two forms, each of which is either not useful or of very limited use. The first is aggregate data from non-peer-reviewed sources, much of which is not held to high empirical standards in collection or reporting, which hinders proper and standard evaluations of data integrity. Much of this data comes in the form of summary totals of likes and matches listed on Internet blogs by commercial entities. It has been shown that the connection between aggregate data sources and agent decision rule probabilities is generally non-linear and mathematically intractable [17], rendering such data sub-optimal for agent-based model development, and in the absence of complete aggregate datasets (joint distributions, for example), we consider them to be insufficient for external validation and thus unusable. The second form, which we utilize in a stylized manner, is drawn from empirically reviewed sources (e.g. [18]), and this collection of data and models offers statistical agent attribute and behavioral insights into the mechanics of online mate selection. We consider them sufficient in offering insights towards our central question: How does the multiplicity of dating applications affect agent and aggregate matching outcomes? Thus, we make good and reasonable use of this collection.

We rely on homophily [19][20][21][22] as a primus inter pares foundational driver for our virtual experiments.
We incorporate this sociological framework in our models, though admittedly ancillary sociological processes contribute to the overall potential of matching and mate selection on online dating platforms. The principle of network homophily implies that "people's personal networks are homogeneous with regard to many socio-demographic, behavioral, and interpersonal characteristics" [23]. Homophily is a powerful social dynamic which often indicates that agents prefer those who are similar to themselves in many areas of social, economic, and cultural discourse. As a cognitive process, it is shown to be correlated with incentives for positive interactions resulting from shared knowledge, beliefs, and attributes. In the context of online dating, perceived similarity has been shown to correlate with positive attraction. The authors of [24] measure the effects of homophily in mate selection by sampling a pool of users on an online dating application and analyzing their preferences for potential partners in a range of categories. They analyze similarities such as desire for children, level of education, and physical appearance, and report that similarity increases the likelihood of a match. There is much evidence along this line of inquiry that supports this finding [25? -28]. Specifically, we incorporate general notions of homophily in age, physical attractiveness, ethnicity and gender using a mix of discrete uniform categorical (ethnicity, gender) and continuous uniform (age, attractiveness) input attributes. Restricting our model to uniformity in the generation of agents' attributes provides a baseline of analysis and reduces the possibility that our conclusions are influenced by our agent population generative choices. Thus, homophily is a reasonable choice as the theoretical framework for model design.

We use the Python module NetworkX [29] to build our model and run simulations.
NetworkX is a module of the Python programming language that allows for the manipulation of network data and/or simulations through edge list objects or adjacency matrices. We use NetworkX's node functionality to represent the agents of our model, and the link functionality to represent interactions between the agents (likes or messages).

To ease the process of comparing outputs from our models, the number of agent attributes was kept to a plausible minimum and kept consistent across both models. Agent populations were endowed with a binary gender (male and female), a generic ethnicity chosen from 4 categories, physical attractiveness (between 0 and 1), and age. All agent attributes were assigned uniformly. In both model variations (M1 and M2), we restrict population size to 1000 agents. To duplicate users' time constraints and, in some instances, online applications' fixed limits (Tinder fixes the number of likes per day that a user can generate under their basic plan), in every turn of the simulation agents are presented with no more than 40 agents of the other gender to consider. Dating applications will often present users with profiles within a set range as well, and so agents in our models are only presented with those who are within 10 years of age difference. Finally, each agent evaluates a compatibility score of the considered agent and records an appropriate 'like' or 'dislike' response based on heterogeneous thresholds.

Table 2. Rules applied by agents for calculation of overall compatibility (total score). Compatibility is assigned as the additive accumulation of all assigned attributes (age, attractiveness, ethnicity). The second column conveys conditions under which the rule calculation is presumed. All value calculations and parameters are assigned such that they will produce a maximum partial score of 1. Independent compatibility scores are then added to produce the overall compatibility score.
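The population and presentation rules described in this section can be sketched as follows. This is an illustrative sketch under stated assumptions, not the simulation's code: the function names are hypothetical, the age range of 18 to 65 is taken from the validation discussion later in the paper, and the 68% male share is the proportion reported in the population-structure paragraph.

```python
import random

ETHNICITIES = ["A", "B", "C", "D"]  # four generic categories
AGE_RANGE = (18, 65)                # assumed from the validation discussion
MALE_SHARE = 0.68                   # proportion reported later in the paper

def make_population(n=1000, seed=0):
    """Generate n agents with uniformly assigned attributes."""
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "gender": "M" if rng.random() < MALE_SHARE else "F",
            "ethnicity": rng.choice(ETHNICITIES),
            "attractiveness": rng.random(),   # continuous uniform on [0, 1)
            "age": rng.uniform(*AGE_RANGE),   # continuous uniform age
        }
        for i in range(n)
    ]

def candidates(ego, agents, rng, max_shown=40, max_age_gap=10):
    """Profiles shown to ego in one turn: other gender, within the age gap."""
    pool = [
        a for a in agents
        if a["gender"] != ego["gender"]
        and abs(a["age"] - ego["age"]) <= max_age_gap
    ]
    rng.shuffle(pool)
    return pool[:max_shown]
```

Keeping the presentation filter separate from the decision rule means the same population can be passed to either model variant without modification.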

A compatibility score is a measure based on the attributes of alters (considered agents) and the preferences of ego (the agent). Applying a basic threshold criterion, if a given compatibility score exceeds ego's compatibility threshold, the probability of a "like" is set to some maximum-allowed parameter value. If this threshold is not reached, the probability of a "like" is minimized. We chose probability parameters arbitrarily in the absence of data but relied upon intuition to guide their relative magnitudes. If two paired agents record a bi-directional "like", a "match" is recorded. This threshold assignment is consistent with social threshold models [30] and their dating counterparts [11]. We summarize all agent rule calculations in Formula 2.

Agents in our model favor homophily in ethnicity and attractiveness as referenced in [23], while holding differential preferences in age (male agents preferring younger female agents and female agents preferring older male agents). As agents are randomly selected in every turn of the model, they will evaluate whether to like each other based on these attributes and a previously set threshold. Generally, threshold levels determine the likelihood of a like or dislike event occurring and are summarized in Figures 1 and 2.

Parameter estimates are drawn from several peer-reviewed sources from within the literature on online dating. To begin, male agents were given a Gaussian preference for younger female agents where the mean difference was 3.996 years, and female agents were given a preference for older males with a mean difference of 2.046 years [31]. The compatibility score is intended to ensure that the maximum score contributed by age resides at this mean, and the greater the departure from this mean difference, the lower the contribution of age to the total compatibility score. Physical attractiveness and ethnicity are based on similar principles. As we have noted, agents evaluate ethnicity and attractiveness based on the concept of homophily, with much higher scores attributed when two agents of the same ethnicity or a similar attractiveness interact.

We also utilized a weighting mechanism for both male and female agents, as genders may value attributes differently when these are considered jointly with other attributes. The weights are multiplied by the agent-agent interaction scores and then summed to yield a total score. This total score is then compared to a threshold value to determine the occurrence of a like relation. In line with the study in [32] showing that males value physical features more than females, male agents were given a higher weight for attractiveness preference (β_3^M = 0.4) than female agents (β_3^F = 0.3).
[33] concludes that females place greater weight on finding a potential match within their own ethnicity, and so we adjusted the weighting and contribution for ethnicity to account for this. Finally, the weighting of scores on age by both genders remained the same (0.3), as there was little evidence that the contribution to a total score differed between genders, even in light of different age preferences for males (µ = −4.0) than for females (µ = 2.05).
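The weighted, additive compatibility score described above might be sketched as below. Several pieces are assumptions made only for illustration and are marked as such in the code: the spread of the Gaussian age preference (`AGE_SD`), the ethnicity weights, the score of 0 for differing ethnicities, and the linear similarity used for attractiveness. The age means and the attractiveness and age weights are the values quoted in the text.

```python
import math

# Mean preferred age difference (alter age minus ego age), from the text.
AGE_MEAN_DIFF = {"M": -3.996, "F": 2.046}
AGE_SD = 5.0  # ASSUMPTION: the spread of the Gaussian is not given here

# Age and attractiveness weights follow the text; the ethnicity weights
# are HYPOTHETICAL placeholders chosen so that each row sums to 1.
WEIGHTS = {
    "M": {"age": 0.3, "attractiveness": 0.4, "ethnicity": 0.3},
    "F": {"age": 0.3, "attractiveness": 0.3, "ethnicity": 0.4},
}

def age_score(ego, alter):
    """Gaussian-shaped partial score that peaks at ego's preferred age gap."""
    diff = alter["age"] - ego["age"]
    mu = AGE_MEAN_DIFF[ego["gender"]]
    return math.exp(-((diff - mu) ** 2) / (2 * AGE_SD ** 2))

def attractiveness_score(ego, alter):
    """Homophily: 1 for identical attractiveness, falling linearly (assumed)."""
    return 1.0 - abs(ego["attractiveness"] - alter["attractiveness"])

def ethnicity_score(ego, alter):
    """Homophily: 1 for the same category, 0 otherwise (assumed)."""
    return 1.0 if ego["ethnicity"] == alter["ethnicity"] else 0.0

def compatibility(ego, alter):
    """Weighted sum of partial scores, each of which is at most 1."""
    w = WEIGHTS[ego["gender"]]
    return (w["age"] * age_score(ego, alter)
            + w["attractiveness"] * attractiveness_score(ego, alter)
            + w["ethnicity"] * ethnicity_score(ego, alter))
```

Because every partial score is bounded by 1 and the placeholder weights sum to 1, the total stays within [0, 1], which is the scaling the text requires for comparison against a threshold.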

Since our model is subject to theoretical matching system fundamentals [34], synthetic population structure and heterogeneity play an important role in model specification. The choice of male to female agent ratios could, if chosen unwisely, create unintended consequences and artifacts in our model. Thus, we rely on [3]: the author reports that Tinder's user base is roughly 60% male and 40% female. Because this statistic is a rough estimate drawn from non-peer-reviewed sources, we consider it a guide (and not a rule) to our implementation and model specification. As a result, we chose population proportion parameters consistent with the overall notion (that there are many more males than females on dating applications) at a rate of 68% men and 32% women, in line with the general principle of a 2:1 ratio.

Once agents are presented with a profile to evaluate, they proceed with said evaluation (S ∈ [0, 1]) against a pre-set parameter to determine whether they will like the considered agent(s). These threshold parameters are assigned arbitrarily in the absence of data but follow intuitive patterns: [...] we assume that x ∈ U[0, 1] = 0.5, then F_1 will view her compatibility with M_1 as 0.3 + 0.3 + 0.3 = 0.9, or 90%. The M1 model specifies that since her compatibility score is greater than the designated threshold, 0.5 (Figure 1), there is an 80% probability of a "like" occurring. If the compatibility score were less than 0.5, the probability of a 'like' would be 30%.
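The threshold rule in the worked example can be written in a few lines. This is a minimal sketch of the M1 decision step only; the function names are hypothetical, while the threshold of 0.5 and the probabilities of 80% and 30% are the values quoted above.

```python
import random

def like_probability(score, threshold=0.5, p_high=0.8, p_low=0.3):
    """Probability of a 'like' given a compatibility score in [0, 1]."""
    return p_high if score > threshold else p_low

def decide_like(score, rng, **params):
    """Bernoulli decision: compare a U[0, 1) draw with the like probability."""
    return rng.random() < like_probability(score, **params)

rng = random.Random(0)
liked = decide_like(0.9, rng)  # the worked example: score 0.9, threshold 0.5
```

With a score of 0.9 the agent likes the presented profile in roughly 80% of draws, so a single decision remains stochastic even when the threshold is clearly exceeded.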

This previous example illustrates the general mechanics of the M1 model (e.g. Tinder), where a message may not be sent with each like interaction. In model M2's specification (e.g. Hinge) two key differences exist. Firstly, agents may choose to send a 'like' or a 'like with a message'. We assume that a like with a message increases the probability of a reciprocated like (we assume messages are positive and well-suited). Consequently, the second key difference between our specified M1 and M2 models is that ego in M2 considers the like of the alter in their decision to like alter. This latter effect is known as reciprocity [36] and is a direct result of the difference in our hypothetical applications' user interfaces and multiplicity, the difference being that applications such as Hinge (M2) allow users to attach a message to a like prior to matching, while applications such as Tinder (M1) do not. This forces users to evaluate incoming interactions not only based on the sender's attributes, but on the message attached to the interaction as well. Finally, we should report that we chose to specify all decision points as Bernoulli trials evaluated against draws from U[0,1], and that coefficients attached to our formulaic rules (Figure 1) scale the compatibility score to be within [0,1] to ensure a one-to-one comparison scheme.

As we have noted, our objective was to compare two models with varied multiplicity levels in terms of aggregate outcomes. We chose to focus our effort on matches that agents accumulate in each model. We carried out a formal inquiry of the statistical parameters and distributions of likes, messages, and matches, and evaluated conditional distributions through a network regression, the Multiple Regression Quadratic Assignment Procedure, a suitable method for our analysis that accounts for our model's rule dependencies without inappropriate assumptions.
While the literature on methods of network comparisons is copious [37], we chose MRQAP for its minimal assumptions and for its relative ease of interpretation, which reduces output comparison to a comparison of the coefficients of a regression model.

The Multiple Regression Quadratic Assignment Procedure (MRQAP) is used to test for significance in an observed correlation where dependency between two or more dyadic relations may exist [38]. MRQAP relies on a non-parametric permutation-based test that preserves the integrity of the observed network structures to report confidence. This approach was originally developed by [39] to identify geographic clustering of diseases [38]. Mantel and colleagues noted that covariates in geographic datasets are highly co-dependent and typically not independently and identically distributed (i.i.d.) due to relational processes. Consequently, the covariates are unsuitable for regression models that assume independence of observations; a common example is the method of Ordinary Least Squares (OLS).

MRQAP has been developed and deployed as a mainstream network analysis tool. The procedure is particularly useful when calculating coefficient magnitudes and parameter estimates and, as a corollary, the strength of social effects, through permutations of network statistics. Since our synthetic population samples, rules, attributes, and mechanisms are not i.i.d., we relied on this standard social network analysis model to compare outputs from our simulations. We omit a comprehensive discussion of the nature of this method at this time, but direct the reader to the aforementioned citations. However, to summarize the method's mechanics: MRQAP provides regression coefficients as do classical regression models (ordinary least squares), but its assumptions do not fall within the General Linear Model (GLM). Since we cannot guarantee some of the basic assumptions of OLS when considering relational data structures (assumptions such as bivariate normality or the independence of observations), MRQAP calculates coefficient estimates but permutes the dependent and independent variables on the structure of the network [40], yielding conditional p-values. Simply put, MRQAP answers: How likely is a model, given a randomization of the dependent variable on the network's structure?
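The permutation logic summarized above can be sketched with NumPy. This is a deliberately simplified variant and not the procedure used to produce the reported tables: it permutes only the rows and columns of the dependent matrix (a plain Y-permutation QAP) rather than implementing a semi-partialing scheme, and the function names are hypothetical.

```python
import numpy as np

def _offdiag(m):
    """Flatten a square matrix, dropping the diagonal (self-ties)."""
    n = m.shape[0]
    return m[~np.eye(n, dtype=bool)]

def _ols(y_vec, xs):
    """Least-squares coefficients: intercept first, then one per predictor."""
    design = np.column_stack([np.ones(y_vec.size)] + [_offdiag(x) for x in xs])
    coef, *_ = np.linalg.lstsq(design, y_vec, rcond=None)
    return coef

def qap_regression(y, xs, n_perm=500, seed=0):
    """Return observed coefficients and the share of permutations >= each."""
    rng = np.random.default_rng(seed)
    observed = _ols(_offdiag(y), xs)
    n = y.shape[0]
    count_ge = np.zeros_like(observed)
    for _ in range(n_perm):
        p = rng.permutation(n)
        y_perm = y[np.ix_(p, p)]  # relabel nodes: same permutation on rows and columns
        count_ge += _ols(_offdiag(y_perm), xs) >= observed
    return observed, count_ge / n_perm
```

Relabeling nodes keeps the internal structure of the dependent network intact while breaking its alignment with the predictors, which is why the resulting proportions can be read as conditional p-values.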

As we have explained, MRQAP is well-suited for our comparative aims. Consider that M1 (Figure 1) emulates the simplest dating application interface; an example we repeatedly cite is Tinder. We begin our analysis by considering likes and matches as edges (ties) in a network; in both simulations' code-base, edges record all temporal interactions through the application's interface. Our analysis is then conducted using the cross-sectional network that emerges from the temporal interactions. Framing our analysis in this way allows for the use of MRQAP and its powerful permutation-based inference while accounting for non-independence in our model with minimal interfering assumptions.

To compare M1 to M2, we consider their shared output; in our case the common output is matches. We ensure that inputs among the two models are identical for a one-to-one comparison. A synthetic population instantiated for M1 can apply M2's behavioral rules without loss of generality. We use ethnicity, age, and physical attractiveness as input attributes, and likes as an input variable in the network regression model. In turn, if we observe a statistical difference in the models' parameter estimates, given that we have randomly generated synthetic populations in both models according to the same rules, then this difference must be due to the difference in agent behaviors between the two models and not due to the attributes of the agent population. In our case, the difference between M1 and M2 lies squarely in an increase in multiplicity. The M2 model allows for the sending of a message with a like and M1 does not; hence, reciprocity as a social effect. In Section 4 we present typical results from a representative run of 1000 nodes for both model types, and Figure ?? contains visualizations of the agent decision processes in both simulations.

Our primary process of internal validation was to ensure that all synthetic populations were generated to our strict requirements of uniformity for attractiveness [0,1], age, and ethnicity (categorical) [A, B, C, D]. Tables ?? show the results of a representative comparison between generations of a synthetic population. The tables show agreement with our expectations and supply confidence that our code-base is reflective of our models' specifications. Since our goal is to quantify the difference between the models based on the given multiplicity of dating applications, or the structure of this particular system, no further statistical validation of agent attributes was necessary: the difference in outputs given structural differences is the measure of comparison in this case.

To elaborate further, we would expect that the mean attractiveness for our population would be roughly 0.5 since we designated this variable as U[0,1], ≈ 42 years of age as the mean value between 18 and 65, and for ethnicity, a categorical variable, to be equally scaled among the 0th, 25th, 50th, and 75th percentiles (lower bounds). This is clearly shown in our reported tables.

The general shape and location of the probability distributions of likes (Figure 4) was similar by design in both M1 and M2, with male agents generating slightly more likes than female agents. Male and female agents in both models liked and disliked other agents according to a bell-shaped distribution. This was anticipated and is likely a result of the use of Gaussian differential inputs in age and attractiveness. Because both models can essentially be viewed as matching processes, lower counts of female agents meant that male agents will 'like' more frequently given a fixed amount of activity per turn. Overall, there was no statistical difference in the total number of likes when comparing M1 to M2 without conditioning on gender. Table 7 shows the result of a paired t-test comparing the mean difference between statistics in M1 and M2. Excepting 'like' relations, which we designated to be equivalent in both simulations, we found statistical differences between the model types. As shown, M1 (e.g. Tinder) produced more likes, dislikes and matches but fewer messages (by design) than M2 (e.g. Hinge).

The similarity in the liking distribution, however, did not result in equiprobable matching distributions (Figure 5) either by gender or by model type. M1 model mechanics (Tinder) generated more aggregate matches for both male and female agents than M2, as measured by median value. Dispersion was greater for both agent genders in M1 when compared to counterparts in M2 simulations. Female agents in both simulations received more matches than male agents by mean and median values. The probability distribution for female agent matches exhibited a longer tail, with top female agents receiving 3 times the number of matches of median males in M1 and 5 times that of median male agents in M2. This implies a greater clustering of matches for female agents than for male agents. However, the median number of matches for male agents in both model types was much more probable than its female agent analogue, with the male agent median value in M1 representing 7% of the sample compared with 4% for female agents, and 11% and 5% for M2 agents respectively.

Messaging distribution (Figure 6) analysis was relatively straightforward to consider since agents in M1 simulations did not possess the ability to send messages. M2 agents sent a median of 1 message for male agents and 2 messages for female agents, and the messaging probability distribution carried a left skew, with some agents sending more than 10 messages per simulation run.

Tables ?? provide summary statistics for like, dislike, match and message distributions for a representative run in aggregate form and act as an additional internal validity test for simulation runs. Of central interest is the mean number of matches (µ) generated by M1 when compared to M2: almost precisely double. What is of additional significance is the variance reduction and shorter tail of M2. Specifically, the coefficient of variation for M1 is 0.42 while for M2 we calculate it as 0.53. This suggests a more even and equal distribution of matches among nodes; in real-world terms, agents who were previously unsuccessful in receiving many matches in M1 receive more matches in M2, hinting towards the emergence of specialized strategies.

By framing our outputs as networks of interactions between agents, the resultant 'like' and 'match' networks shown in Figures ?? gain significance. Since much of the mechanism of the underlying agent rules was intentionally stochastic, so would be our outputs. Nevertheless, non-random structure is evident and can be presumed to have emerged from model rules. M1's match network produced a large component and many isolates, indicating that small preferences (in age, for example) can produce clusters of nodes where there is high matching probability. The M2 match network showed highly dyadic structure (rather than transitive, as in M1), suggesting that agents matched with one or two other agents, but not within larger clusters of agents as in M1.

To ensure a principled comparison, we utilized the network regression technique MRQAP [39] to investigate the relationship of our monadic and dyadic covariates. MRQAP requires that we convert node attributes such as attractiveness, age, and ethnicity into difference (or similarity) adjacency matrices. Intended as a one-to-one comparison, the models in Table 8 included the 'like' network, attractiveness difference, age difference and ethnicity difference as independent/predictor variables and the 'match' network as the dependent variable. We consider the effects of covariate inputs on producing a match as an output given the multiplicity of our two model types.

Reported are three models of interest: a single model describing M1 simulation results and two models for M2 simulation runs, the first of which (M2-F) is a model without an effect for messaging, a false model, and the second of which (M2-T) includes the 'message' network as an independent variable. Of note is the order of magnitude of all parameter estimates, which tend towards remarkably small values. For example, the M1 like relation estimate is 0.255; assuming variable independence and given in absolute terms, it took roughly 4 likes to produce a match for M1 agents. This is in line with expectations about the density of the like network when compared to the density of the match network, the former expected to be more dense and the latter more sparse.

Confidence in the reported parameter estimates is conveyed by the proportion of network permutations that meet lower-end one-tailed Pr(>=b), upper-end one-tailed Pr(<=b), and two-tailed Pr(>=|b|) tests. These reports communicate important features of the simulated systems.
For 349 example, 100% of the ensemble of networks permuted by M1 regression resulted in parameters 350 for the like relation that were smaller (<=b)) than the observed effect size of 0.255 and that were 351 larger than (>=b)) -0.0061 for M2-F and -0.0061 for M2-T. This represents high confidence that 352 this parameter differs significantly from a random observation of the same structural network. Also 353 reported is a two-tailed measure of confidence (Pr(> | = |)) expressing the skewness of the sampling 354 distribution of estimates.
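The conversion of node attributes into difference matrices and the permutation test described above can be sketched as follows. This is a minimal illustration, not the procedure used to produce Table 8: it uses the simplest QAP scheme (permuting the rows and columns of the dependent network together), whereas MRQAP implementations often use more elaborate variants, and all function names here are hypothetical.

```python
import numpy as np


def difference_matrix(values):
    """Absolute pairwise difference |v_i - v_j| for a numeric node attribute."""
    v = np.asarray(values, dtype=float)
    return np.abs(v[:, None] - v[None, :])


def mismatch_matrix(labels):
    """1.0 where two nodes carry different categorical labels, else 0.0."""
    lab = np.asarray(labels)
    return (lab[:, None] != lab[None, :]).astype(float)


def off_diagonal(matrix):
    """Flatten an n x n matrix, dropping the diagonal (self-ties)."""
    n = matrix.shape[0]
    return matrix[~np.eye(n, dtype=bool)]


def ols(y_vec, x_vecs):
    """Least-squares coefficients; the intercept is the first entry."""
    design = np.column_stack([np.ones_like(y_vec)] + list(x_vecs))
    beta, *_ = np.linalg.lstsq(design, y_vec, rcond=None)
    return beta


def qap_regression(y, xs, n_perm=100, seed=0):
    """Simple Y-permutation QAP (an assumption, not necessarily the paper's variant).

    Returns the observed coefficients and, for each coefficient, the share of
    permuted estimates that are <= b, >= b, and >= |b| in absolute value.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    x_vecs = [off_diagonal(np.asarray(x, dtype=float)) for x in xs]
    observed = ols(off_diagonal(y), x_vecs)
    permuted = np.empty((n_perm, observed.size))
    for k in range(n_perm):
        p = rng.permutation(y.shape[0])
        # Relabel nodes of the dependent network: same structure, shuffled identities.
        permuted[k] = ols(off_diagonal(y[np.ix_(p, p)]), x_vecs)
    pr_le = (permuted <= observed).mean(axis=0)
    pr_ge = (permuted >= observed).mean(axis=0)
    pr_abs = (np.abs(permuted) >= np.abs(observed)).mean(axis=0)
    return observed, pr_le, pr_ge, pr_abs
```

Under these assumptions, a call such as `qap_regression(match, [like, difference_matrix(age), mismatch_matrix(ethnicity)])` would return one estimate and three permutation proportions per term, mirroring the layout of the reported columns.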

The 3 models presented in Table 8 explore a true model for M1 with all available monadic and dyadic effects, a false model for M2 (M2-F) without the messaging network, and a second model for M2 (M2-T) with messaging included. It was clear that all models are significant when compared to random. M2-F generated a model with a much higher standard error (0.25) when compared with M1 (0.097) and the true M2-T model (0.076). The coefficient of determination for the M1 model (R² = 0.18) was greater than both M2 model equivalents, given that the agent decision points needed to achieve a match are more numerous in the M2 model regressions. Each of these decision points is subject to additional (uniform) random draws. This increases the amount of uniform random noise in simulation runs but leaves the reliability of the models intact.

Table 9. Parameter estimates ordered by contribution to the match network (left compartment), from larger values to smaller values (including negative values), and by magnitude rank (overall contribution). For example, while the like relation for the M2-T model was a top contributor to the match dependent variable (rank by magnitude = 1), it was also a negative parameter estimate, placing it 5th in the order of contribution.

[Table 9 body not recovered; column groups: Order of Contribution, Rank by Magnitude.]

Beyond model-level reliability and validity, our virtual experiment provides agreeable results. We rank coefficient estimates in Table 9 by order of contribution (from positive estimates to negative estimates) and by coefficient size (magnitude). Here we are observant of the coefficient estimates and the direction and distribution of likely parameters generated by the permutations of the quadratic assignment procedure. That is, we are interested in rank order (of appearance), magnitude (absolute value), and the ensemble of networks with the same structure as our observed network. Figure 11, Table 8, and Table 9 summarize these quantities.
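The two rankings can be illustrated with a short sketch. It assumes one possible reading of "rank by magnitude", namely that estimates sharing the same power of ten share a rank (which would explain tied ranks); that reading, the function names, and the coefficient values below are all hypothetical and are not taken from Table 8 or Table 9.

```python
import math


def order_of_contribution(coefs):
    """Names sorted from the largest signed estimate down to the most negative."""
    return sorted(coefs, key=lambda name: coefs[name], reverse=True)


def magnitude_rank(coefs):
    """Dense rank by power of ten of |estimate| (assumes no estimate is zero)."""
    exponents = {
        name: math.floor(math.log10(abs(b))) for name, b in coefs.items()
    }
    distinct = sorted(set(exponents.values()), reverse=True)
    return {name: distinct.index(e) + 1 for name, e in exponents.items()}


# Hypothetical estimates chosen only to show a sign/magnitude divergence.
example = {
    "like": -0.2,
    "message": -0.3,
    "attractiveness": 0.004,
    "ethnicity": 0.002,
    "age": -0.0005,
}

print(order_of_contribution(example))
print(magnitude_rank(example))
```

With these made-up values, the like and message terms rank first by magnitude yet fall last in the order of contribution because they are negative, which is the kind of divergence the two orderings are meant to expose.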

The M1 regression model produces a straightforward linear system where more 'like' activity produces more matches. This is due to a positive intercept, a positive like parameter estimate with a valid test statistic on the upper tail (Pr(<= b)), and greater differences in attractiveness. For age difference, we find that only 55% of permuted networks produced higher estimates than observed. Primarily, this is due to the systemic differences in how male and female agents evaluate age, with males preferring slightly younger females and females preferring slightly older males. As a consequence, estimates are centered in a range that describes opposing positive and negative differences in age. By rank order of contribution, the like variable is the largest contributor to matches, followed by attractiveness, ethnicity, and then age. Ranked by magnitude, the like relation is the primary contributor to matches, while attractiveness and ethnicity contribute along the same rank, and age contributes (in absolute terms) roughly 5 times less than whether an agent likes another of an ideal difference in age versus 3 or more standard deviations away, placing it 3rd in magnitude rank.

It was a useful exercise in model validity to report two models for the M2 dating system: one which included the messaging network (M2-T) and one which omitted it (M2-F). The first clue to the non-applicability of M2-F's regression model is the relatively high residual standard error (0.25) when compared to the true model (0.076), roughly three times larger. The non-inclusion of the messaging network as an independent variable caused the quadratic assignment procedure to shift the (positive) estimates for attractiveness and ethnicity to negative estimates and to assign the parameter estimate for ethnicity a much smaller weighting (from 10^-6 to 10^-4). Under conditions where these data had been gathered from a real-world sample, one might be inclined to accept these differences at face value, but because the messaging behavior was incorporated in the simulation's specifications, we know that a proper model must include the messaging network (as M2-T does) and that M2-F must therefore be false. The greater clue comes from the rankings of effects by magnitude (Table 9), which show that M1 and M2-T share the same order of effect ranking, that is, the variables contribute similar amounts to the creation of a match for agents, while M2-F conveyed a differing order.

Greater confidence in M2-T as a valid model begins with the standard error of the model, which was roughly equivalent to that of M1 (0.076 and 0.097, respectively). M2-T produced estimates for the like and messaging networks that were strongest by rank (first), followed by attractiveness and ethnicity (second), and finally age (third). This ranking of parameter estimates for M2-T is identical to that of M1, except for the messaging network rank, which is not applicable in the case of the M1 model. The order of contribution was divergent, however: while the like network was the strongest predictor of a match in the M1 regression model (positive value, by order 1st), the like network was the weakest predictor of a match in the M2-T model (negative value, by order 5th). Agents in M2 could not rely on 'liking' alone to generate a match; in fact, liking many profiles resulted in fewer matches overall. This is a direct result of the additional multiplicity of the model, i.e. the addition of the messaging feature. Interestingly, attractiveness (by order, 2nd) held its order position in both models. Age was more relevant in M2 (by order, 3rd) than in M1 (by order, 4th), and messaging did not exist in M1 but was ranked 4th in M2, by rank order.

The third component of our findings was the likelihood function of parameter estimates describing the ensemble of networks permuted by MRQAP. Figure 11 summarizes those results through a direct comparison of the 3 models. The intercept likelihood sub-figure of Figure 11 uniquely represents the divergence of Tinder-like applications and Hinge-like applications, with intercept estimates for the M2 dating system far exceeding those of M1. This indicates that M2 agents initialize with a higher likelihood of matching, holding all else equal, but due to negative parameter estimates for liking and messaging they must be more tactical with liking and messaging decisions than M1 agents. This hypothesis is uniformly confirmed through an increased variance of estimated parameters for all M1 (Tinder-like model) variables and increased certainty of estimates in both the M2-F (false Hinge-like model) and M2-T (true Hinge-like model) estimates. In general, M2 parameter estimates for the 100 permuted networks under consideration were more likely and in some cases (age and ethnicity) were twice as likely as their M1 counterparts. The increased likelihood and less variant distribution of parameter estimates for the M2 system specifies a narrower corridor of conditions that must be met in order for matches to occur, and as a consequence agents should employ specialized strategies in messaging. This finding is robust despite the fact that agents in either simulation model are zero-intelligence (see [41]).

To ensure that our results were independent of our parameter choices, we tested an elementary adaptation of each model simulation using a single agent rule. Agents in this base model evaluate the overall attractiveness of other agents through the uniformly assigned physical attractiveness attribute. The attractiveness score is then adopted as the overall score rather than combined additively with other scores. Analysis of the base model helps to determine the likelihood of bias due to design artifacts in our model. Similarly to our previous approach, we simulated both a Tinder-like application (M1) and a Hinge-like application (M2). Since our aim is not to compare our simulations to extant data but to each other, this step is not wholly necessary but is useful for model validity.

Both M1 and M2 simulations produced a right-skewed 'like' distribution, with many agents receiving a small number of likes while a small number of agents received a disproportionately larger number of likes. Under the generally uniform choices we imbued agents with, we would not expect outputs to be heavy-tailed. There is strong evidence of skewness in the peer-reviewed literature: the authors of [11] show a right-skewed distribution of message interactions that yield a match, and this is also directly confirmed in [18]. Our own finding that many agents received few matches while a minority received many matches is confirmed by the finding in [11] as well. Furthermore, [18], a study conducted on Tinder profiles and interaction data, provides the relative proportion of likes and matches, showing that the number of likes compared to the number of matches is larger by 2 or 3 orders of magnitude, consistent with Tables ??.

Though we surmise that skewness, as a system property, likely depends on the scale of dating platforms (e.g. the number of active subscribers), it is likely that if the number of agents were larger by an order of magnitude or more, then by sheer chance alone the proportion of agents receiving no matches would be higher, as top agents would continually gather a larger share of all available matches. Nominally, whether the well-studied scale-free (power law degree distribution) property [42] is an attribute of our system is an interesting theoretical question that arises here. With ever-larger scale (Tinder is larger in membership count than Hinge), would the skewness, and consequently the tails, of our 'like' statistical distribution approach some extreme-value and heavy-tailed regime that represents some biased social process? In the case of the classical scale-free property, the underlying social process is generally taken to be preferential attachment when the dataset in question is analyzed as a network. Though this question provokes a revision to our consideration of skewness as a sufficiently reproducible stylized fact, it is assumed that even under conditions of preferential attachment (which produce highly skewed, heavy-tailed output distributions), there will exist some internal (algorithmic) balancing act by dating service providers to ensure that, for example, an entire city does not subsequently match with a single user and that no single user receives unlimited matches. Thus, our assumption of skewness, without extending said assumption to require outputs to be of an extreme-value nature, seems realistic and sufficient in the absence of additional data.
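As a rough illustration of how the right skew of the 'like' distribution could be quantified, the sketch below computes the moment-based sample skewness of the number of likes each agent receives. It assumes the like network is stored as a binary sender-by-receiver matrix; that storage layout and the function names are assumptions made for this example only.

```python
import numpy as np


def likes_received(like_matrix):
    """In-degree per agent: column sums of a sender-by-receiver binary matrix."""
    return np.asarray(like_matrix).sum(axis=0)


def sample_skewness(x):
    """Moment-based skewness m3 / m2**1.5; positive means a longer right tail.

    Undefined for constant input, where the variance m2 is zero.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m3 = np.mean(centered ** 3)
    return m3 / m2 ** 1.5
```

A positive value of `sample_skewness(likes_received(like_matrix))` indicates that a few agents collect a disproportionate share of likes; it does not by itself establish a heavy-tailed or power-law distribution, which would require a separate tail analysis.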

Figures ?? show that male agents received and sent more likes than female agents, while female agents received more matches and messages than male agents and sent fewer messages than male agents. This reproduced collection of facts is directly confirmed by [43]. More male agents populated dating services [11], and many more male agents than female agents were likely to receive no likes, matches, or messages [44]. This was also reproduced by our reported agent-based model.

Focusing our attention on the parameter estimates of our observed networks from M1 and M2-T, we see the effect of gaining additional multiplicity in the form of an additional feature to send messages with likes. We found that both the M1 and M2-T simulations produced positive estimates for the intercepts, and that M2-T produced negative estimates for the like and message networks as well as the age difference covariate. One interpretation of the change in sign is that the M1 regression model describes a positively increasing regression model where more activity by M1 agents (more right swipes) helped agents produce more matches, in contrast to the decreasing-slope M2-T regression model, where M2 agents could not simply increase their activity levels to increase the likelihood of a match. That is, agents engaging in non-strategic activity produced fewer matches overall in M2-T. The implications are profound: agents, when presented with additional features, must employ strategies that reflect the availability of features, and often these strategies must be more specialized. M2 agents were presented with a random sample of other agents for consideration, and their messaging behavior, for the sake of comparison, was also assigned randomly. If we were to assign messaging behavior in line with total compatibility scores, it would endow M2 agents with strategic behavior, and yet the regression models revealed this as a condition of success without the additional endowment. Examples of online daters integrating more specialized behaviors in line with what we have produced are discussed in [45,46].

Additionally, consider that for the M1 and M2-T models, the ranking of estimates remained invariant to the increase in multiplicity (Table 9). Both M1 and M2-T rank the like network first, concurrent with the messaging network (for M2-T), then attractiveness difference and ethnicity difference, and finally age difference, while simultaneously describing divergent system behaviors where more (random) likes imply a better chance of a match in the M1 system but not in the M2-T system. We can conclude that while we produced the same systems in terms of effect importance, we also produced a clear divergence in their overall behaviors. Doubtless, effects are transformed by order (from positive to negative, for example), but not by rank, when the option for reciprocity through an increased multiplicity is considered in M2-T.

The overall divergence in aggregate system behavior is further demonstrated by the likelihood of parameter estimates generated by the quadratic assignment procedure and shown in Figure 11, which demonstrates that most parameter estimates for M1 and M2-T had similar properties, except for the intercept parameter estimate. Furthermore, M1's likelihood estimates were more variable, as denoted by a wider functional form.

The interpretation of our combined results is based on two key indicators. The first is that the order of contribution of coefficient estimates changed between M1 and M2-T, while their rank by magnitude did not, even when given similar agent rules. The second is that we have successfully reproduced divergent system-level behavior in line with what would be expected from a pure matching market (Tinder) in comparison to a market where specialized strategies (e.g. messaging) can be adopted by agents. Remarkably, we did so by considering only the multiplicity of the applications rather than advanced psychological or sociological theories: a zero-intelligence approach.

While we have confirmed our hypothesis, we must acknowledge that there are limitations to our interpretations and consequently to our approach. The first is our use of agent attributes that did not conform precisely to what is known about dating application subscribers. For example, we used age intervals that we chose uniformly from a specified range, and for our main comparison we used ethnic identities that were evenly assigned. It is reasonable to assume that neither of these attribute assignment choices is realistic; they were merely convenient. However, these choices were intended to be uninformed [47] in order to ensure that the attributes of agents cannot interfere with our statistical comparisons. Our goal herein was to compare the relative size of parameter estimates, not to estimate the parameters themselves. If we had chosen non-uniform attributes, there would exist a possibility that model artifacts would disturb our model comparisons. As a result, isolating the results due to a differing multiplicity would be more difficult.

While our method does achieve our intended results, in time a better methodological comparison may be possible. It should be noted that a comparison of datasets from two independent collections of observations is trivial to conduct in numerous ways. One need only consider the statistical body of knowledge available since Gosset's t-statistic (1908) to find an overwhelming number of procedures that can be used to compare one set of observations to another. However, this is not what is being considered here. Here we are considering network effects as a representation of one dynamical system and attempting a comparison with a slightly different dynamical system through those same network effects, a less intuitive challenge. Nonetheless, the comparison yielded results and a robust conclusion.

In this paper, we have shown how agent-based simulations and a robust statistical method of comparison can be used to examine additional features of a dating application. Through this effort we have discovered that the inclusion of even one additional feature can cause divergent outputs and aggregate outcomes. We described this addition as increasing the multiplicity of an online dating application. We argued that this changes a user's personal experience and end results: in this instance, overall matches and the underlying social effects that govern matching dynamics were transformed, and the basic properties of Tinder-like and Hinge-like dating application systems emerged in a clear and demonstrable way. The M1 and M2 simulation models dictate clear differences in agent behaviors and, in turn, their personal experiences. From this proof-of-concept, bridging our model to large-scale data and real-world applications is not improbable and could enhance the design and development of more successful dating applications and matching environments for users. Surely, many who desire love would praise this effort.

As we conclude our report, it is opportune to discuss an inherent assumption in our overall analysis, which is best included along with our concluding remarks. This assumption is that "many who desire love", as we have dubbed them, desire quantity of matches as an important output, and seemingly as the output of choice. Consider the superb analysis of a sample of Tinder users in [48]: in this study of over 160 active Tinder users, an exploratory factor analysis reveals that out of six strong categories that explain users' behavior on Tinder, finding "love" was as important as finding "casual sex". In fact, the sub-scale of the analysis openly divulges that "to have a one-night stand" (0.808) exceeded the explanatory strength of "to find a romantic relationship" (0.807). Importantly, of all six categories, including the four remaining ones (ease of communication, self-worth validation, thrill of excitement, and trendiness), self-worth validation "was the only motivation that was significantly related to higher Tinder use". Minimally, one can safely assume that while quality of match is always an intended target, because dating platforms have become a versatile collection of virtual meeting spaces used in different ways by different users hoping for different outcomes, all said outcomes are centered on the provision of more viable mating strategies, with an emphasis on 'more' rather than 'viable'. Not considering the quality of matches, thus, can be described as a limitation of our approach only when assuming that "love" is the only important factor governing mating behavior on online platforms.

Henceforth, there are many directions we could pursue along this line of inquiry, including calibrating our social effects and models, including weighted dyadic relationships, and, most importantly, using some reference dataset. The latter represents our work's greatest limitation, since much of this data is considered private by commercial entities. Nonetheless, we hope to be able to pursue these avenues in future papers.