Airbnb Host Scaling, Seasonal Patterns, and Competition †

Abstract: This paper explores the scaling (size) effect in the seasonal patterns, a proxy for competitive threats, of Airbnb’s host providers, with the aim of understanding possible similarities and differences. This explorative study uses the city of Milan (Italy) as a case and daily occupancy data from Airbnb listings for four complete years (2015–2018). A mutual information-based technique was applied to assess possible synchronizations in the seasonal patterns. Empirical findings show progressive dissimilarities when moving from single to multiple listings, thus indicating a differentiation correlated to the presence of managed listings. The differences are smaller during the seasonal periods centered on leisure clients and larger during those dominated by business travelers. The evidence supports the scaling effect and its ability to reduce the competitive threat among different hosts.


Introduction
This paper explores the scaling (size) effect, focusing on Airbnb's host providers, with the aim of understanding the similarities and differences in seasonal patterns. Since the launch of this commercial peer-to-peer accommodation platform in 2007 [1], Airbnb has attracted academic debate, especially in the last few years [2]. Airbnb is a web platform that rents idle assets, called listings (typically rooms, apartments, and houses) that are owned by hosts, to travelers or guests [3].
There are few studies investigating the supply side (host) [3]. Previous papers centered on listing performance focused on the scaling effect [4]. In fact, many studies have distinguished between hosts managing only one listing (usually called "mom-and-pop" hosts or simply single-listing hosts) and hosts managing more than one (usually defined as "professional", "commercial," or multi-listing hosts) [5]. Generally speaking, the two types of host (single versus multiple) show different results, as discussed in more detail in the next section.
However, knowledge about the managerial differences between these two groups is very limited, despite the ability of the scaling effect to deeply change the hosts' business model [6]. Furthermore, the large majority of these studies simply distinguish between single and multi-listing hosts, without any additional segmentation. Scaling, as usual in managerial studies [7], refers in this paper to the number of listings managed by one host: the higher the number of listings, the higher the scaling effect, and vice versa. To help reduce the current gap in the commercial peer-to-peer accommodation platform literature, the present article explores the ability of scaling to change the seasonal patterns, in order to measure the degree of similarity and difference among Airbnb hosts. These similarities and differences are used as a proxy for the competitive threat among Airbnb listings of different size (as discussed in detail in the Methodology). According to Butler, the definition of seasonality is "a temporal imbalance in the phenomenon of tourism, (which) may be expressed in terms of dimensions of such elements as numbers of visitors,

Host Scaling Effect
Peer-to-peer accommodation platform literature, despite the rising number of contributions [11], is in its infancy, and many research areas are less investigated [12]. One of them is the qualitative description of the host business model and the scaling effect [13]. For this reason, this paper has analyzed some adjacent but different supply research streams and in particular the impact studies on one hand and the determinants of listings results and pricing strategies on the other.
The impact literature is centered on the effect of Airbnb on hotels [10], tourism destinations [14], and local stakeholders [15], with prevailing attention on housing and long-term rentals [16]. Although the effects on hotels are contradictory [17], the social transformation generated by commercial peer-to-peer accommodation platforms is usually described as relevant. For this reason, the impact studies include a growing area of inquiry exploring the regulation of peer-to-peer accommodation platforms [18]. Despite the importance of the impact research, the focus is usually on the whole effect of the hosts; therefore, the topic of this article (the host scaling effect) is not developed.
A second supply-side research stream has analyzed the determinants of listing results and the pricing strategies [19]. As anticipated in the introduction, this second area of inquiry usually considers the host size as an independent variable that can influence either the listing results or the pricing strategies. These two distinct sub-topics (performance and pricing) are now discussed. In both groups, many papers (as later presented) distinguish between single- and multiple-listing hosts, the latter also called commercial or professional hosts. The multiple-listing category includes the hosts managing two or more listings.
The determinants of results represent a small research stream that explores the determinants or antecedents of listing performance [20,21], usually operationalized using review volume or rating [22] or, more rarely, occupancy [20]. It is quite a small area of inquiry and is separated in this article from the second, wider, research stream that focuses on pricing strategy. Some studies have considered the number of listings managed by a host as a relevant independent variable. Xie and Mao have found a trade-off between host quality and the quantity of her/his listings. In particular, "as the number of listings managed by a host increases, the performance effects of host quality diminish" [21] (p. 2240). Gunter has investigated the conditions improving the likelihood of obtaining the superhost badge [22]. The author has found four variables, one of them being the status of "commercial" Airbnb host.
Moving to the second sub-topic (pricing strategy), many previous studies have considered the distinction between single-unit and multi-unit hosts [23]. Multi-unit hosts, generally speaking, are described as being more proficient in using a dynamic pricing strategy [24] and in achieving a higher price or revenue (variously operationalized) than single-unit hosts [25]. In the theoretical model created by Chen, Zhang and Liu [26], the adoption of a flexible pricing strategy leads to higher performance in a large market, but the accommodation quality is not better. The study of Gibbs et al. reveals that the host's experience positively influences the adoption of dynamic pricing [27]. Realistically, a multi-listing host has more opportunity to gain experience than a single-unit host because the host manages a higher number of transactions [28] and, for this reason, has more skills, also for managing new start-ups [29]. In another study, professional hosts achieve a higher price per night. Similarly, professional hosts in Switzerland receive higher rates in rural areas, in intermediate areas (between rural and urban), and in cities [30]. These findings are confirmed using both a random and a quantile estimation model [31]. Other authors demonstrate a negative correlation between multi-listing and price but a positive correlation between professional hosts and revenue per month [32].
However, there are some exceptions. For example, [24] focused on five metropolitan areas in Canada. Professional hosts show a positive and significant coefficient with the dependent variable (price), considering all the cases, but the coefficient is negative and insignificant in the case of Calgary. Similarly, in the study of [33], two cities show negative and significant correlations with price, while Madrid shows a positive (but not significant) coefficient. In Hong Kong, multi-listing hosts book their capacity at a lower price [34].
These contradictory results can be explained using at least six different arguments. First, multi-listing hosting reduces social interaction with the guests, which is called reciprocity [35], and this can generate a drop in price [36]. Second, the correlation coefficient that tied the multi-listing and rates together is usually small and can therefore suddenly change from being slightly positive to slightly negative [33]. Third, the studies employ different frameworks (i.e., hedonic models, regression, quantile analysis, and artificial neural networks), which can generate different results [37]. Fourth, the relationship can change [38] in different destination contexts [39], also considering the diverse destination positioning and governance [40]. Fifth, the studies use a diverse set of control variables that can influence the relationship between host size and price [31]. Finally, different studies use diverse dependent variables, and sometimes the relationships change [32].
However, as previously stated, the large majority of studies reveal a positive correlation between multi-listings and rates. Curiously, the vast majority of the analyzed studies, with the exception of [32], have operationalized the scaling effect only by distinguishing between single and multiple hosts, without any additional segmentation. In other words, a host managing two listings is considered similar to a host renting 50 or more listings.

Milan Seasonal Patterns
This section explores the Milan seasonal patterns. The city is the economic capital of Italy, and previous studies have identified three main market segments attracted by Milan: (i) business, (ii) trade-fair, and (iii) leisure [41]. Each segment is characterized by a clear seasonality [42]. During weekdays, the business target is prevalent, while during weekends leisure is the main market segment [43]. Some previous articles introduced the distinction between "working days" and holidays (or "non-working days") [44]. The first group (working days) includes, in line with the study of [45], the weekdays not affected by religious holidays (such as Christmas and Easter) or civil holidays (such as Republic Day or Labor Day). Holidays, by contrast, include the weekends and all the religious and civil holiday periods. During holidays, leisure clients are prevalent, while business is the core target of working days [46]. Finally, Milan is a leading European city for exhibitions. When the local trade-fair center (Fiera Milano) organizes top events, the hotels achieve top performance in both occupancy and revenue. For this reason, this study included these top events, which are mainly business-to-business exhibitions able to attract a large international audience.
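The three pairs of seasonal periods described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name `classify_day`, the `EXAMPLE_HOLIDAYS` calendar (a few 2018 dates only), the choice of Saturday and Sunday as the weekend, and the `fair_days` argument are all assumptions introduced here for illustration, since the exact holiday and trade-fair calendars are not listed in the text.

```python
from datetime import date

# Hypothetical and deliberately incomplete holiday calendar (assumption):
# the paper names these holidays but does not list the exact dates it used.
EXAMPLE_HOLIDAYS = frozenset({
    date(2018, 4, 1),    # Easter Sunday 2018
    date(2018, 5, 1),    # Labor Day
    date(2018, 6, 2),    # Republic Day
    date(2018, 12, 25),  # Christmas Day
})


def classify_day(day, holidays=EXAMPLE_HOLIDAYS, fair_days=frozenset()):
    """Return the three seasonal labels for one calendar day."""
    # Assumption: the weekend is Saturday (5) and Sunday (6).
    is_weekend = day.weekday() >= 5
    # Holidays include the weekends plus religious and civil holidays.
    is_holiday = is_weekend or day in holidays
    return {
        "week": "weekend" if is_weekend else "midweek",
        "calendar": "holiday" if is_holiday else "working",
        "fair": "trade-fair" if day in fair_days else "non-trade-fair",
    }
```

Under these assumptions a civil holiday that falls on a Tuesday is labeled "holiday" on the calendar dimension but "midweek" on the weekly one, which is why the two dimensions are kept as separate labels.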

Hypotheses Development
In this section, previous insights related to both the number of listings managed by a host and Milan's seasonal patterns are used to formulate two groups of hypotheses, which guide the empirical analysis.
The first group focuses on the scaling effect and explores the degree of synchronization (the similarity) of the five groups of hosts. The precise meaning of synchronization is described in the methodology section. However, in order to perceive the meaning of the proposed hypotheses, a qualitative explanation is anticipated. The synchronization (as the name itself suggests) evaluates the similarities (differences) in time series [47]. This paper explores whether the scaling effect is able or not to change the synchronization degree among different (in size) groups of hosts. Put differently, do small and big hosts show similar series or does the scale differentiate them?
The hosts are segmented into five groups. As discussed (Section 2.1), previous studies usually distinguish only between single and multiple listings. However, some recent studies adopted a more fine-grained classification for multiple listings [48]. In line with these last papers, the current article distinguishes between: (i) a mom-and-pop host (single listing); (ii) a host renting two listings; (iii) a host renting three listings; (iv) a host managing four to 10 listings; (v) a host renting more than 10 listings. The five groups represent three different scaling effects. Logically, a host managing one to three listings can organize her/his business without (or by limiting) the employment of external workers. Four is assumed as the threshold for moving from a personal to a more professional business model, where professional means the involvement of external collaborators [48]. Finally, as suggested in another study [32], more than 10 listings can represent a new, important scaling step, which can favor more specialization and professionalization in the main business functions (selling, housekeeping, customer relationship management, and information technology). In this paper, the scaling effect is therefore assumed to improve the host's knowledge and managerial skills. For this reason, the following two hypotheses are formulated.

Hypothesis 1.
A rise in the number of listings progressively reduces the synchronization degree among the five groups.

Hypothesis 2.
A rise in the number of listings progressively reduces the synchronization degree between each group and the overall (sample) mean.
The second set of hypotheses focuses on Milan's seasonal patterns. As previously explained, Milan shows a strong demand fluctuation between the holidays, the weekends, and the days without trade-fair events compared with working days, midweek, and days with trade-fair events [9]. Many previous studies agree that Airbnb listings are mainly specialized and categorized as leisure [49]. Therefore, Airbnb listings are expected to be more proficient when leisure clients are more relevant (holidays and weekends) as well as when the city hosts trade-fair events (many trade-fair guests combine business and leisure). In other words, when the key target of Airbnb (leisure) is prevalent, the differences among the five groups of hosts (based on their size) are less pronounced. By contrast, when the key target of the city is business, smaller hosts are reasonably less skilled at serving this target and, therefore, show different seasonal patterns (and a lower synchronization degree) than bigger (scaled) hosts. Therefore, the following three hypotheses are proposed.

Hypothesis 3.
The synchronization degree between the five groups and the total is higher during: 3A-the holiday period than the working period; 3B-the weekend period than the midweek period; and 3C-the trade-fair period than the non-trade-fair period.
The scaling effect should progressively increase the multi-hosts' ability to serve the three main Milan segments (leisure, business, and trade-fair guests) differently. In fact, these diverse targets have different needs, seasonal patterns, and performance metrics, and require diverse host skills and services. Therefore, in the six seasonal periods (holiday and working; weekend and midweek; trade-fair and non-trade-fair events), the scaling effect is expected to show progressive desynchronization patterns among the five groups. The following six hypotheses are formulated.

Hypothesis 4.
The synchronization degree among the five groups progressively reduces during: 4A-the holiday period; 4B-the working period; 4C-the weekend period; 4D-the midweek period; 4E-the trade-fair period; and 4F-the non-trade-fair period.

The Sample
This study has chosen the city of Milan due to its prevalent focus on business and trade-fair clients on one side, in association with its non-marginal presence of leisure travelers on the other. Previous papers that explore the effects of Airbnb in Europe are mainly focused on large leisure cities, such as Barcelona [50] and Venice [51], or mixed leisure and business destinations, such as Madrid [52], Paris [53], London [54], and Berlin [55].
To explore the scaling and seasonal patterns of Airbnb listings, the research team used AirDNA data, which cover the period from November 2014 to June 2019. Therefore, the data include four complete years (2015-2018) and support a longitudinal analysis, in line with some previous studies [56]. Many previous papers have used AirDNA data [4]. To test the hypotheses, daily data were used, as in other similar studies [57]. AirDNA considers the available and sold listings as well as the price for each day and for each listing. The sample includes all Milan's population represented by more than 50,000 listings.

The Host Segmentation
As anticipated in the section dedicated to the hypotheses' development, the 31,000 listings in Milan are classified into five groups according to the number of rented listings (one, two, three, from four to 10, or more than 10). The segmentation is based on the different skills and competences required to manage the increasing number of listings and the business organizational complexity, and it is in line with some previous studies [48]. In this section, additional quantitative data are considered to test the validity of this segmentation. Figure 1 reports the host and listing distribution, which shows a clear power-law pattern. The graph illustrates the long tail with a strong concentration on the right side of the horizontal axis. Essentially, a handful of hosts manages a large number of listings.
Table 1 reports the descriptive statistics of the five groups; it is structured in five different vertical sections. The first depicts the absolute metrics, and the second shows the relative measures. The first cluster includes 78% of the hosts, but only 48% of the listings, which generate 35% of the total revenue. Focusing on this latter figure (revenue), there is a good division among the remaining four groups: the second cluster is 13%, the third is 7%, the fourth is 15%, and the last is 29%. The small size of the third cluster in terms of listings (8%), hosts (4%), and revenue (7%) appears coherent with the managerial description. In fact, this group reasonably represents the breaking point of the "personal" business model, which is centered around the work and the competencies of the host. The unitary values (third column) show the rising ability of the bigger hosts to book a higher number of days, moving from 79 (cluster 1) to 117 days (cluster 5). Generally speaking, the scaling generates an increase of approximately 10 additional booked days moving from one cluster to the next.
The penultimate column depicts the operating performance indices: occupancy (booked days divided by available days), the average daily rate (ADR, revenue divided by booked days), and the revenue per available night (RevPAN, revenue divided by available days). Focusing on the revenue per available night, the scaling effect is associated with a progressive rise of this value. The last column reports the variation of the performance metrics moving from one cluster to the next. The revenue per available night shows an impressive growth, rising by 3.1% (from the first to the second group), 38.8% (from the second to the third), 8.2% (from the third to the fourth), and 46.7% (from the fourth to the fifth), respectively.
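The segmentation rule and the three operating indices defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `size_group` and `performance` are hypothetical, the group labels follow the P1, P2, P3, P10, and P > 10 notation used in the Findings, and the input figures in the usage note are invented, not taken from Table 1.

```python
def size_group(n_listings):
    """Map a host's number of listings to one of the five size groups."""
    if n_listings < 1:
        raise ValueError("a host must manage at least one listing")
    if n_listings <= 3:
        return f"P{n_listings}"  # P1, P2, or P3
    # "P10" labels the four-to-10 group; "P>10" labels more than 10 listings.
    return "P10" if n_listings <= 10 else "P>10"


def performance(revenue, booked_days, available_days):
    """Return (occupancy, ADR, RevPAN) as defined in the text.

    Assumes booked_days and available_days are both positive.
    """
    occupancy = booked_days / available_days   # booked / available days
    adr = revenue / booked_days                # revenue per booked day
    revpan = revenue / available_days          # revenue per available day
    return occupancy, adr, revpan
```

For instance, with invented figures, `performance(8000, 80, 200)` gives an occupancy of 0.4, an ADR of 100, and a RevPAN of 40. By construction RevPAN equals occupancy multiplied by ADR, so a rise in RevPAN across clusters can come from more booked days, higher rates, or both.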

The Method
As anticipated, this paper evaluates the synchronization degree comparing the five groups of hosts, in order to perceive the similarities and differences. The method developed by Cazelles has been adopted [47]. It requires three different steps, which are introduced and described below [58].
The first phase transforms the series (values) into a set of symbols, by comparing each value with its neighbors. As reported in Figure 2, there are five possible cases: (i) trough point, (ii) peak point, (iii) increase, (iv) decrease, (v) stability. These five trends are then compared between pairs of host groups of different size.
An example of the five situations is reported in Figure 2.
The second phase is the heart of the analysis and calculates the mutual information degree. It is a quantitative method that compares two series and evaluates the degree of similarity (synchronization) or dissimilarity (de-synchronization). "Given the series X and Y, the mutual information I(X,Y) is calculated as:
$$I(X,Y) = H(X) + H(Y) - H(X,Y)$$
where H( ) is the entropy of each series:
$$H(X) = -\sum_{x} p(x) \log p(x)$$
and H(X,Y) is the joint entropy of the two series:
$$H(X,Y) = -\sum_{x,y} p(x,y) \log p(x,y)$$
We then normalize the mutual information using:
$$U(X,Y) = \frac{2\, I(X,Y)}{H(X) + H(Y)}$$
It is easy to demonstrate that if X and Y are independent random variables, then:
$$H(X,Y) = H(X) + H(Y)$$
therefore, the mutual information is zero" [48] (p. 5). To calculate these quantities, Python scripts adapted from https://github.com/people3k/pop-solar-sync (last accessed April 2021) were used.
The last phase evaluates the statistical significance of the values U(X,Y). In line with previous studies [47], 500 surrogate pairs of series were created, based on a Markov process (with a one time-step memory), that preserve the structure of the series [48]. Finally, the five groups of hosts were compared to the corresponding surrogate series and a t-test was used for evaluating the statistical significance.
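The three phases can be illustrated with a minimal Python sketch. It is not the code from the repository cited above: the function names are hypothetical, the handling of ties in `symbolize` (for example, a flat step followed by a rise counts as an increase) is an assumption, the normalization is assumed to be twice the mutual information divided by the sum of the two entropies, and `markov_surrogate` is only one simple way to draw a series with one-step memory.

```python
import math
import random
from collections import Counter


def symbolize(values):
    """Turn a numeric series (a list) into one symbol per interior point."""
    def sign(v):
        return (v > 0) - (v < 0)

    symbols = []
    for prev, cur, nxt in zip(values, values[1:], values[2:]):
        before, after = sign(cur - prev), sign(nxt - cur)
        if before < 0 and after > 0:
            symbols.append("trough")
        elif before > 0 and after < 0:
            symbols.append("peak")
        elif before == 0 and after == 0:
            symbols.append("stable")
        elif before + after > 0:   # assumption: flat-then-up counts as increase
            symbols.append("increase")
        else:                      # assumption: flat-then-down counts as decrease
            symbols.append("decrease")
    return symbols


def entropy(symbols):
    """Shannon entropy (natural log) of the empirical symbol frequencies."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())


def normalized_mi(x_sym, y_sym):
    """Normalized mutual information of two equally long symbol series."""
    if len(x_sym) != len(y_sym):
        raise ValueError("series must have the same length")
    hx, hy = entropy(x_sym), entropy(y_sym)
    hxy = entropy(list(zip(x_sym, y_sym)))  # joint entropy of paired symbols
    if hx + hy == 0:
        return 0.0  # convention chosen here for two constant series
    return 2 * (hx + hy - hxy) / (hx + hy)


def markov_surrogate(symbols, rng):
    """Draw a surrogate series from the observed one-step transitions."""
    successors = {}
    for cur, nxt in zip(symbols, symbols[1:]):
        successors.setdefault(cur, []).append(nxt)
    state = rng.choice(symbols)
    out = [state]
    while len(out) < len(symbols):
        # A symbol seen only at the very end has no observed successor;
        # in that case fall back to the overall symbol frequencies.
        state = rng.choice(successors.get(state, symbols))
        out.append(state)
    return out
```

In this sketch two identical symbol series give a normalized score of 1, and two series whose symbols co-occur as if independent give a score of 0. A significance check along the lines described above would compute the score for many surrogate pairs (500 in the study) drawn with a seeded `random.Random` and compare the observed value against that distribution.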

Findings
The findings are structured in two sub-sections. The first tests the hypotheses focused on the scaling effect, while the second explores the seasonal patterns. Table 2 reports the findings related to the first (synchronization degree among clusters) and the second hypothesis (synchronization degree between each cluster and the overall sample). In both hypotheses, the rise of listings is expected to reduce the synchronization degree. As explained in the methodology section, the synchronization is measured by the mutual information. The higher the value of the mutual information score, the higher the similarity and vice versa. A ratio of 0.40 identifies a good similarity (or synchronization), while a value lower than 0.20 depicts a strong dissimilarity or desynchronization [59][60][61].

Scaling Effect
The following columns are based on the comparison between the different clusters and the 500 random series. The evidence reported in the first five columns (from P1 to P > 10) is used to test the first hypothesis. The mutual information of the first cluster (P1) shows a progressive reduction when comparing the single host with cluster P2 (0.498), P3 (0.429), P10 (0.363), and P > 10 (0.334). The other columns show the same pattern. Therefore, the evidence confirms the first hypothesis that the synchronization among the different clusters reduces as the number of managed listings rises.
Moving to the second hypothesis, reading the values of the last line is sufficient. In fact, the values show a very strong synchronization for the first cluster (0.665), but the mutual information progressively reduces, moving to P2 (0.560), P3 (0.499), P10 (0.454), and P > 10 (0.395). The second hypothesis is confirmed, and it implicitly confirms that the overall sample (PAll) is largely influenced by the first three clusters, which together represent 95% of hosts, 72% of listings, and 55% of total revenue.
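The "progressive reduction" claimed for both hypotheses can be checked directly on the values quoted in the two paragraphs above. The short Python sketch below uses only those reported figures; the variable and function names are hypothetical, and the check covers only the P1 column and the last line of Table 2, not the full table.

```python
# Mutual information values as quoted in the text for Table 2.
p1_vs_others = {"P2": 0.498, "P3": 0.429, "P10": 0.363, "P>10": 0.334}
cluster_vs_all = {"P1": 0.665, "P2": 0.560, "P3": 0.499, "P10": 0.454, "P>10": 0.395}


def strictly_decreasing(values):
    """True when every value is lower than the one before it."""
    values = list(values)
    return all(a > b for a, b in zip(values, values[1:]))


h1_p1_column = strictly_decreasing(p1_vs_others.values())
h2_last_line = strictly_decreasing(cluster_vs_all.values())
```

Both checks rely on the dictionaries keeping the cluster order from the smallest to the largest hosts, which Python dictionaries preserve by insertion order.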

Seasonal Patterns
The analysis explores the seasonal patterns characterizing the chosen destination. The hypothesis focuses on the synchronization degree between the five groups and the total (PAll), comparing the opposed seasonal patterns: holiday and working; weekend and midweek; trade fair and non-trade fair. The values are reported in Table 3. The values should be read by comparing each vertical pair for each cluster. If, for each cluster, the synchronization degree drops when moving from the first to the second period of each pair, then the three hypotheses are confirmed. Focusing on Hypothesis 3A, the cluster P1 moves from 0.649 (holiday) to 0.643 (working); P2 from 0.553 to 0.505; P3 from 0.521 to 0.438; P10 from 0.434 to 0.412; and P > 10 from 0.403 to 0.331. The values confirm Hypothesis 3A, which means that each cluster is more synchronized with the overall sample during the holiday period than during the working days (when business is dominant). This evidence confirms the prevalent specialization of Airbnb listings for leisure rather than business guests. These results can be extended to the second (Hypothesis 3B) and third (Hypothesis 3C) seasonal periods. During the last seasonal period (trade-fair), when Milan hosts some events, the synchronization degree registers its highest value in all clusters (for example, 0.860 for P1).
Finally, Table 4 reports the relationships among the five clusters during the different seasonal periods in order to test the last six hypotheses, according to which the synchronization decreases when the scaling effect rises. Table 4 contains six panels, one for each seasonal period. Hypothesis 4A focuses on holiday. Reading the table by column, cluster P1 shows a progressive reduction in the mutual information moving from top (0.47) to bottom (0.35). Generally speaking, all the values show this trend with very few exceptions (three out of 60), identified by the squared cells in Table 4. Therefore, the evidence largely confirms the six hypotheses.

Discussion
The research question of this paper focuses on the relationship between the scaling effect and the seasonal patterns. The latter are used as a proxy for the competition among Airbnb hosts of different size. Based on the findings previously shown, this section discusses the main results. Focusing on the overall (annual) seasonal patterns (Hypotheses 1 and 2), the data confirm that the scaling effect increases the dissimilarities between the hosts managing a few listings and those managing many. Realistically, this evidence supports a progressive reduction in competition between big and small hosts.
The second set (Hypotheses 3 and 4) moves from the whole (annual) patterns to the specific seasonal periods characterizing the destination under study. Generally speaking, the synchronization degree is higher during the holiday and weekend periods, confirming the prevalent specialization of Airbnb listings for leisure clients. However, the mutual information degree registers an important drop moving from single to multi-listing hosts, suggesting, also in this case, different (or partially different) business models. By contrast, during the working and especially midweek periods, the synchronization is lower, and the dissimilarities increase when comparing mom-and-pop hosts with large multi-listing providers. Therefore, the progressive reduction in the competition appears more relevant. These results can be extended to the trade-fair and non-trade-fair seasonal periods.

Conclusions
The conclusions are articulated in four sections: Some theoretical, as well as practical, implications are traced, future research avenues are proposed, and some study limitations are identified.

Theoretical Implications
As discussed in the introduction and in the literature review, the current studies largely distinguish only between single- and multi-listing hosts, ignoring the magnitude of the scaling effect. The findings proposed in this study depict a very different picture, showing a progressive differentiation in the seasonal patterns correlated to the rise in the managed listings. The results, therefore, can significantly change the theoretical knowledge in this field and can re-orient future studies, especially in the sub-fields of competition, the determinants of listing performance, and pricing strategies.
Second, the synchronization degree among the different (in scale) hosts is not homogenous but changes according to the different seasonal patterns of the Milan destination. The higher the specialization in leisure segments, the higher the competition among the different hosts; the higher the specialization in the business segment, the lower the similarities and, therefore, the competitive pressure.
Finally, this paper introduces important innovations to analyze the competition among Airbnb listings. The first innovation is to clearly identify the main destination market segments (in the case of Milan, leisure, business, and trade-fair guests) and the corresponding seasonal patterns. The second methodological innovation is the use of mutual information to perceive and measure the degree of synchronization between the series, variously articulated to measure the scaling effect and the identified seasonal periods. This approach can open new research opportunities in other destination contexts.

Practical Implications
This paper sheds new light on the competition threat among Airbnb listings considering their scale. In particular, the findings clearly suggest a progressive reduction in the similarity of seasonal patterns when the size of the host, measured by her/his listings, rises. The results thus support identifying different groups in the Airbnb arena that face a diverse competitive threat according to the specific seasonal period. A single host, according to her/his scaling effect, can therefore define a different competitive set. Furthermore, the competitive intensity varies according to the specific seasonal period: it appears to be higher when the attracted segments are leisure and lower in the case of business guests.

Research Avenues
The findings reported open many new research opportunities. Some of them are discussed in this sub-section. It is surprising that the current peer-to-peer accommodation platform literature has completely omitted any studies qualitatively exploring the business models of mom-and-pop and multi-listing hosts. Future research must cover this gap, identifying the advantages and disadvantages of moving from a limited size to a bigger scale. This qualitative research can shed light on the main resources and competences that can be stretched as the scale rises.
A second interesting area of inquiry can explore the competitive threat between Airbnb listings and hotels. In particular, based on the current findings, this research area can explore if the listing scaling augments the synchronization degree between Airbnb and hotels, thereby increasing the competition and the substitution threat. Furthermore, the competition intensity can be articulated considering the different seasonal periods characterizing the destination under study.
A third research avenue focuses on the studies exploring the determinants of performance and pricing strategies. As analytically discussed in the literature review, the research standard, with very few exceptions, is to segment the hosts into single- or multi-listing hosts. Based on the current results, future research should investigate more analytical segmentation and consider different relevant seasonal periods. In fact, the determinants of the results and rates can change consistently.
Finally, future studies can employ new methods for testing the hypotheses (such as Bayesian hypothesis testing) or SARIMA (Seasonal Auto Regressive Integrated Moving Average) models for the seasonal patterns.

Limitations
This is an explorative paper, which is in line with similar previous studies focused on competitive threats [62]. It is centered around a single case study. The findings' generalization is partially limited. However, the paper adopts a longitudinal approach, creating a consistent temporal pattern. A future research agenda is called for to verify whether, within a multi-destination study, the evidence reported is confirmed.
Author Contributions: R.S. wrote the introduction, the literature review and the findings and was responsible for formal analysis; methodology; project administration; supervision; visualization; roles/writing. R.B. developed the statistical analysis and wrote the methodology and was responsible for data curation; funding acquisition; investigation; software; validation. Both authors discussed and contributed to the conclusions. Both authors have read and agreed to the published version of the manuscript.