From the Hands of an Early Adopter's Avatar to Virtual Junkyards: Analysis of Virtual Goods' Lifetime Survival

One of the major questions in the study of economics, logistics, and business forecasting is the measurement and prediction of value creation, distribution, and lifetime in the form of goods. In "real" economies, a perfect model for the circulation of goods is impossible. However, virtual realities and economies pose a new frontier for the broad study of economics, since every good and transaction can be accurately tracked. Therefore, models that predict goods' circulation can be tested and confirmed before their introduction to "real life" and other scenarios. The present study is focused on the characteristics of early-stage adopters for virtual goods, and how they predict the lifespan of the goods. We employ machine learning and decision trees as the basis of our prediction models. Results provide evidence that the prediction of the lifespan of virtual objects is possible based just on data from early holders of those objects. Overall, communication and social activity are the main drivers for the effective propagation of virtual goods, and they are the most expected characteristics of early adopters.


Introduction
Virtual worlds and games have been postulated to provide unprecedented possibilities for research in general [1,2], and especially for the study of economics [3], owing both to their ability to systematically track every event in that reality and to the possibility of creating controllable environments in which people exhibit natural behaviors.
Perhaps one of the most prominent veins of study related to virtual economies has been the study of consumer behavior related to adopting and purchasing virtual goods in virtual worlds and games [4][5][6][7]. This has especially been the case since game and virtual-world operators have been the forerunners in implementing the so-called freemium or free-to-play business model [8][9][10], where playing or using the virtual environment is free of charge, but the operator generates revenue through manifold marketing strategies that combine classical sales tactics with platform design that further encourages virtual-goods purchases [11][12][13].
Virtual goods mostly take the form of in-game items related to the theme of the game, such as avatar clothing, gear, vehicles, pets, emoticons, and other customization options [5,14], as well as different types of items related to the recent proliferation of "gamblification", where acquiring virtual goods is increasingly based on gambling-like mechanics, effectively blurring the line between gaming and gambling [15].
The largest vein of research in this continuum has been the investigation into why people purchase virtual goods [4,5] in primary or secondary markets within the virtual world. Popularly, this question was initially motivated by the sheer anecdotal amazement of why people would spend considerable amounts of real money on products that "do not exist" [11,16]. However, since the initial combination of hype and disillusionment, virtual and game economies have entered into the realm of everyday consumer-facing services. Studying the question of why people purchase and trade virtual goods has primarily focused on latent psychological factors such as motivations, attitudes, experiences, and beliefs, and how they predict virtual-goods transactions, as well as on the internal design of the environment (see, for example, Reference [4] for a review of the area). However, the limitation within this sphere of research is that it can only provide a glimpse of the reasons why users purchase virtual goods as a singular event, since it is focused on the consumer rather than the object of consumption and trade, namely the virtual good itself. Only a few studies [17] have taken the initiative in an attempt to map the longer lifespan of virtual goods from their inception to circulation and to their ultimate end: destroyed from the virtual world, forgotten in a user's virtual bag, or existing in an account of a user who has stopped visiting the virtual world.
Additionally, one of the major hurdles in governing and maintaining virtual economies, in addition to increasing consumer demand for virtual goods [11], has been the balancing act between "sources" and "sinks" [18] of virtual goods within a virtual economy. There is no practical or technical reason why any virtual good could not exist in complete abundance within the virtual economy. However, this would undermine the meaningfulness of acting within the virtual world due to extreme inflation, and it would also effectively void any need for users to purchase or trade virtual goods. Therefore, the lifetime management of virtual goods is of vital importance for any virtual-economy operator (see References [6,11,18]). Some of the methods in the game-operator palette have been, for example, contrived durability and planned obsolescence of virtual goods (see, for example, Reference [19]).
Game developers are confronted with issues related to the ideal frequency of virtual-product updates, their volume, and intensity, with an emphasis on continuous development [20]. Infrequent updates can result in user churn, while the constant development of new content increases operational expenses. From another perspective, users may have a limited capacity to consume digital content when content is updated very often. Content production that is substantially higher than demand might be regarded as unwise budget allocation. The life expectancy of web-based gaming items is generally shorter than that of traditional items, and users always expect system updates and new content [21,22]. Another issue is the habituation effect resulting from the short life expectancy of virtual products, and the limited time in which the item can attract online users. This opens up new research directions since, so far, it has principally been researched for traditional markets [23].
To address this research problem, the present study is focused on the characteristics of early-stage adopters of virtual goods and how they predict the lifespan of the goods. Rogers [24] treats 2.5% of users as innovators, 13.5% of users as early adopters, 34% as an early majority, and 34% and 16% as the late majority and laggards, respectively. This research shows how characteristics of early-stage adopters affect user engagement and product lifespan. The main contributions include the identification of the role of early adopters of virtual goods for product lifespan, and building a predictive model for product life with the use of data.
The empirical study is followed by analysis based on survival prediction models and identification of the role of the characteristics of early-stage adopters for product lifespan. Decision trees showed the ability to predict product lifespan with the use of product-adopter characteristics. The rest of the paper is organized as follows. The Methodology section contains the conceptual framework, dataset description, and methodological background. The Results section includes descriptive statistics and results from the lifespan models based on user characteristics. This is followed by results from product classification in terms of their lifespan and user characteristics with an accuracy higher than 80%. The study is concluded in the final section.

Research Questions and Study Design
The presented study assumes that virtual-product survival can be predicted from the attributes of users, especially those interested in the product at different stages of the product lifecycle. This research is based on the conceptual framework presented in Figure 1. A set of virtual products, P_i, was introduced to the audience of a social platform. Behaviors related to user engagement and product usage were collected. The node position within the example social network is represented by node size: small circles are used for low-degree nodes with one connection, medium sizes for intermediate degrees, and the biggest circles for nodes with four connections. In general, user characteristics can represent various attributes related to network centrality and activity within the system, like communication frequency and intensity of platform usage. They create a parameter space with m distinguished variables assigned to each user in the form of vector V = [V_1, V_2, ..., V_m]. Users adopting each product can be divided into five adoption groups, with the first 2.5% of users interested in the product distinguished as innovators, the next 13.5% classified as early adopters, 34% as the early majority, 34% as the late majority, and users adopting the product at the end (laggards) as 16% of all adopters.
Figure 1. Conceptual framework: (I) analytical system integration with the platform with the ability to detect the characteristics of users engaged in a new product, and the stages when adoption takes place; (II) product classification according to survival time and audience characteristics; (III) monitoring the performance of new products, predicting their usage, and additional audience targeting.
One of the research questions is whether innovator and early-adopter characteristics can affect product lifespan. It would be possible to identify the characteristics of initial users and, in cases of low-performance prediction, the interest of users with central network positions could be increased by sample delivery, trial accounts, or other incentives. As a result, followers and late adopters would be influenced and motivated for new-product usage.
Three exemplary possible scenarios are presented. In Scenario 1, Product P1 is introduced, and two users, namely, UP_{1,1} and UP_{1,2}, with high positions represented by red nodes adopt the product as innovators. They are followed by users adopting at several stages of the product lifecycle, and it is considered a successful product launch, boosted by top-user engagement. The campaign is characterized by high dynamics D_1 and long product lifespan L_1. Scenario 2 for Product P2 assumes three innovators, UP_{2,1}, UP_{2,2}, and UP_{2,3}, characterized by medium metrics within a social network, and this is represented by orange and blue nodes. They build the interest of other users in the launched product, and overall campaign evaluation results in medium dynamics D_2 and product lifespan L_2. Scenario 3, assigned to Product P3, is based on the interest of innovators with the lowest network positions. It results in dynamics D_3 and lifespan L_3. In an analytical system, historical product data are used to analyze the influence of user characteristics, especially innovators and early adopters, on the product's lifespan and engagement among other users. This is based on three stages of data processing. In Stage (I), the characteristics of adopters from all groups are measured. In Stage (II), classification is performed to build class descriptors of users who are characteristic for products with different survival times. Results are used to build a knowledge base and rule set for further use within the system and future product evaluation. In the next stage, new product Px is launched and introduced to the system. Innovators and early adopters are monitored, and prediction of the product lifespan is performed. If the product is assigned to a class with a possibly low lifespan, actions to improve performance can be implemented by the selection of users with high network positions to build interest in the new product, denoted as red arrows.
The main goal is to increase the dynamics of product consumption D_x and its lifespan L_x. In practice, it can be performed by product samples, trial accounts, or various other forms of incentives.

Dataset Description and Participants
The experimental study is based on data from the virtual world and the use of avatars within the platform [25,26]. The introduced dataset covers information from 195 items included in the form of user avatars. Items are utilized in the virtual-world platform providing various forms of entertainment and chat functions. Graphical symbols represent users who all have the chance to participate in the life of the online network, with 850,000 accounts initiated. Clients interact in the space of public graphical rooms that are related to various themes. They can configure and supply their private rooms and also utilize web-based games and unique entertainment alternatives.
The fundamental functions of the service are related to chatting, meeting new people, communication, and creating social relations. Other features include clothes and virtual products, styles, avatars, and decorative elements. New-product information can be distributed through private messages sent through an internal communication system. The analytical module concentrated on new items, which enhanced the monitoring of content distribution and the collection of information related to data-dissemination procedures. Clients accessed a range of commonly available functions, and could also pay for premium services, which provide more possibilities. Virtual products appear in the form of products equivalent to real goods, special effects for avatars, or avatars themselves. Account extensions used within the system had different characteristics and purposes. For example, animations, flashing elements, and active objects handled by avatars were used.
While innovation-diffusion theory emphasizes the role of innovation characteristics, it was important to take into account objects with similar characteristics to minimize the impact of individual product features and the level of innovation. This led to the analysis of comparable static-avatar elements with similar characteristics, excluding special effects, which usually attract more attention than static objects.

Survival Analysis Methods for Measuring Product Lifespan
The presented study uses survival analysis to analyze the expected time duration during which interest in new products exists, which represents the product lifespan. In the field of survival analysis, the length of time taken is referred to as event time [27], which is product usage time in our case. Survival analysis was originally developed in the medical field as a means of analyzing the time between medical intervention and death. Over the past few decades, the field was expanded to include other events as well as events that occur multiple times for a given individual [28].
Survival analysis has wide applications in the field of marketing, including customer-relationship management (CRM), marketing-campaign management, and trigger-event management [29]. If we denote the time taken for an event to occur as T, we can construct a frequency histogram and model a series of events as a function of time. The probability density function for T can be denoted by f(t), and the cumulative distribution function by F(t). This provides the following equation:

$$F(t) = P(T \le t) = \int_0^t f(u)\,du$$

Using the above approach, we can represent survival as a function of time S(t) such that:

$$S(t) = P(T > t) = 1 - F(t)$$

For t = 0, S(t) = 1; for the specific time at which a failure occurs, the value of S(t) is zero [30]. In some cases, the time to failure will not be observable and only partial observation will be possible. In this case, we consider a specific 'censoring time' c, and the survival function is expressed in terms of it.

Instantaneous hazard, or conditional failure rate, is the instantaneous rate at which a randomly selected individual who is known to be alive at time (t - 1) will die at time t [31]. Mathematically, instantaneous hazard is equal to the number of failures between time t and time t + \Delta t, divided by the size of the population at risk (at time t), divided by \Delta t. This gives us the proportion of the present population at time t that fail, per unit of time, represented by the equation:

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$$

The Kaplan-Meier method is widely used to estimate time-related events [27]. Most commonly, it is used in biostatistics to analyze death as an outcome. However, in more recent years, the technique has seen adoption in the fields of social sciences and industrial statistics. For example, in economics, we might measure how long people tend to remain unemployed after being let go by an employer; in engineering, we might measure how long a certain mechanical component tends to last before mechanical failure takes place.
The survival function is theoretically a smooth curve, but it can be estimated using the Kaplan-Meier (KM) curve. The plot of the Kaplan-Meier estimate is a series of horizontal steps of declining magnitude that, for a sufficiently large sample, approaches the true survival function for the given population. When applying this approach, the survival-function value between successive sampled observations is presumed constant [32]. An important advantage of the Kaplan-Meier curve is its ability to take into account censored data, i.e., loss from the sample before the final outcome is observed. In cases where no truncation or censoring occurs, the Kaplan-Meier curve is equivalent to the empirical distribution [33,34].
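As an illustration of the estimator described above, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit calculation. It is not the implementation used in the study; the function name `kaplan_meier`, the argument names, and the six example durations are hypothetical and serve only to show how censored records leave the risk set without counting as events.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct event time.

    durations: time until the event or until censoring, one per subject
    observed:  True if the event was seen, False if the record is censored
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)      # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:                       # the curve only steps down at event times
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # events and censored records both leave
    return curve

# Hypothetical usage spans (days) for six adopters of one item; the two
# False entries were still using the item when observation ended.
days = [3, 5, 5, 8, 12, 12]
seen = [True, True, False, True, True, False]
curve = kaplan_meier(days, seen)
for t, s in curve:
    print(t, round(s, 4))
```

With no censored records, the same function reduces to one minus the empirical distribution function, in line with the equivalence noted in the text.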
As mentioned, survival analysis has wide applications for marketing, including CRM, marketing-campaign management, and trigger-event management [29]. Depending on the business setting, e.g., contractual versus noncontractual, different techniques can be applied [29]. For example, a goal might be to analyze the performance of a marketing campaign (while in progress), and how different customer features affect its performance. In this case, recurrent survival analysis techniques are used and the hazard function models the tendency of customers to buy a given product [35,36]. Survival analysis also has wide applications in the field of customer-behavior analysis. Among other things, it has been used to make predictions regarding customer retention in the banking [37] and insurance industries [38], credit scoring (with macroeconomic variables) [39], credit-granting decisions [40], and risk predictions of small-business loans [41].
Aside from customer behavior, survival analysis has been used to make predictions regarding the survival of online companies [42], as well as the duration of open-source projects [43]. Similarly, product survival in given markets was analyzed with network effects based on product compatibility [44].
The advent of digital marketing has provided additional streams of rich behavior data and subsequently new fertile ground for the application of survival analysis. With these data, survival analysis can be used to make predictions regarding the survival of music albums and distribution [45], the survival of mobile applications [46], as well as e-commerce recommendations to users [47].
For social platforms, survival analysis has been applied to triadic relationships within a social network [48], as well as participation in online entertainment communities with the use of entertainment and community-based mechanisms [49]. Player activity in online games provides valuable data for analysis, with a focus on game hours, subscription cancellations [50], and the adjustment of game parameters. In this context, a primary goal is to achieve the optimal user experience in terms of game speed and design [51].
Another area that is being explored is churn prediction in mobile games using survival ensembles [52] and player-motivation theories [53]. While game-time survival analysis can be used as a predictor of user engagement, it can also provide knowledge regarding factors that affect gameplay duration [54]. Similarly, it can provide insight into how player activity and popularity affect retention within games [55]. It can also be used to uncover predictors of game-session length, such as character level or age within the game [56]. The ability to quantify user satisfaction provides greater ability to target user needs [57].

Classification Methods Used for Product-Lifespan Prediction
Decision-making involves several approaches, including decision-tree classifiers [58]. Making a decision based on the structure of a decision tree allows complex decisions to be broken into a few smaller ones, which helps in understanding a problem in depth. Decision trees are pervasive in a variety of real-world applications, including but not limited to medicinal research [59], biology, credit-risk assessment, financial-market modeling, electrical engineering, quality control, and chemistry. The evolution of web applications and social media resulted in new areas of decision support and data analytics focused on user interaction and online behaviors. Decision trees are used for e-commerce, social media, online games, player segmentation, and other areas. Among other areas, applications include decision-tree usage for predicting the future adoption of e-commerce services [60]. In social media, decision trees are used, for example, to predict the distance between users with Twitter activity data [61] and for Twitter message classification with the use of the Classification and Regression Tree (CART) algorithm [62]. This wide area of applications includes online games, with a focus on player-segmentation strategies based on self-recognition and game behaviors in the online game world to improve player satisfaction [63]. Integrated data-mining techniques such as association rule discovery, decision trees, and self-organizing map neural networks within the Kano model are used for customer-preference analysis in massively multiplayer online role-playing games [64].
Aspects of playing behavior can be predicted with supervised learning algorithms trained on large-scale player-behavior data, and decision-tree learning induces well-performing and informative solutions [65]. Rule databases can be used in the form of a rule reasoner in online games for the detection of cheating activities [66], while a case-based reasoning approach can be applied for the purpose of training a system to learn and predict player strategies [67]. Educational games can be improved with decision trees used for the identification of factors affecting user behavior and knowledge acquisition within educational online games [68]. In other applications, decision trees are used to study Internet game addiction in adolescents [69] and game-traffic analysis at the transport layer [70].
Clustering techniques are used for player-behavior segmentation in computer games with the use of K-means and simplex volume maximization clustering [71], and user segmentation is used for retention management in online social games [72]. Integrated data-mining and experiential-marketing techniques can be used to segment online-game customers [73].
Owing to their structure, trees are easy to interpret, and hence result in better insights into problems. Nodes in a decision tree branch out from the root node; each internal node represents a condition related to a single input variable (feature), each branch represents a condition outcome, and each leaf node represents the class label. In this study, we applied CART [74], which is a binary tree. The method generates a binary tree using binary recursive partitioning, which divides the dataset into two subsets so as to minimize a heterogeneity criterion computed on the resulting subsets. Each division made is based on a single variable, and some variables may not be used at all, while others may be used several times. Each subset is then further split based on independent rules.
Consider a decision tree T with one of its leaves t. T is a mapping that assigns a leaf t to each sample $(X_i^1, \ldots, X_i^p)$, where i is an index for the samples. T can also be viewed as a mapping that assigns a value $\hat{Y}_i = T(X_i^1, \ldots, X_i^p)$ to each sample. Let $p(j|t)$ be the proportion of class j in leaf t.
The Gini index and entropy are the two most popular heterogeneity criteria. The entropy index is:

$$E_t = -\sum_j p(j|t) \log p(j|t)$$

with, by convention, x log x = 0 when x = 0. The Gini index is an impurity-based criterion that measures divergence between the probability distributions of the target attribute's values [75]. The Gini index is defined as:

$$G_t = \sum_j p(j|t)\,\bigl(1 - p(j|t)\bigr) = 1 - \sum_j p(j|t)^2$$

For the purpose of our research, we followed the formal definitions proposed by Maimon and Rokach [76], with bag algebra in the background [77]. The Cartesian product of all input-attribute domains and the target-attribute domain defines the universal instance space, i.e., $U = X \times \mathrm{dom}(y)$. The training set consists of a set of tuples. Each tuple is described by a vector of attribute values. The training set is denoted as $S(R) = (\langle x_1, y_1 \rangle, \ldots, \langle x_n, y_n \rangle)$, where $x_q \in X$ and $y_q \in \mathrm{dom}(y)$. The algorithm needs these data to learn how to match the input variables with the dependent variable, that is, how to fit the model.
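To make the two heterogeneity criteria and the binary partitioning step concrete, the sketch below computes entropy and the Gini index from the class proportions in a node and searches for the single-variable threshold that minimizes the weighted impurity of the two resulting subsets. It is a simplified illustration of one CART-style split, not the full algorithm or the study's code; the function names, the natural logarithm, the midpoint thresholds, and the example values (a hypothetical CA score per product with "short"/"long" lifespan labels) are all assumptions.

```python
from collections import Counter
from math import log

def class_proportions(labels):
    """Proportion of each class among the labels of one node."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]

def entropy(labels):
    # The p > 0 guard mirrors the convention x log x = 0 when x = 0.
    return -sum(p * log(p) for p in class_proportions(labels) if p > 0)

def gini(labels):
    return 1.0 - sum(p * p for p in class_proportions(labels))

def best_binary_split(values, labels, impurity=gini):
    """Return (threshold, weighted child impurity) for one input variable.

    The threshold is None when no split improves on the unsplit node.
    """
    n = len(values)
    best = (None, impurity(labels))
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2   # midpoint between neighbouring values
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical data: one activity score per product and its lifespan class.
ca_scores = [0.2, 0.4, 0.5, 1.1, 1.3, 1.6]
lifespan = ["short", "short", "short", "long", "long", "long"]
root_gini = gini(lifespan)
root_entropy = entropy(lifespan)
threshold, child_impurity = best_binary_split(ca_scores, lifespan)
```

A full tree would apply `best_binary_split` to every variable, keep the best one, and recurse on each subset until a stopping rule is met.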
The test dataset was used to verify how well our algorithm learned from the training data by checking its classification accuracy. We achieved this by matching the predicted class of each observation with its real observed class.
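The accuracy check described above amounts to counting how many predicted classes agree with the observed ones. A minimal sketch follows; the function name `accuracy` and the five example labels are hypothetical and are not taken from the study's data.

```python
def accuracy(predicted, actual):
    """Share of test observations whose predicted class equals the real class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Hypothetical lifespan classes for five held-out products.
predicted = ["long", "short", "short", "long", "short"]
actual = ["long", "short", "long", "long", "short"]
score = accuracy(predicted, actual)
print(score)
```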

Descriptive Statistics
Statistical analysis was based on 195 elements divided into four types of virtual elements, E1, E2, E3, and E4, used within the system representing avatar head, body, legs, and shoes. The data contain the anonymized behavioral patterns of 8139 unique users. The analyzed products were introduced to system users within 21 content updates (CUs).
In order to perform statistical analysis, we used two groups of separate variables related to user activities. Variable abbreviations and their explanation can be found in Table 1. The first group includes five variables treated as Activity Factors with the symbols CA to AA. These are, respectively: CA, communication activity, represented by an average number of messages received by users adopting the product divided by the number of logins; SD, social dynamics, represented by an average of a number of friends of the product adopter divided by the number of logins; CP, communication popularity, represented by an average number of outgoing messages divided by incoming messages; SP, social position, represented by the average number of received messages divided by the number of incoming messages; and AA, adoption activity, represented by averaging the number of new avatar-element usages divided by the number of logins.
The second group of variables represents Experience Factors related to user activity since account creation, such as MSG_in, the average number of all messages received by the user until the avatar changes; MSG_out, the average number of all messages sent by the user until the change; MSG_total, the average number of total messages sent and received by the user; FR_in, the number of unique friends contacting the user until the avatar change; FR_out, the number of friends contacted before the avatar usage; and FR_total, the average total number of friends.
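As a rough illustration of how such Activity Factors could be derived for one product, the sketch below averages per-user ratios over the product's adopters. It rests on several assumptions that the text does not settle: that each factor is the mean of per-user ratios rather than a ratio of means, that users with a zero denominator are skipped, and that the per-user record uses the hypothetical field names `logins`, `msg_in`, `msg_out`, `friends`, and `new_elements`. SP is left out of the sketch.

```python
from statistics import mean

def mean_ratio(users, numerator, denominator):
    """Average of per-user ratios, skipping users whose denominator is zero."""
    ratios = [u[numerator] / u[denominator] for u in users if u[denominator]]
    return mean(ratios) if ratios else float("nan")

def activity_factors(adopters):
    """Activity Factors for the adopters of one product (field names assumed)."""
    return {
        "CA": mean_ratio(adopters, "msg_in", "logins"),        # messages received per login
        "SD": mean_ratio(adopters, "friends", "logins"),       # friends per login
        "CP": mean_ratio(adopters, "msg_out", "msg_in"),       # outgoing vs. incoming messages
        "AA": mean_ratio(adopters, "new_elements", "logins"),  # new avatar elements per login
    }

# Two hypothetical adopters of the same avatar element.
adopters = [
    {"logins": 10, "msg_in": 20, "msg_out": 10, "friends": 5, "new_elements": 2},
    {"logins": 20, "msg_in": 20, "msg_out": 30, "friends": 10, "new_elements": 2},
]
factors = activity_factors(adopters)
```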
For each product, users were assigned to Adoption Groups in five classes: innovators, early adopters, early majority, late majority, and laggards, according to time of adoption.
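The assignment by time of adoption can be sketched as follows, using cumulative shares of 2.5%, 16%, 50%, 84%, and 100% that follow from the group sizes given by Rogers [24]. This is an illustrative reading rather than the study's procedure: the function name, the tie handling (input order), and the treatment of products with very few adopters (where the innovator group may be empty) are assumptions.

```python
from collections import Counter

# Cumulative upper bounds of the five adoption groups (share of all adopters).
GROUP_BOUNDS = [
    ("innovators", 0.025),
    ("early adopters", 0.16),
    ("early majority", 0.50),
    ("late majority", 0.84),
    ("laggards", 1.00),
]

def assign_adoption_groups(adoption_times):
    """Map each user id to an adoption group according to adoption order.

    adoption_times: dict of user id -> adoption time (any sortable value)
    """
    ordered = sorted(adoption_times, key=adoption_times.get)
    n = len(ordered)
    groups = {}
    for rank, user in enumerate(ordered, start=1):
        share = rank / n
        for name, upper in GROUP_BOUNDS:
            if share <= upper + 1e-12:  # tolerance for floating-point shares
                groups[user] = name
                break
    return groups

# Hypothetical product with 200 adopters, adoption time equal to the index.
times = {f"user{i}": i for i in range(200)}
groups = assign_adoption_groups(times)
sizes = Counter(groups.values())
print(sizes)
```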
For the purpose of determining the role of the variables, user-related factors were used in the statistical models of survival analysis. We took into account the User Activity and User Experience factors. Initial analysis showed that, for most products, survival time was shorter than one month, and only a few of them reached nearly three months. To cover usage periods in more detail, five time periods were taken into account during analysis: one week, two weeks, one month, two months, and three months. One week as the shortest period makes it possible to analyze behavior on each day of the week after product launch. Analyzing the statistical significance of predictors that influenced the lifetime dependent variable, we can see that mean CA and AA showed statistical significance of p < 0.05 for all periods. The CP variable, on the other hand, has no effect and is not relevant in any given period. Separately analyzing each period, we can see that the periods of one month, two months, and three months showed the significance of the CA, SD, SP, and AA variables. Wald's statistics, with results presented in Table 2, showed the highest value for CA in the periods of two weeks, one month, and two months. In the three-month period, Wald's statistic pointed to the significance of AA. The influence of predictors on the dependent variable over seven days showed significance in CA, SD, and AA. However, in the 14-day period, only two predictors, CA and AA, showed statistical significance, which affected the product's life expectancy. In the next step, Kaplan-Meier survival probability charts (Figures 2-6) for one month, with division by user parameters and the three user groups, were analyzed. The diagrams show the emergence of a growing number of increasingly shorter steps that, in the limit, approach the true survival function. Figure 6 shows a survival model without division into classes as a general model for divisional and nondivisional variables.
The next stages show statistical regression models with division into aggregated groups of adopters, i.e., AG1-AG5. Regarding the explanation of these classes, we can refer to Table 3. Regression analysis was divided into two groups of variables, with product life as the dependent variable. The first group of variables (predictors) includes average variable values from CA to AA. The second group includes experience-related variables, i.e., MSG_in, MSG_out, FR_in, and FR_out. In Table 3, we can see the statistical-significance parameter (p) and the strength-of-significance factor (f).
The first group of predictors for AG1 showed significance for CA and AA. Average predictor AA was characterized by the strongest impact. For AG2, the case was markedly different: four of the five predictors, i.e., CA and CP to AA, were significant. The only predictor that did not have statistical significance was average predictor SD. The impact of the predictors, especially SP, was strong. For AG3 and AG4, regression analysis showed similar significance to AG2, also for four predictors, but in these cases the predictor lacking significance in relation to the dependent variable was the mean of the AA variable. In both cases, the CA variable strongly affected results. For AG5, things were quite different: statistical significance was only demonstrated in three cases, namely CA, CP, and SP.
The second group of predictors that affect the dependent variable also showed variability. In AG1, one of the four predictors was statistically significant, namely, FR_out (0.03). The situation looked completely different for AG2. Here, we can clearly see the strength of joining two classes. Significance statistics showed a positive result for three predictors, i.e., MSG_out, FR_in, and FR_out. In the case of AG3, AG4, and AG5, statistical significance was shown by all predictors, with FR_out being the one that acts most strongly on the dependent variable.
The next part of analysis was based on an intergroup comparison of user characteristics between products with different survival times. In order to compare individual lifecycles with the Activity and Experience factors, we used the Mann-Whitney U Test. Analysis was presented in four perspectives: analysis of individual user classes and of aggregated user classes based on Activity Factors, and analysis of individual user classes and of aggregated user classes based on Experience Factors. Periods that we compared with each other are visible in Table A1. Starting with division-variable predictor analysis for innovators, we can see the lack of significance of parameters in the first comparison period. In the next two, we can see that predictor CA was significant, which indicates that the compared periods differed significantly in predictor CA. In the last pair of compared periods, predictors CA, SD, and SP showed the largest differences.
For innovators, the statistics show a tendency: when the compared periods are closer together, in this case two versus three months, more predictors contribute to the differences. The four other user classes show the opposite relationship. For early adopters, differences were visible in four predictors for the first two pairs, and the number of differences decreased in the next two. For early-majority, late-majority, and laggard users, the significance statistics that point to differences gradually fade; for laggards, the last group of period comparisons shows no significance for the given predictors, which indicates small differences. For the aggregated user classes, the first combination, innovators and early adopters, positively affects predictor significance, which indicates large differences for most of the analyzed pairs (three to four predictors with strong differences). The closer the compared periods are, as with two months versus three months, the smaller the differences. For the nondivisional variables, which also cover four period pairs, the statistics for innovators alone did not show any significance, whereas significance appears in the subsequent classes. Across the remaining classes, differences between individual periods clearly increase. For early adopters, the last two pairs of periods yielded significance values below 0.05, which indicates an increase in differences. For the last group, laggards, differences were clear and significant in every compared pair of periods. The same applies to aggregated users: only in the case of AG2, for 14 days versus three months, do we see a lack of significant differences between predictors. Other groups indicate strong differences, as we can see in Tables A2 and A3.
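The intergroup comparison can be illustrated with a minimal sketch of the Mann-Whitney U test. The function name, the sample values, and the use of the normal approximation without tie or continuity correction are assumptions made here for illustration; the source does not state which implementation was used, and in practice a library routine such as `scipy.stats.mannwhitneyu` would typically be preferred.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value).

    The p-value uses the normal approximation without tie or
    continuity correction, so it is only a rough guide for small samples.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the normal distribution
    return u1, p_value


# Hypothetical predictor values (e.g., CA) for adopters of products
# from two survival periods; these numbers are not from the study.
ca_short_lived = [1, 2, 3, 4, 5]
ca_long_lived = [6, 7, 8, 9, 10]
u_stat, p = mann_whitney_u(ca_short_lived, ca_long_lived)
```

A p-value below 0.05 for a given predictor and pair of periods would correspond to what the text reports as a significant difference between those periods.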

Survival-Time Prediction with Early-Adopter Characteristics
For survival-time prediction, a dataset containing the usage statistics of 195 newly introduced products was used. Usage statistics for each product consist of product and user identifiers and a timestamp representing the time when the product (in this case, the avatar) was used by a specific user. For each product, only the first usage per user was taken into account, and data were collected from product launch until the last product usage. For each analysis, two sets of variables were used, based on the User Activity and User Experience factors presented earlier in Table 1.
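The data preparation described here can be sketched as follows. The record layout `(product_id, user_id, timestamp)`, the function names, and the approximation of the launch date by the earliest recorded usage are assumptions for illustration, not the authors' actual pipeline.

```python
from datetime import datetime


def first_usages(events):
    """Keep only each user's first use of each product.

    `events` is a list of (product_id, user_id, timestamp) tuples.
    Returns {product_id: [(timestamp, user_id), ...]} ordered by time,
    so the list order is the order in which users adopted the product.
    """
    first = {}
    for product_id, user_id, ts in events:
        key = (product_id, user_id)
        if key not in first or ts < first[key]:
            first[key] = ts
    per_product = {}
    for (product_id, user_id), ts in first.items():
        per_product.setdefault(product_id, []).append((ts, user_id))
    for adopters in per_product.values():
        adopters.sort()
    return per_product


def lifetime_days(events, product_id):
    """Days between the first and last recorded usage of a product.

    Assumption: the earliest usage stands in for the launch date.
    """
    times = [ts for pid, _, ts in events if pid == product_id]
    return (max(times) - min(times)).days


# Hypothetical usage log, not data from the study.
log = [
    ("avatar_1", "u1", datetime(2020, 1, 1)),
    ("avatar_1", "u2", datetime(2020, 1, 3)),
    ("avatar_1", "u1", datetime(2020, 1, 10)),  # repeat use, ignored for adoption
    ("avatar_2", "u1", datetime(2020, 2, 1)),
]
adopters = first_usages(log)
```

The ordered adopter lists are what later steps would slice when only the earliest share of adopters is analyzed.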
In Figure 7, we see a high increase in the CP variable for products with seven-day survival, accompanied by small CA values. In the other periods, density shows only slight deviations, as in the three-month period, where the CA variable grows; in the 14-day survival period, an increase in the SD variable was observed. As in the previous chart, Figure 8 shows a clear division into survival-period groups. Within seven days, an increase in the AA ratio with a simultaneous drop in SD was visible, which may indicate a drop in interest from users with low SD. In the remaining periods, we can see in Figure 9 a clear decrease in the SD index with a simultaneous increase in CA; this showed that the more users communicate with others, the less likely it is for the product to be accepted.
In this chart, we can clearly see that the fewer users log in, the fewer messages are sent to others and to the circle of potential friends. There is a clear decline from one period to the next. In the last graph, Figure 10, density against the FR_out indicator starts at initial values oscillating between 150 and 250. Here, however, we also see a decline from period to period. In the initial period, the MSG_in indicator is small, but it increases with survival time. However, the last period (three months) oscillates near the first period, which indicates a lower number of messages sent by the users adopting products in that group. Results for all Experience Factors and Activity Factors are presented in Figures A1 and A2 in the Supplementary Information.
The next stage investigated how the share of analyzed adopters, from 1% to 100%, and their characteristics affect classification accuracy for the prediction of product lifetime and survival-class assignment (one week, two weeks, one month, two months, three months). Observations were selected for the training dataset at random; therefore, to stabilize the results, classification was repeated one hundred times for each dataset size, and the results were averaged.
The experiment was carried out with three training-dataset sizes: 25%, 50%, and 75%. Classification with the decision-tree model was implemented with the scikit-learn machine-learning library for Python. In the first stage, classification was performed with user Activity Factors. Results are presented in Figure 11. They show high classification accuracy for the training sets based on 50% and 75% of the analyzed products: accuracy above 90% is achieved with less than 20% of product-usage statistics when Activity Factors are taken into account. The training set based on 25% of the products delivered low accuracy when the percentage of adopters was lower than 60%, but it reached 90% when 70% of data were used for each product. Higher fluctuation of results was observed with a low number of analyzed adopters.
Figure 11. Accuracy of classification results with the use of Activity Factors for 25%, 50%, and 75% training sets and 10%, 20%, ..., 90% of adopters used.
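The evaluation protocol (random training sets of a given size, a decision tree, and accuracy averaged over repeated runs) can be sketched with scikit-learn as follows. The synthetic data, the helper names, the use of default tree hyperparameters, and the choice to summarize a product by the mean feature vector of its earliest adopters are all assumptions for illustration; the source does not specify these details.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def prefix_mean(adopter_features, fraction):
    """Mean feature vector over the earliest `fraction` of adopters.

    `adopter_features` holds one row per adopter, ordered by first-use
    time. Averaging is an assumption made for this sketch.
    """
    rows = np.asarray(adopter_features, dtype=float)
    k = max(1, int(round(fraction * len(rows))))
    return rows[:k].mean(axis=0)


def mean_accuracy(X, y, train_size, repeats=100, seed=0):
    """Average test accuracy of a decision tree over repeated random splits."""
    rng = np.random.RandomState(seed)
    scores = []
    for _ in range(repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=train_size, random_state=rng.randint(2**31 - 1)
        )
        clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
        scores.append(clf.score(X_te, y_te))
    return float(np.mean(scores))


def synthetic_products(n_products=195, n_classes=5, seed=0):
    """Hypothetical stand-in data: one feature row and one survival class
    (0..4, e.g., one week .. three months) per product."""
    rng = np.random.RandomState(seed)
    y = rng.randint(0, n_classes, size=n_products)
    X = y[:, None] * 10.0 + rng.normal(scale=1.0, size=(n_products, 4))
    return X, y
```

Calling `mean_accuracy(X, y, train_size)` for `train_size` in `(0.25, 0.5, 0.75)`, with `X` rebuilt from `prefix_mean` at each adopter fraction, would trace out accuracy curves of the kind shown in Figure 11. Because the synthetic classes are well separated by construction, the accuracies it yields say nothing about the real data.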
Detailed numerical results are presented in Table 4. They show that analyzing the characteristics of as few as 10% of product adopters makes it possible to predict product assignment to a class with short or longer survival time. Apart from the social-activity factors, classification was also performed with incremental data about user activity within the system. Results are presented in Figure 12. They show that, for the 50% and 75% training sets, initial accuracy with a small fraction of data (1-3%) is very high due to innovator characteristics. Adding further data dropped accuracy to 80%; subsequently, it grew with data acquisition. For the training set with the size of 75% of used products, the lowest accuracy was achieved after using data from 15% of adopters, and 65% accuracy for 50% of products with 15% of the adopter sample used. For all training-set sizes, accuracy continuously grew together with the increased number of adopters.
Detailed numerical results for classification based on incremental usage statistics, represented by Experience Factors for each user, are presented in Table 5. Table 6 shows classification-accuracy statistics for the identified user groups: innovators alone, then innovators together with adopters, and then extended by the early majority, late majority, and laggards. For Activity Factors, it shows that even data from innovators only (2.5% of first adopters) make it possible to assign a product to one of five adoption classes. Innovators used together with adopters delivered results above 19% for training sets with 50% and 75% size. Classification based on the 25% training dataset delivered accuracy above 18%. Adding further adopter groups slightly improved classification but, from a practical-application perspective, it delays the moment at which product survival can be predicted and additional adopter targeting performed. The worst results were obtained for Experience Factors, but they were still above 80% accuracy for the training sets with 50% and 75% size.
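The split of a product's adopters into user classes can be sketched as follows. Only the 2.5% innovator share is stated in the text; the remaining shares follow Rogers' standard diffusion-of-innovations categories, and the reading of AG1 to AG5 as cumulative unions of the classes is an assumption based on the description of AG2 as joining two classes. Function and variable names are hypothetical.

```python
# Class shares in adoption order (innovator share from the text,
# the others assumed from Rogers' standard categories).
ROGERS_SHARES = [
    ("innovators", 0.025),
    ("early adopters", 0.135),
    ("early majority", 0.34),
    ("late majority", 0.34),
    ("laggards", 0.16),
]


def adopter_classes(ordered_users):
    """Split users, ordered by first-use time, into the five classes."""
    n = len(ordered_users)
    classes = {}
    start = 0
    cumulative = 0.0
    for name, share in ROGERS_SHARES:
        cumulative += share
        # The last class takes whatever remains, so no user is dropped.
        end = n if name == "laggards" else int(round(cumulative * n))
        classes[name] = ordered_users[start:end]
        start = end
    return classes


def aggregated_groups(classes):
    """Build AG1..AG5 as cumulative unions of the classes (assumed reading)."""
    groups = {}
    merged = []
    for i, (name, _) in enumerate(ROGERS_SHARES, start=1):
        merged = merged + classes[name]  # new list each step, no aliasing
        groups[f"AG{i}"] = merged
    return groups


# Hypothetical product with 200 adopters identified by integers.
classes = adopter_classes(list(range(200)))
groups = aggregated_groups(classes)
```

Under this reading, classifying on AG1 alone corresponds to predicting from innovators only, which is the earliest point at which a prediction becomes available.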

Discussion and Conclusions
With the expanding use of virtual products within online systems, new analytical models and strategies are required. Phenomena common to offline markets are regularly seen in electronic systems and are related to lifespan, customer habituation, and new-product improvement techniques. This research indicates how the attributes of early adopters of new items can influence user engagement and the survival of virtual goods within dynamic electronic environments. The results of product classification based on decision trees showed that it is possible to predict product lifespan with the use of adopter characteristics. Adopter communication activity, represented by Activity Factors, positively affected product survival time. This shows that adopters with high experience factors are the main influencers in the system, and their behavior is adopted by other users.
Monitoring product-usage patterns and adopter characteristics makes it possible to identify products with potentially short survival time and to invite additional adopters with the use of incentives and other techniques. The gathered knowledge can be used to reduce the habituation effect and increase product-usage time through social influence and follower behavior.
Results from the conducted study lead to the following main conclusions: characteristics of early adopters related to social activity positively influence product lifespan and the engagement of other users within the system; product lifespan can be estimated with the use of initial-audience and early-adopter characteristics; the combination of innovators and adopters positively affects the statistical significance of the dependent variable that represents survival time; initial-user characteristics can be used to classify products in terms of future usage for the detection of low-potential products, for performance improvement and targeting additional adopters with the desired specifics.

Funding:
The work was supported by the National Science Centre of Poland, decision no. 2017/27/B/HS4/01216.

Conflicts of Interest:
The authors declare no conflict of interest.