Horizontal vs. Vertical Recommendation Zones Evaluation Using Behavior Tracking

Featured Application: The proposed e-commerce customer preference monitoring (ECPM) tool can be utilized on any website, but especially on e-commerce websites, to collect a stream of user-activity data, which can be used to adapt an interface to fit user needs in real time. Besides personalizing a website interface, observing user activity can provide information about user preferences, which can be used to dynamically individualize different types of content presented to the user, e.g., product recommendations. Observing user interactions with computer interfaces can lead to a better understanding of user needs and thus help build an intelligent recommender system that understands and guides each user through the cognitive process of discovering products that meet individual preferences.

Abstract: Recommender systems play a vital role in e-commerce by increasing the likelihood of transactions and improving sales through personalized recommendations. Due to the marketing habituation effect, users are less and less responsive to this type of content. The visual presentation of recommendations, in particular the layout of the recommendation zone, can influence the effectiveness of a recommendation. This study examines human–computer interactions for vertical, horizontal, and mixed layouts of recommending interfaces of four major e-commerce stores, and is based on our document object model events-based behavior analysis tool. Results from this implicit feedback study are presented and analyzed, showing that vertical recommendation zones attracted more attention than horizontal ones.


Introduction
The continuous development of electronic commerce, particularly in recent times due to the COVID-19 pandemic, requires advanced tools that facilitate online purchases, especially for first-time online shoppers. Recommendation systems, whose goal is to provide personalized recommendations of relevant products, play a vital role in achieving this goal.
Although the convenience of online shopping is very tempting for customers, the lack of personal assistance from a salesperson is a major disadvantage in comparison with in-store shopping. When the assortment of goods from which a customer seeks to make a selection is large, this disadvantage becomes even more important. To face this challenge, online stores increasingly rely on personalization solutions such as recommenders. The relevance of recommendations and their personalized character play a key role and help build long-term relationships. Unfortunately, users often do not notice the displayed recommendations because of speedy browsing and large amounts of marketing content. In order to reduce the habituation effect of online advertising content, a decision-support model based on COMET, a multi-criteria decision-making method which uses elements of fuzzy set theory to represent attributes for decision-making criteria, can be used [1]. The way a recommendation is presented, its positioning, and its usability seem to play an important role in the final effectiveness of the recommender.
To feed a personalized recommending system, a user model is generated on the basis of users' demographics, their transactions, ratings, and other behaviors [2]. A user profile, which typically reflects the user's needs and/or preferences, is a digital representation of this model. Websites can collect a wide range of demographic and user behavior data [3][4][5][6][7][8] that are used to generate user profiles and create personalized recommendations in order to improve sales. Besides e-commerce, recommendation engines can also be used for election recommendation, where both the candidate's and the voter's preferences can be described in an imprecise way [9].
Recommendation engines mostly use the well-established collaborative filtering (CF) technique, which recommends items that were liked by other users with similar tastes. The selection of the best collaborative filtering algorithm in terms of diversity and computation time should be based on the characteristics of the e-commerce input data [10]. The main disadvantage of this technique is the lack of recommendations for new users and new items, which is known as the 'cold start problem'. Until the system learns user preferences, it is impossible to present accurate recommendations. Another kind of recommendation engine, the content-based one, presents products that are similar to products in which the user has previously shown interest. A major disadvantage of this approach is the lack of novelty in the set of recommended items, as it is based purely on the user's recognized interests. Less popular recommendation techniques include demographic and knowledge-based approaches. The first technique assumes that demographic niches share similar interests and thus can be recommended similar products. To achieve the highest quality of recommendations, a demographic profile is required, which often constitutes a serious limitation in the online environment. Knowledge-based recommenders require domain knowledge about how particular items and their features fulfill specific users' needs. This functional knowledge can be expressed in the form of case-based reasoning rules, in which items are treated as cases and recommendations are generated by selecting the cases most similar to user needs or profiles.
Recommendation systems, due to their multi-domain applicability, have been among the main topics of scientific interest in e-commerce in recent years. Comprehensive surveys of fifty papers devoted to recommender systems and surveys about recommender system applications are available [11,12].
In recent years, recommender systems have been improved with two important techniques. Firstly, the fuzzy approach has aroused great interest among researchers of recommending systems [13]. Secondly, deep neural networks are being incorporated into recommender engines with promising results [14,15].
Evaluation of recommendation algorithms initially concerned only their predictive power, understood as the ability to accurately foresee users' needs. Nowadays, the accuracy of recommendations still plays an important role, but other factors also need to be considered when evaluating recommender systems and algorithms. From the user perspective, the most important evaluation criteria are the novelty of recommendations (the ability to show recommendations of items that a user did not know before), serendipity (a measure of how surprising recommendations are), and diversity, which results in recommendations that are not similar to one another and thus ensures a wider range of items potentially not known before. Trust in the recommendation system is another crucial factor, since more reliable recommendations should build up users' trust and result in higher usage of recommendations. Privacy also plays a vital role, as users are increasingly unwilling to disclose their personal preferences. Another criterion used to evaluate recommender systems is coverage, which refers to both item space coverage (measured as the percentage of all items in a catalog that can be recommended) and user space coverage (the proportion of all users for whom the system can provide personalized recommendations). More technical criteria of recommender system evaluation include scalability, which shows the amount of resources the recommendation engine needs when the amount of data increases, and adaptivity, which represents the ability to generate recommendations when item collections change quickly [16].
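The two coverage measures defined above can be illustrated with a minimal sketch. It assumes that "can be recommended" is approximated by "appears in at least one generated recommendation list"; the function names, the dictionary format, and the toy data are hypothetical and are not taken from the study.

```python
def item_space_coverage(recommendations, catalog):
    """Percentage of catalog items that appear in at least one recommendation list."""
    catalog = set(catalog)
    if not catalog:
        return 0.0
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    return 100.0 * len(recommended & catalog) / len(catalog)


def user_space_coverage(recommendations, users):
    """Proportion of users who receive at least one recommendation."""
    users = set(users)
    if not users:
        return 0.0
    covered = {user for user in users if recommendations.get(user)}
    return len(covered) / len(users)


# Hypothetical toy data, for illustration only.
catalog = ["p1", "p2", "p3", "p4", "p5"]
users = ["u1", "u2", "u3", "u4"]
recommendations = {"u1": ["p1", "p2"], "u2": ["p2", "p3"], "u3": []}

item_cov = item_space_coverage(recommendations, catalog)   # 3 of 5 items
user_cov = user_space_coverage(recommendations, users)     # 2 of 4 users
```

In this toy example, user u3 has an empty list and u4 has no entry at all, so both count as uncovered, which mirrors the cold-start situation described earlier.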
The recommendation algorithm, and therefore the quality of the generated recommendations, is an important aspect, but the efficiency of a recommendation system goes far beyond that factor [17]. There are far fewer studies on selecting the best methods of presenting recommended products to the client than on recommendation methods and algorithms themselves. One can seek the most appropriate ways to make recommendations to users by observing human-computer interactions with recommending interfaces. People's behavior while interacting with webpages can be tracked by registering user-generated events inside web browsers [3] or by using gaze-tracking solutions, and by counting the number of times the user moved over or browsed a given element of the website, in order to learn the user's preferences and generate a set of similar products constituting the basis for recommendations [18].
Studying visual aspects of recommending interfaces could bring benefits in terms of better integration of these interfaces with online stores, thus improving the efficiency of recommendations [17,19]. The layout and position of the recommendation zone, the number of recommended products, the presence of a carousel responsible for scrolling zone content, the size of related photos, the color and size of product titles, prices, etc., can be assessed for interface improvement [20]. In times of information overload and a flood of marketing content, the habituation effect often appears, resulting in the phenomenon of so-called banner blindness. Recommendations that are best from the algorithmic point of view may have little impact if they are not optimally presented [21][22][23], i.e., well-positioned on the website, shown at the right time while shopping, and with the right intensity level [24][25][26].
This paper focuses on evaluating the effectiveness of horizontal vs. vertical recommending interface layouts for four major e-commerce sites and is based on an implicit feedback study performed via a specially developed browser add-on.
The rest of the paper has the following structure: research assumptions and methodology are presented in Section 2, the experiment architecture and results are presented in Section 3, and conclusions and future study plans are presented in Section 4.

Assumptions and Methodology
The objective of our research was to compare the efficiency of horizontal and vertical recommending interface layouts with regard to attracting customer interest, from the perspective of user experience and business goals, based only on implicit data collected for a few major e-commerce sites.
Users are most interested in the main editorial content of a website, yet changing visual aspects of the recommendations in order to attract as much user attention as possible may well influence users' interest in the offered products and the achievement of business goals. Users' interest can be determined by asking them explicitly, either in more detail through questionnaires or by asking them to rate products on a simple five-star scale. Unfortunately, asking questions in this way can disturb natural behavior while browsing the website and is perceived as an unwanted obligation [27][28][29]. As an alternative to asking users explicitly, their preferences can be inferred implicitly by observing their interactions with the website. These solutions allow the studied subjects to concentrate only on the main task, without extra cognitive load and without needing to be willing to explicitly rate or write reviews about displayed products and recommendations [30,31].
Eye tracking may be utilized because eye movements tend to be connected with the cognitive process [32,33]. Therefore, one approach is to rely on data from an eye tracker to determine web user attention, interest, and time spent on a certain area of a website, which may serve as indicators of attractiveness [34][35][36]. Eye tracking can also provide evidence about cognitive processes during various tasks, e.g., debugging code and correcting errors [37], or about efficiency in learning a new programming technology or language [38]. Another technique for implicitly monitoring user behavior on websites is to use programmed solutions such as scripts or browser extensions that log, on the client's side, events resulting from interactions with the webpage. Thanks to this method, it is possible to discreetly observe user behavior without an additional cognitive burden [39] or the need for special equipment such as an eye tracker.
An e-commerce customer preference monitoring (ECPM) behavior-tracking tool was created to gather a spectrum of e-customer activity data [3]. Our monitoring tool has been implemented as an extension for Mozilla Firefox, and its installation is as simple as that of any other extension. The core monitoring code was developed in a way that allows for external use on any website. ECPM monitors human-computer interactions through the DOM (document object model), which represents an HTML document as nodes and objects in an object-oriented manner. Using the DOM of an HTML webpage and the JavaScript language, ECPM registers various event handlers on objects in a page. Our tool silently collects data about viewed product pages together with user interactions. Parameters denoting physical page attributes, content attributes, and recommendation interface attributes are registered during those interactions. Parameters related to the recommendation interfaces include the times during which the cursor is located in recommendation areas, the physical size of those areas, and registered product interest. As shown by other studies, the motion of the mouse cursor is correlated with eye motion and, thus, user interest [18,40–42], and a number of behavior-related parameters can be deduced, similar to previous studies [43,44]. It has been shown that mouse and keystroke tracking can be used as a lower-quality yet even less intrusive alternative to gaze tracking, which can only be performed during a controlled study requiring additional equipment, its calibration, and the supervision of a research worker.
The times during which the cursor was positioned on the featured product recommendation zones were related to other metrics such as the total presence time on the product page, the height and length of the page, the number of characters in the page, and the number of product pictures. Those parameters were used to create relative measures of the time the cursor spent in individual recommendation areas; the underlying data were generated client-side and the measures were calculated server-side. Thanks to this approach, it is possible to observe user behavior discreetly, without demanding additional attention from the user or burdensome preparation [39].
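As an illustration of the server-side step, the following minimal sketch sums the time the pointer spent inside recommendation zones from a log of pointer events. The log format, the "enter"/"leave" event labels, and the function name are assumptions made for this example; the actual ECPM data format is not described here.

```python
def hover_time_ms(events, page_unload_ms):
    """Sum the time (ms) the pointer spent inside recommendation zones.

    events: iterable of (timestamp_ms, event_type) tuples, where event_type
    is "enter" or "leave" (hypothetical labels for the pointer entering or
    leaving a recommendation zone).
    """
    total = 0
    entered_at = None
    for timestamp, event_type in sorted(events):
        if event_type == "enter" and entered_at is None:
            entered_at = timestamp
        elif event_type == "leave" and entered_at is not None:
            total += timestamp - entered_at
            entered_at = None
    # The pointer was still inside a zone when the page was unloaded.
    if entered_at is not None:
        total += max(0, page_unload_ms - entered_at)
    return total


# Hypothetical event log for one product page view (timestamps in ms).
log = [(1000, "enter"), (1800, "leave"),
       (5000, "enter"), (5600, "leave"),
       (9000, "enter")]
total_hover = hover_time_ms(log, page_unload_ms=9500)  # 800 + 600 + 500
```

Closing an open interval at page unload is a design choice of this sketch; an alternative would be to discard unfinished intervals.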

Implicit Event Tracking
Our tracking tool was set up for five very popular online stores in Poland: Merlin.pl, Agito.pl, Electro.pl, Komputronik.pl, and Morele.net. While Agito.pl and Merlin.pl offered numerous kinds of goods (horizontal shops) at the time of the study, Electro.pl and Morele.net offered mostly electronic goods. Merlin.pl was a major Polish online book store. Those stores were selected because they are very popular in Poland, so study participants were not required to learn a new interface and could browse content in their native language.
There were 85 study participants, all of whom were volunteers and active web users from Poland, aged from 19 to 33, holding a high school degree or higher. During the study, 1396 products were rated and all customer interactions with the websites were monitored via ECPM. About half of the participants rated fewer than 14 products, while the upper quartile rated more than 20 products. Higher ratings were more frequent than lower ones. Among many parameters, the ECPM tool monitored user interaction with recommending interfaces. The main monitored measure was the total time that the mouse pointer was positioned over sections presenting other recommended products. This measure was used to reflect user interest in the recommendation section.
The participants' task was to search for interesting products and explicitly rate them. On leaving every product page, a star rating scale was displayed, on which a user could express their interest in the product and indicate whether the product had been known to them before. Table 1 shows the distribution of star ratings. The participants' activity was observed at the most granular level. All DOM-triggered events related to the keyboard and mouse were recorded together with detailed information about the source element and its position in the structure of the web page. In addition, the source code of each visited webpage was collected. The monitored data were used to calculate user-behavior metrics, which were classified into four groups: parameters describing webpage attributes, interaction times (ms), user interaction events, and relative parameters connected with the events.

Recommending Interface Quality Parameters
Event data collected with ECPM contained dozens of parameters regarding an observed event and the related HTML element. Factors regarding user behavior in recommendation interfaces, as well as the features of those interfaces and product page attributes, were extracted. Collected rows contained the following fields: layout type of the recommendation interface (horizontal/vertical), rc_layout; text length in the recommended products section, recommended_length; text length inside all text elements visible on the page, document_length; time between webpage load and unload events, page_time; total time when the browser tab enclosing the monitored page was active, tab_active_time; total time when the user was interacting actively with the webpage (registered on the basis of generated mouse and/or keyboard events), user_active_time; time when the mouse pointer was located inside recommending interfaces, prod_recommended_time; time when the pointer was placed over recommending interfaces in relation to the text length inside the recommendation section, rel_recommended_time_recommended_length; time when the pointer was placed over recommending interfaces in relation to the text length contained inside all text areas in the page, rel_recommended_time_document_length; time when the pointer was placed over recommending interfaces in relation to the time between webpage load and unload events, rel_recommended_time_page_time; time when the mouse pointer was placed over recommending interfaces in relation to the time when the browser tab enclosing the monitored page was active, rel_recommended_time_tab_active; time when the mouse pointer was placed over recommending interfaces in relation to the time when the user was interacting actively with the webpage (registered on the basis of generated mouse and/or keyboard events), rel_recommended_time_user_active.
Simplified layouts of all studied e-commerce websites are presented in Figure 1. The Komputronik.pl e-store was not taken into account in further analyses, as it contained recommending interfaces only in a vertical layout.
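The relative fields listed above can be read as simple ratios of prod_recommended_time to the corresponding base quantity. The sketch below follows that reading; it is an assumption, since the exact formulas and units are not given here, and the helper names and sample values are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is not positive."""
    if denominator is None or denominator <= 0:
        return None
    return numerator / denominator


# Relative field name -> base field it is divided by (assumed reading of the field list).
RELATIVE_FIELDS = {
    "rel_recommended_time_recommended_length": "recommended_length",
    "rel_recommended_time_document_length": "document_length",
    "rel_recommended_time_page_time": "page_time",
    "rel_recommended_time_tab_active": "tab_active_time",
    "rel_recommended_time_user_active": "user_active_time",
}


def add_relative_measures(row):
    """Return a copy of row extended with the relative hover-time fields."""
    result = dict(row)
    hover = row["prod_recommended_time"]
    for rel_field, base_field in RELATIVE_FIELDS.items():
        result[rel_field] = safe_ratio(hover, row.get(base_field))
    return result


# Hypothetical row for one product page view (times in ms, lengths in characters).
row = {
    "rc_layout": "vertical",
    "recommended_length": 380,
    "document_length": 7600,
    "page_time": 9500,
    "tab_active_time": 7600,
    "user_active_time": 3800,
    "prod_recommended_time": 1900,
}
measures = add_relative_measures(row)
```

Returning None for a missing or zero base value keeps such rows distinguishable from rows with genuinely zero hover time.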

Results
The parameters listed in the previous section were used to compare the effectiveness of the two recommendation layouts, horizontal and vertical. As an approximate measure of effectiveness, the time during which the mouse pointer was located inside recommending interfaces was used. Given the task assigned to participants and the general study construction, this measure was selected as the most suitable one for comparing recommending interfaces. It was used together with measures related to various physical attributes of the product page. Mouse position is highly correlated with eye gaze, especially when a user is actively browsing and scanning the page. Because ECPM tracks the pointer position, we could measure user interest in particular parts of the page, especially recommending zones.
The attractiveness of particular recommending interfaces, resulting from the analysis of data collected with our ECPM tool, is presented in Table 2 in the form of interest indicators, which are described in Section 3.2. The average time during which the mouse pointer was positioned on recommended items was higher for the vertical layout than for the horizontal layout in three shops (Electro.pl, Agito.pl, and Merlin.pl), while the opposite was true for Morele.net. This may have resulted from the fact that there were more horizontal recommending zones than vertical ones in this particular store's layout (two horizontal zones versus one vertical). Another parameter, rel_recommended_time_recommended_length, expresses the interest level in presented recommendations relative to the text length of the recommendation section. For all four shops, vertical recommendation zones attracted more, or much more, attention than horizontal ones according to this parameter. In the case of Electro.pl, interest in vertical recommendations measured with this parameter was 116%, while for the other three shops it ranged from 20.3% to 23.1%.
Cursor activity time over the recommendation interface measured in relation to total page time (rel_recommended_time_page_time), as well as to tab activity time (rel_recommended_time_tab_active) and user activity time (rel_recommended_time_user_active), also indicated higher interest in vertical recommendation interfaces than in horizontal ones in all shops but one (Morele.net).
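A layout comparison of this kind can be sketched as a per-shop average of a chosen measure for each layout, followed by the relative difference between the two averages. The formula for the percentage advantage, the "shop" field, and the sample values are assumptions made for illustration only; they are not the study's data or its exact method.

```python
from collections import defaultdict
from statistics import mean


def mean_by_layout(rows, metric):
    """Average a metric per (shop, rc_layout) pair, skipping missing values."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(metric)
        if value is not None:
            groups[(row["shop"], row["rc_layout"])].append(value)
    return {key: mean(values) for key, values in groups.items()}


def vertical_advantage_pct(means, shop):
    """Relative difference (%) of the vertical mean over the horizontal mean."""
    horizontal = means[(shop, "horizontal")]
    vertical = means[(shop, "vertical")]
    return 100.0 * (vertical - horizontal) / horizontal


# Hypothetical rows; "shop" is an illustrative field and the values are made up.
rows = [
    {"shop": "shop_a", "rc_layout": "vertical", "rel_recommended_time_page_time": 0.30},
    {"shop": "shop_a", "rc_layout": "vertical", "rel_recommended_time_page_time": 0.50},
    {"shop": "shop_a", "rc_layout": "horizontal", "rel_recommended_time_page_time": 0.20},
    {"shop": "shop_a", "rc_layout": "horizontal", "rel_recommended_time_page_time": 0.20},
]
means = mean_by_layout(rows, "rel_recommended_time_page_time")
advantage = vertical_advantage_pct(means, "shop_a")
```

A positive value means the vertical layout attracted more relative cursor time than the horizontal one in the given shop; a negative value means the opposite.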
Pages on which users noticeably moved the cursor over recommending interfaces (a registered prod_recommended_time) outnumbered pages with the same activity over product review sections (a registered prod_review_time), as shown in Table 3. Recommending interfaces attracted customers more often in all stores, and Morele.net showed the highest advantage, 386%.

Table 3. Comparison of registered positive mouse activity between recommending interfaces and product review interfaces.

Conclusions
The research results presented in this paper, based on data from four real online stores, show that the layout of a recommendation interface in an e-commerce website is important for its attractiveness and may therefore have an influence on sales. The orientation of a recommendation zone seemed to have an impact on user behavior. Thanks to the presented methodology and tool, unobtrusive behavior tracking using DOM events, as well as the collection of physical parameters of recommending interfaces, was performed.
There are several main conclusions. On the basis of the preliminary study performed for four major e-commerce sites in Poland, we confirmed the advantage of vertical recommendation zones over horizontal ones using different measures calculated thanks to implicit activity tracking. Users spent relatively more time interacting with vertical recommending interfaces. This result is in line with our previous eye-tracking experiments [43], which suggested, among other things, that vertical recommending interfaces result in more interactions involving adding products to the cart. Recommending interfaces also tend to attract much more user attention than other page sections such as product reviews. Based on the results of the study, we also assume that implicit events-based activity tracking can be a valuable substitute for eye tracking, which can usually be performed only in closed laboratory conditions. Observing users' activity on e-commerce websites can allow us to build intelligent recommenders which will guide users through the cognitive process of discovering products and services that are best suited to their individual needs and preferences.
The main limitations of the study are the limited number of participants (85 volunteers) and the limited types of devices used. All the participants were using standard desktop or laptop computers operated with a mouse and keyboard. A similar study should also be performed for touch interfaces, especially for mobile devices such as smartphones, and for devices with larger touch screens such as 2-in-1 laptops. Devices with touch interfaces have a growing share in e-commerce web traffic, so performing a study for these devices is planned as part of our future research.
In follow-up studies, we are going to verify the attractiveness of different recommendation zones in more e-commerce stores that we are starting R&D cooperation with, probably using a hybrid implicit activity-tracking mechanism, based not only on DOM events, but also eye tracking. Events-based solutions will be used to track activity, including time spent displaying particular elements of recommending interfaces on personal computers as well as mobile devices and tablets, while eye tracking will be used for gathering more detailed data, focusing on visuals such as layout particularities and intensity, in order to find best approaches to solve the problem of optimal presentation of recommended items.
Author Contributions: Conceptualization, P.S.; methodology, P.S. and T.Z.; software, P.S. and T.Z.; supervision, P.S.; validation, P.S. and T.Z.; formal analysis, P.S. and T.Z.; investigation, P.S. and T.Z.; resources, P.S. and T.Z.; data curation, P.S. and T.Z.; writing (original draft preparation), P.S.; writing (review and editing), P.S. and T.Z.; visualization, P.S. and T.Z.; project administration, P.S. and T.Z. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki. Ethical review and approval were not required in accordance with the local legislation and institutional requirements.

Informed Consent Statement:
All participants were informed about the purpose and the procedure of the study and provided their informed consent prior to the experiment. The experiment did not involve any risk or discomfort for the participants.

Data Availability Statement:
The data presented in the study are available from the corresponding author upon reasonable request. The code used during the study is proprietary.