Consumer Attitudes toward News Delivering: An Experimental Evaluation of the Use and Efficacy of Personalized Recommendations

: This paper presents an experiment on newsreaders’ behavior and preferences on the interaction with online personalized news. Different recommendation approaches, based on consumption profiles and user location, and the impact of personalized news on several aspects of consumer decision-making are examined on a group of volunteers. Results show a significant preference for reading recommended news over other news presented on the screen, regardless of the chosen editorial layout. In addition, the study also provides support for the creation of profiles taking into consideration the evolution of user’s interests. The proposed solution is valid for users with different reading habits and can be successfully applied even to users with small consumption history. Our findings can be used by news providers to improve online services, thus increasing readers’ perceived satisfaction.


Introduction
Newspapers have been a privileged form of communication over the past hundred years or so. Over the course of its long and complex history, it has undergone many transformations. Although the first newspapers can be credited to the ancient Romans, only when printing techniques became more advanced, during the Industrial Revolution, newspapers turned into a more widely circulated means of communication. A second revolution took place with the technology advances in the Information and Communication Technologies, which transformed the newspaper industry into a digital, online news industry.
This novel paradigm has also been boosted by the requirements and expectations of a new generation of young adults, the Digital Natives, who grew up with the Internet and ubiquitous digital technology and devices. This part of the population has quite a different approach on learning, experiencing the world and reading habits. Digital Natives do not usually read traditional print newspapers, but yet, they can be remarkably informed and eager for information. It may be through browsing the headlines of online newspapers, reading the Web-links posted in social networks (Facebook, Twitter, etc.) or through headlines fed to their mobile phones. For these consumers, digital news introduce a number of powerful advantages that are traditionally absent in the printed environment: interactivity, non-linearity, immediacy in accessing information, and the convergence of text, images and video [1].
In this evolving scenario, mobile devices are the latest to challenge traditional news-delivering methods. As they become more powerful, with larger screens and ubiquitous Internet access, major media companies are investing more resources into creating new ways of reaching larger audiences via mobile platforms. The International Telecommunication Union (ITU)'s statistics (https://www.itu.int/en/ITU-D/Statistics/Pages/stat/default.aspx) show that there are more mobile device subscriptions in the developed world than there are people. This widespread use of smartphones and the different features offered, such as web browsing, gaming, video-messaging capabilities, allow keeping users up to date with the latest news, anywhere and at any time. As a consequence, the development of mobile news solutions has a significant impact on reader's news consumption patterns. Publico (http://www.publico.pt) a Portuguese newspaper, is a concrete example of this new scenario. Internal figures show that since the beginning of 2014, the number of visits coming from mobile devices experienced an increase of 130%, while in the same period the number of visits from desktop devices suffered a decrease of 36%. This scenario is confirmed by larger worldwide studies that show that, in 2019, two-thirds (66%) of the online readers used the smartphone to access news (+4 pp) [2]. Comparing with a previous report, the growth trend is visible: in 2014, a fifth (20%) of online news users across all countries identified the mobile phone as their primary access point [3].
Predictions show that the news industry should consider new ways of producing, consuming, and linking content, the integration of traditional journalism and social networks, the use of new navigation paradigms and the personalization of the services [4]. Some of these ideas are not pure speculation. For example, Publico reports that in some of its sections, more than 70% of the accesses come from Facebook while, in others, Google is the source used to find an article. The use of a responsive design is also coincident with the growth of mobile accesses.
One of the main open challenges is on providing users personalized access to the huge amount of information produced daily and on avoiding information overload that might contribute to a bad quality of the reading experience. Recommender systems can contribute to customer loyalty by creating a trust-added relationship between the brand and the customer and consequently to increase economic revenue, by providing readers what they would like to find. Personalized functionalities play an important role in building a successful business model for highly rated Internet sites such as Amazon, YouTube, Netflix, or Yahoo and those who cannot deploy them are at profound competitive disadvantage [5].
In this study we question different factors that may contribute to an effective personalization of news services. More precisely, the impact of recommended items, screen layout, and history of consumption on the users' satisfaction and profiling has been investigated. Conclusions were validated by a prototype built with a specific news' provider and a user testbed.
The remaining of the paper is organized as follows. Section 2 provides an overview on recommendation systems, their application in the domain of news and approaches to create user profiles. Section 3 presents the main research questions to be addressed in this study. The used recommendation and profile-building approaches are presented in Section 4. Section 5 goes into details on the experiment setup, including the characterization of the participants and the implementation of the methodology, namely the description of the tasks the users were asked to perform. Results are presented and analyzed in Section 6 while Section 7 provides an additional discussion on the findings and points out some guidelines for future work. Finally, Section 8 presents the main conclusions of the experiment.

Recommender Systems
Recommender systems are systems with the ability to provide suggestions or to direct a user to a service, product, or content, with the aim of maximizing the personal interest. Decades of research have produced multiple recommendation algorithms [6], mechanisms for evaluating their effectiveness, including user studies [7], online evaluations A/B tests and offline evaluations [8], and user interfaces and experiences to embody them [9]. Today, the interest in this area still remains high and current most highly rated Internet sites as Amazon, eBay, or YouTube use recommender systems as part of the services they provide to their subscribers.

News Personalization
The amount of available news has been growing continuously as news publishers keep producing new contents. At the same time, consumption of news articles has shifted more and more from the traditional printed newspaper toward the web. As traditional newspapers face a financial beating, the number of news media sources are expanding on the Internet, helped by low entry costs [10]. Simultaneously, alternative sources of news and information are becoming available to a potential vast audience via an increasing number of wireless devices, such as notebooks and smartphones. This information demand presents itself a key challenge for news providers on helping users finding, fast and accurately, which news best match their interests. Users usually have their own preferences concerning both news content and news sources. However, as databases of news sources change continuously, identifying and following every single news item that is published and that could be of interest to a reader is almost impossible [11]. Manually searching and analyzing the available news articles, to select those considered interesting, is not feasible within the time constrains common for most users. In this scenario, recommendation systems can largely improve the efficiency and accuracy of information acquired [12]. These assisting tools, that guide the reader in the content selection process, can contribute to increase his loyalty to a specific service/product and thereby contribute to improved cost-effectiveness for the service provider.
Additionally, in a scenario of an ever increasing deployment of news built on rumors instead of on substantiated facts, evaluating the credibility and trustworthiness of all that is made available on the Internet, is also an important aspect. Recommendation systems that implement mechanisms that find common characteristics of false or misleading information, can also play a role in the process of providing accurate information.
News personalization has been extensively studied from many perspectives [13] and several approaches can be found in the ways user profiles are generated and in the filtering techniques applied. One of the earliest examples of a news recommendation system includes GroupLens project [14], which focused on filtering news articles from Usenet. The proposed methodology works by first letting individual users read and rate online news articles and then predicting future ratings of articles based on the guess that users who had similar opinions in the past will probably have similar opinions in the future. For example, if Alice and Mary both rated a lot of news articles similarly in the past, then perhaps Alice will like the article that Mary just liked. Recommendation is then based on finding similarities among users, a process known as collaborative filtering.
Other representative news recommendation systems adopt a different methodology, recommending newly published articles similar to those a given user has read in the past [15,16]. In this more common approach (content-based filtering), the user's interests are inferred by considering the features (title, description, content, and category) in the items with which he/she interacted and favored in the past and providing similar items matching the previous preferences. Reference [17] presents a more refined approach by considering rich semantics and a domain ontology for delivering the most interesting news items to the user.
Alternatively, to using one specific algorithm individually, combining multiple methodologies has also been explored. The combination of content-based, collaborative-based, or popularity-based methods are common examples of hybrid recommendation solutions [18]. Other methodologies that combine time-based, georeferenced and social data are also emerging [19,20].
Methodologies considering user location [21], mobile devices [22][23][24], or social networks [19], have also emerged because of the dissemination of these devices and applications among the population. Besides the academic community proposals, several adaptive commercial news recommending systems, such as Google News and Yahoo! News, already provide personalized news services for a substantial amount of online users.

User Profiling
The user profile is the core element of most personalization systems. The first approaches of user modelling date back to 1979, when the need for personalization and distinct treatment of individuals has already been underlined [25]. User profiling can be understood as a set of data that represents user cognitive skills, intellectual abilities, intentions, learning styles, preferences, and interactions with the system [26]. Contextual features, such as mood or location, can also greatly improve the ability to more accurately model a user and improve recommendations. The main motivation of building user's profiles is that users differ in their interests and preferences and exploring these differences is vital for providing users with personalized services.
In order to construct an individual user profile, information may be collected explicitly, through direct and conscious user intervention, or implicitly, through agents that monitor user actions or activity. Explicit profiling requires the user's active participation and the information gathered usually includes demographic data, such as user's age, gender, job, hobbies, or location. Implementation of explicit feedback includes simple checkboxes, text fields, a binary like/dislike, or a simple technique that allows users to express their opinions by selecting a value from a range. Although explicit methodologies lead, in general, to more reliable profiles [27], they also present multiple disadvantages given that a significant amount of users do not want to spend their time on this or they are tough on providing this information [28,29].
In contrast, implicit user profiling is the process of analyzing users' activities such as time spent or navigational history, to determine what a user is interested in [30]. Information can be processed automatically and the profiles can be updated and adjusted continuously [28]. In many cases, monitoring is accomplished without the user's consent and remains transparent to him/her, removing the burden associated with having to provide personal information. Although less intrusive, implicit profiling may raise several privacy concerns [31]. Additionally, implicit profiles often lack accuracy and reliability, because the collected data might be noisy, e.g., long reading times do not always indicate high ratings.
Profiles that can be modified or improved are considered dynamic, in contrast to static profiles that maintain the same information overtime. Examples of static information includes personal background, personality, or demographic information, and is not subject to continuous updates [32].
In the news domain, information is released and updated continuously, and many stories only remain online for a short period of time until new details emerge. This dynamism is not as relevant in other content domains, such as movies [33], music [34], or books [35], making the news domain quite specific and with special requirements. Dynamic profiles, that take time into consideration, may discriminate short-term and long-term interests [31]. Usually, short-term profiles are constructed by considering only the most recent actions of the user and may only take into account data collected during the current session. In some situations, such as holiday shopping purchasing suggestions, short-term profiles may be the ideal setup. However, in most general use cases requiring deep personalization data, a short-term profile might be inadequate because it does not provide enough quantity or depth of information about the user.
In long-term approaches, the user profile holds the sum of the data collected since the user began to interact with the system. Long-term profiles are built over time by analyzing, for instance, the user's searches, browsing, clicks, or other actions. Since such information represents the permanent/long interests of the user, generally stable for a long period, long-term profiles are usually considered more precise and reliable. However, these techniques require additional maintenance for updating the profiles after some new user interaction being detected.

Research Questions
In the current scenario of moving news services into a digital business, requirements to assure the best navigation experience include graphical elements, language, and organization of the information. Information is usually organized in sections, each containing a number of topic links for the reader to choose from. The purpose is to provide a lot of options up-front, so as to attract a wide variety of audiences and direct them to the information of interest [36]. The question raised, in such scenario, is whether cataloguing some news with labels such as "What to read next" or "Recommended for you," may or may not compel the user to select those news, even unconsciously. What is expected is that, because of time constraints or unwillingness to search for content, users may tend to click the news catalogued as recommended, expecting that these articles match their personal interests. This theory is also supported by the principle of least effort which states that each individual adopts the course of action that minimizes his/her energy expenditure, even if it means accepting information with a lower quality [37]. Considering this predisposed stimulus, would the user notice and click on these same news even if they did not have any description or were not included explicitly in one special sections?
Our research aims at analyzing several aspects of online news consumption, namely: (1) Evaluate the role of recommendation strategies on the user perceived importance of the information; (2) estimate the impact that editorial decisions (e.g., the position or layout used to present the information) might have on the consumption; (3) understand how the previous points depend on the degree of the reader's loyalty toward the newspaper.
Reference [38] takes the subject of examining users' attitudes toward news personalization by presenting participants with different personalized news editions and compared them with a nonpersonalized edition composed by randomly chosen articles. Participants tended to prefer the personalized editions, but this tendency was small. According to the authors, these results are probably due to the relatively small differences between the personalized and no personalized editions and the possibility of users not having considered these differences or even failed to notice them. [39] provides valuable insight to examine whether personalization would influence behavior toward a news website. In this study, one group of participants was told the news website had been personalized to its interests. The other group served as a control and participants were given no such direction, and recommended stories were labelled as "Most viewed." Although the findings of this study corroborate earlier research suggesting that readers prefer news sites customized to their interests [40], the results indicate that the fact that the users are made aware of the recommendations does not affect their clicking behavior. This work tried to understand how much customization and what type is needed in order to have an effect, by analyzing user attitudes when using a Web portal with different degrees (low, medium, high) of customization. The question was whether a greater degree of customization engenders more positive attitudes toward a portal that delivers personalized content. The results suggest that with customization, users' attitudes toward the site become more positive, and that a higher level of customization translates into more returns visits to the site's homepage and fewer visits to other competing services.
The use of simple methods (most popular, most recent, random, etc.,) to provide a news personalization solution has also been analyzed [41,42]. The conclusions of these studies suggest that in these simple personalization methodologies the recommendation quality could depend on the context and specific news domain. For instance, the most popular algorithm can reach a good performance in pick hours. Our study intends also to analyze the impact of news personalization on user's satisfaction and behavior but goes further ahead by comparing different aspects and approaches. This includes not only variations in the algorithm used but also the influence of the layout over users having very distinct history of usage. The first aim of our study is to examine, from a set of randomly disposed and unlabeled news, which percentage of users will show a preference for the recommended news over other presented news.
RQ1: What percentage of readers specifically select recommended news from a set of alternatives (popularity based, random) presented?
The next question is related to the way the information is presented on the screen. Most online newspapers take into attention several usability aspects, including content organization, navigational links, interface design, etc., that may have a major effect on how information is presented to and accessed by the end-user [43]. Reference [44] describes a qualitative study of online news reading and browsing behavior. In this study, the author explores how the layout impacts the users' engagement with online news. Its outcomes demonstrate that the use of images and headlines lead people to the content of the articles. As the interactions with the news website continued, usability and the presentation of the information continued to contribute to users' experiences, but the novelty or personal relevance of the articles also influenced engagement. In order to better identify the elements mostly attracting the users' attention, the authors in [45] describe an eye-tracking experiment on users when reading an online newspaper. Four metrics of ocular behavior were considered: (a) number of fixations overall, (b) fixation duration, (c) gaze time, and (d) saccade rate. These measures were especially useful for identifying fixation areas as well as variability in participants' scan paths.
Although these studies are all concerned with layout questions, such as the position of various graphical elements on a page, the role of photographs versus text, or the role of color in the layout, it is not yet clear how it shapes the way users approach news information. Furthermore, these studies consider the newspaper homepage at the global level and failed in exploring interaction with a specific section, such as popular, recent or recommended news.
In order to study whether the position where the recommended items are displayed in the screen layout, for example at the top or at the bottom of the page, has a significant impact on the user selections, the following question is proposed.
RQ2: Does the position where the recommended news are displayed in the screen layout influence the user's preferences?
One of the key technical issues in developing applications for news personalization is the problem of how to construct accurate profiles of individual users and how these can be used to identify a user and describe his/her reading behavior.
Short and long-term profiles provide key information to a correct understanding of users' interests. If we look into football-related news, we may discover that a user has a short-term interest in the topic if he/she only reads news during the World Cup, or a long-term interest if he/she is an assiduous reader of this topic.
To understand how a user's interests change over time, the authors in [46] conducted a largescale analysis of anonymized Google news click logs. Based on the log analysis, this study concluded that clicks in old histories became much less useful in predicting future interests. It also showed that interests are also influenced by the user location and traveling users may show interest in nearby events or happenings. Although some studies addressed the evaluation of both short and long-term recommendation algorithms [47,48], the analysis of users' preferences on time based or on geo located recommendation approaches is still a largely open question.
In our study, user preferences on short-term, long-term and georeferenced news recommendations are also investigated.
RQ3: Which of the two recommendation approaches, short-term or long-term, better matches user preferences? RQ4: What is the degree of preference for news related to the country where the user is located when compared with user preferences analyzed in RQ3?
With the abundance of news sources, there is evidence that some segments of the population rely on multiple sources of news that are not limited to the ones provided by news agencies: information can be made available in social networks or through news aggregators [49]. This fragmentation means that a significant portion of the population is no longer concentrated in a few providers but is spread across a large number of channels, making it harder for news providers to monitor consumption patterns. However, there are still users who read news regularly and are loyal to a specific source, for example through some kind of subscription, or simply because they found that source trustful. Typically, these users have a rather high consumption history. Given this highly differentiated set of consumption patterns in an online newspaper, can different types of consumption profiles be considered? RQ5: Does the consumption frequency behavior influence significantly the preferences identified by RQ1, RQ3 and RQ4? RQ6: Can the set of users that have not selected the list of recommended news, show similar preferences on short/long/geo approaches as identified by RQ3 and RQ4? Figure 1 presents the different elements of the implemented recommendation system methodology used to analyze the stated research questions.

User Model Profile Construction
The degree of complexity of a user profile depends on the purpose of the system. According to [50] some important characteristics must be found in an accurate profile: (1) The profile must be able to represent multiple interests in different topics; (2) the profile needs to have the ability to be updated dynamically, reflecting shifts of interests. The topics a user is interested in can be relatively stable or vary slightly in a long-term perspective, whereas the content accessed by the user might change along different short-term perspectives [51].
In some contexts, it may be useful to complement the user profile with some personal attributes such as the user's age, gender, language or location. The most noticeable benefit of using personal information includes the ability to infer the user preferences without requiring a history of the user interactions. With the popularization of hand-held devices such as mobile smartphones, access to news articles is performed at any place. Considering such circumstances, the choice of news articles might not only depend on the usual interests of a reader, but also on his/her location [21]. For instance, a person on a business trip may be interested in news about the country to where he/she traveled and where he/she currently stays in.
For this study we defined different user profiles that enabled evaluating the three main components: short-term preferences, long-term preferences, and the influence of the user' location. In our implementation, the profiles are created by implicitly collecting information on the user reads. To enable deploying the algorithms in a real environment, aspects related to computational resources required were also taken into consideration.

Short-Term User Profile
The short-term profile tries to capture recent or seasonal interests. In this work, the short-term profile is based on registering the numeric ID of the set of news the user has recently read. This ID, assigned by the news provider, enables uniquely identifying each of the articles. The system maintains short-term profiles by considering only the most recently read news in one specific session: the data collected are only considered valid until the next user session. The use of a single session for creation of a short-term profile is supported by several authors [52,53].

Long-Terms User Profile
In order to represent long-term preferences, a keyword profile was adopted. A set of relevant keywords are extracted from the articles read by the user during his browsing activity and each profile is represented in the form of a keyword vector. Each keyword is associated with a numerical weight representing its importance (number of occurrences) in the profile and is expected to express a topic of interest.

Georeferenced User Profile
Although the geolocation module of a mobile device is able to return very precise information that enables identifying the city or even the street name, our implementation only considered the country where the user was located at. The reason for this is that news hardly have very detailed metadata schemas on location. Our approach can be refined and granularity of the location enhanced if news providers start following a strategy of more detailed description/identification of the event.

Delivering Personalized News: Recommendation Methodology
In order to produce news recommendations to a user, three recommendation methodologies were considered according to the different profiles: short-term, long-term, and georeferenced. The final recommendation list was constructed by combining all outputs from each of the recommendations' approaches.

Long-Term Recommendations
Long-term personalized news were obtained by a simple match of keywords/terms which coexist in the content of the candidate news article and in the user keyword profile. Despite the fact the profile may have a large number of terms, only the top N terms were considered not to scatter his/her preferences.

Short-Term Recommendations
Short-term recommendations are based on recommending news with similar topics to those the users have recently read. In order to estimate news similarity, a clustering methodology, applied to the entire publisher news database, was first performed to segment news into distinct topics called labels. The clustering algorithm consisted of five sequential phases: preprocessing (stemming and stop words removal), frequent phrases extraction (finding recurring ordered sequences of terms), cluster label induction (finding a term or phrase that can be used to describe a cluster), cluster content discovery (allocating items to the clusters), and finally, cluster formation (selecting the most representative clusters) [54]. Careful selection of cluster labels is crucial -the algorithm must ensure that the labels differ significantly from each other and at the same time, cover most of the topics in the input collection. For example, in a set of documents about European politics, examples of cluster labels can be Brexit, elections, bank, etc. The goal of document clustering is to automatically group a set of news about similar events into subsets or clusters, so that articles within a cluster have high similarity, but are very dissimilar to articles in other clusters. After identifying the news that the user has read recently, the system proposes similar news belonging to the same clusters associated with the reader. The items found correspond to short-term recommendations.

Georeferenced Recommendations
The most noticeable benefit of using location information in the area of recommendation systems is that a history of the user preferences may not be required. The Georeferenced approach recommends news related to the country where the user is located in. For example, if a Spanish user is in England on vacations, the system will recommend the most recent news about England.

Users Experimental Setup
A vast amount of methods to evaluate recommender systems has been proposed [5,55]. It is often easier to perform offline experiments using existing data sets and a protocol that models user behavior to estimate performance such as the prediction accuracy. A more expensive alternative is a user study, where a small number of users are asked to use the system and perform a set of tasks, typically answering questions concerning the quality of their experience. Alternatively, running a large scale experiment on a deployed system (usually referenced as online experiments) enables evaluating the performance of the recommenders on real environments and with users who are unaware of the conducted experiment [56]. Since these studies are usually difficult to implement and may have considerable costs, most experiments are usually restricted to a small set of subjects and a relatively small set of tasks. To overcome limitations and enable providing more accurate results, the test subjects must represent, as closely as possible, the users of the real system.

Participants and Online Resources
Our experiment comprised 37 volunteers, all self-declared readers of the online Publico newspaper. They were recruited through an email, containing evaluation instructions, sent to employees of a research institute. No reward was provided to the participants. The entire evaluation process was anonymous resulting in unknown participants' personal information. However, given the characteristics of the organization, it is expected that most of them were within the range of 25 to 40 year-old. The testbed included two tasks/experiments that the users were asked to perform in sequence. All the 37 subjects completed experiment 1. However, three of them gave up before performing experiment 2, lowering the number of participants in the second phase to 34.
News used for the experiment were fetched at each moment from the newspaper backend through an API made available for the experiment. This enabled creating a customized user interface with the layouts required to analyze the research questions while still assuring the information presented was dynamic, up-to-date and in line with the actual online newspaper.
Three different reading profiles prior to the experiment were identified: (1) Seasonal readers that had only read a very small number of news; (2) average readers having an intermediate amount of news accesses; (3) intensive readers, which have a high reading historical profile and are probably the kind of reader that consumes news every day. This characterization was made by automatically collecting cookies' information from their browsers. The final group of volunteers included 20 seasonal readers, 10 average readers and 7 intensive readers.

Research Methodology
The volunteers were asked to perform two different tasks. The first experiment consisted on presenting the user three lists of news corresponding to: (1) the output of the recommendation system, (2) the most popular news, and (3) randomly selected news. All the news were retrieved from the newspaper backend using the API provided. Items in the recommended list were selected from the ones available in the backend according to the recommendation algorithm suggestions. To keep a parallel with the real newspaper environment news were presented in the same exact way as they were displayed in the newspaper webpage which may had included an image, a title and/or the name of the section it belongs to. By clicking on one of the news, the user would have access to the full text. Each list contained five newspaper news, horizontally aligned. The ranking position of each list on the screen display was randomly generated in order to dismiss possible position influences, thus defining six experimental conditions ( Table 1). The only difference between these conditions was the position of each type of list (Recommended, Most Popular, or Random) on the screen. As delineated in Table 1, each type appeared once in each of the six possible positions. Each participant was randomly assigned to one of the six conditions. From Table 1, it is seen that the experimental conditions having the recommended news in the first position were most often tested. The participants were then asked to choose the list that best matched their preferences. The main purpose of this experiment was to evaluate the impact of the recommendation algorithm on the users' satisfaction in comparison with the other two simple approaches. In the second experiment, the user was shown 12 news from the three recommendations approaches (Geo located/Short-term/Long-term) merged into a single list and was asked to rate each one from 1 to 5, according to his/her preferences. The goal of this task was to evaluate the effectiveness of each of the approaches.

Results
For descriptive purposes, categorical variables were summarized by absolute (relative) frequencies and continuous variables by the mean (standard deviation). Chi-squared or Fisher test evaluated the independence between two categorical variables, depending on the situation. The existence of significant differences between the mean rating from experiment 2 was investigated by mixed-effects regression models. Data were grouped by individual and the variability introduced by the (non-fixed) position of the articles in the list was captured by a random effect. An autoregressive correlation of order 1 was found to best fit the data, as the model presented the lowest Akaike Information Criterion. Multiple comparisons between the three groups used Tukey contrasts. The significance level was set at 0.05. RQ1 asked about the percentage of readers selecting the recommended news from a set of several alternatives. The list including news from the recommendation algorithm was chosen most often, with 48.6% of the total number of selections, followed by the list corresponding to the most popular news (27.0%) and the list with news chosen at random (24.3%).  Regardless of the order of the alternatives on the screen display, the recommended list was, on average, more often chosen than the other alternatives once it obtained the highest values for the choice probabilities (Figure 2). The chi-squared test marginally rejected (p = 0.052) the null hypothesis of a uniform distribution for the choice probabilities, represented by a horizontal line in the figure.
RQ2 assesses whether the position where the information is displayed, in the screen layout, influences user selections. Figure 3 presents the observed probability of choosing the first, second, or third list on the screen display within each experimental condition. The figure shows that the lists positioned first and third were most often picked than those positioned second. However, this preference was not statistically significant (p = 0.137, chi-squared test; the horizontal line in the plot represents the uniform distribution) meaning that, as a whole, each position is equally liked by the consumers. This answers RQ2 and it can be said that there was no significant position influence on the user selections.
It can be observed that the largest probability of choice for the first position happened when the recommended list was placed there, i.e., for the condition Rec Rd MP. The largest probabilities of choice in favor of the second and third positions have also occurred for the recommended list but the second largest probability also included a preference for the list with the most popular news.
The participants were then divided into three groups, depending on the number of stories already read in their profile: (1) Seasonal readers that read less than 10 news; (2) average readers that read 10 to 40 news; (3) intensive readers that read more than 40 news As for the possible confounding effect of the readers' consumption profile on the list choice, Table 2 shows the preferences of each profile. It is seen that, regardless of the profile, most of the readers pick the list with the recommended items; even those with no regular reading habits. The Fisher test was not statistically significant (p = 0.637) implying that we cannot reject the null hypothesis that states the independence of the two categorical variables. This answers the first part of RQ5 showing that conclusions obtained for RQ1 are not affected by the reading habits. We now consider experiment 2, concerning a single list of 12 recommended newspaper news. RQ3 and RQ4 question which of the approaches, among short-term, long-term and georeferenced, show the highest preference. Table 3 indicates that the short-term recommendation algorithm collected the highest ratingsthe average (standard deviation) rating for the short-term recommendations was 3.4 (1.1), while for the long-term recommendations the average rating was 3.0 (1.2) and for the georeferenced approach it was 2.5 (1.2). The existence of significant differences between the mean ratings from the approaches was investigated by mixed-effects regression models. All pairs were found to have significantly different means (Short vs. Long: p = 0.044; Short vs. GeoReferenced: p < 0.001; Long vs. GeoReferenced: p = 0.003), with the highest mean rating for the short-term approach.
An additional effect of the readers' consumption profile (seasonal/average/intensive) on the ratings was not statistically significant (p = 0.532, 0.282), nor its interaction. In particular, the lack of statistical significance for the interaction means that the ratings of the readers toward the three recommended approaches are essentially independent from their reading profiles.
RQ5 evaluates if news consumption frequency behavior influences significantly the preferences identified by RQ1, RQ3, and RQ4.
For the most popular and random list, no significant divergence between profiles was noticed: most popular news reached around 30% of the selections, while random news obtained around 15% of user selections. The results presented suggest that there is no significant influence of the length of the users' interactions with the system in the past and the performance of the developed approach that always outperforms the simplest approaches of providing news. Figure 4 illustrates these results. The highest preferences were obtained for the short-term recommendation methodology, independently of the user profile, while the lowest value is associated with the georeferenced approach. However, it is noticeable that for intensive readers, the divergence between the rating average for the short-term approach and the other methodologies is larger than for seasonal or average reader profiles. RQ6 questions if the set of users who do not select the recommended proposals have similar preferences to the ones identified by RQ3 and RQ4. Figure 5 focuses on the average ratings obtained for the different recommendation approaches when seen from users that did not select the recommended list during the first experiment. The results show that, in contrast with those having selected personalized news, users who had opted for the most popular news do not show a clear preference for any of the recommendation approaches. Users who had selected the random list show a slight preference for the short-term approach, followed by the long-term and finally by the georeferenced approach, but the preference pattern is much less evident compared with those who prefer recommended news. Additionally, users who selected recommended news have the lowest average rating value for the news provided by the georeferenced methodology. These results provide some evidence that users who have a notorious preference for news related to their profile, do not show much interest in more general news, such as those generated by the georeferenced approach, which does not differentiate users but just their location.

Discussion
In this study, various research questions have been formulated in an attempt to analyze the newsreaders' behaviors and preferences in the interaction with online personalized news.
The first results clearly suggested that, among a set of news to choose from, users will more likely read recommended news instead of popular news or other generic content that shows up on the screen. Results are consistent independently of the position of the recommendation list in the experiment. This can be interpreted as an indication that if there are recommendations available, information overload does not affect the user capability to locate his/her interests. It seems that, even if the personalized articles are presented in a basic and poor newspaper design, randomly arranged and without any kind of description, users will still feel motivated to choose those articles without knowing they are recommendations.
At the same time, findings show that generic recommendation techniques, as recommending the most popular items, cannot compete with recommendation approaches based on the user profile. The rationale is that a system that routinely recommends popular or common items, even when achieving high numerical accuracy values, can be of little value to users [57]. However, recommending popular news still seems to be a better approach than suggesting news without any pre-selection, since the random news list scored worse. These findings provide some support to the hypothesis that the online newspapers that do not implement or only implement low-level personalization systems will not have as many readings as newspapers having more complex and personalized recommendation systems. This hypothesis is in line with the research of [58] that demonstrated that online product recommendations greatly influence subjects' product choices and could make a product to be selected twice as often if it is recommended.
Our study also shows that there is no significant change in the pattern of results for seasonal, average or intensive readers. This is highly encouraging, since it suggests that regardless of the number of interaction of a user with the system, or whatever his news consumption behavior is, recommendation systems can be effective in increasing the perceived attractiveness of online news services. Table 4, summarizing results from experiment 2, shows that, regardless of the consumption history, there was no change in the user preferences pattern, indicating that the short-term recommendation methodology provides the highest preferences (rating average). These results demonstrate the users' higher interest for news similar to those they have read recently when compared to personalized news based on their long-term profile. This conclusion is in line with [46] that analyzed anonymized Google News click logs. It is noticeable that for intensive readers there is a more evident preference over short-term recommendations when compared with results from the other methodologies, showing a stronger preference for news that capture their recent interests. The potential explanation for these general findings is that users who read news intensively and with a regular frequency can have their preferences better defined. Since the profile of this class of readers is created from a larger amount of collected information, their long and short-term profiles can be better distinguished. The low performance of the georeferenced algorithm may be due to the use of only generic information about the country where the user is located in. This is a current limitation that might be improved in the future if additional information is introduced by the news provider, including, for example, region-based or town-based locations. Given that, in the testbed, the users were located in the same country, this country-only-geolocation limitation may have had an impact on the evaluation. Having a broader representation of geo-locations among the testers will be considered in a future experiment Considering the first evaluation experiment, an additional analysis associated with the question on whether the users who do not select the recommended list have the same recommendation approach preferences as those who selected the recommended list, demonstrated that different types of news selection reflect different preference patterns. The findings from this analysis suggest that there is an influence in the average rating assigned to the recommendation approaches in the second evaluation experiment depending on the list that was selected in the first experiment. In particular, users who had selected most popular news do not show a clear preference for any of the recommended approaches. This can be interpreted as an indication that, in some cases, the social influence represents a crucial role in user readings decisions [59]. Popular news articles are usually the most commented, shared in social networks, and also most discussed with others. Users who matched their preferences only with the crowd will present a profile that is mainly dependent on the preferences of other newsreaders. In such cases, the users will be unable to demonstrate an evident interest by any of the recommender approaches, since none of them corresponds to their real interests.
Results also show that users who have selected recommended news are the ones less keen on the georeferenced methodology. This provides some evidence that users who have a notorious preference for news related to their profile do not show much interest in broader news, such as those generated by georeferenced approach.

Conclusions
The findings from this study provide important information about the impact of online news personalization and on how some factors may affect the creation of positive perceptions. It can contribute to improve both the recommendation approaches used by news providers as well as the design of online layouts. Some improvements can still be considered, the most important being setting up a large scale user test. The main problem for this additional validation is related with difficulties on implementing non-intrusive experiments that might otherwise result on user discontent when using the service. Additionally, as the current implementation relies on the usage of cookies, which was acknowledged and consented by the users, having a less controlled and larger experiment with users might require using a different methodology. The geo-referenced recommendations approach can also be improved by identifying additional aspects described in the news (e.g., strikes, storms, terrorism attack, etc.,) that might have an impact in the traveling plans of a user. Incorporating contextual information, such as social, or personal context, into the recommendation process and studying the influence of news recommendation on a reader's credibility perception will be considered for future research.
Funding: Paula Viana and Márcio Soares were partial supported by Project "TEC4Growth-Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER-000020", under Research Line FourEyes, financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF). Paula Viana has also been supported by National Funds through the Portuguese funding agency, FCT -Fundação para a Ciência e a Tecnologia, within project UIDB/50014/2020. Rita Gaio was partially supported by CMUP, which is Financed by national funds through FCT-Fundação para a Ciência e a Tecnologia, I.P., under the project with the reference UIDB/00144/2020. Amílcar Correia was partially supported by the Project Pglobal (Nr. 2014/38592-Programa Operacional Temático Factores de Competitividade/Programa Operacional do Norte, Funded by ERDF).