Article

Adaptive User Profiling in E-Commerce and Administration of Public Services

by Kleanthis G. Gatziolis, Nikolaos D. Tselikas and Ioannis D. Moscholios *
Department of Informatics and Telecommunications, University of Peloponnese, 221 00 Tripoli, Greece
* Author to whom correspondence should be addressed.
Future Internet 2022, 14(5), 144; https://doi.org/10.3390/fi14050144
Submission received: 6 April 2022 / Revised: 3 May 2022 / Accepted: 4 May 2022 / Published: 9 May 2022
(This article belongs to the Special Issue Automating Process of Big Data Analytics Using Service Composition)

Abstract

The World Wide Web is evolving rapidly, and the Internet is now accessible to millions of users, providing them with the means to access a wealth of information, entertainment and e-commerce opportunities. Web browsing is largely impersonal and anonymous, and because of the large population that uses it, it is difficult to separate and categorize users according to their preferences. One solution to this problem is to create a web-platform that acts as a middleware between end users and the web, in order to analyze the data that is available to them. The method by which user information is collected and sorted according to preference is called ‘user profiling’. These profiles could be enriched using neural networks. In this article, we present our implementation of an online profiling mechanism in a virtual e-shop and how neural networks could be used to predict the characteristics of new users. The major contribution of this article is to outline the way our online profiles could be beneficial both to customers and stores. When shopping at a traditional physical store, real time targeted “personalized” advertisements can be delivered directly to the mobile devices of consumers while they are walking around the stores next to specific products, which match their buying habits.

1. Introduction

The Internet today is a technological and social phenomenon. It affects everyone’s daily life and has had significant social impacts. Huge amounts of data and information are being uploaded to the internet every day. Businesses want to maximize their profits by advertising their services or products to targeted customers, while Internet users want to avoid receiving irrelevant information from Internet search results. It is necessary to predict users’ needs to improve their browsing experience and provide them with valuable data. The solution to both problems described above is web personalization via user profiling [1,2,3].
A User Profile is a group of items and/or patterns used to describe the user briefly. User Profiling is an especially critical procedure for e-business systems: it captures online users’ attributes, helps the system get to know its users, supports tailor-made goods and services, and therefore improves user satisfaction.
To conduct our research, we contacted the major superstores in Greece, asking for information on the way they have created their online user profiles. Our results show that while stores do allow users to register and create new profiles, there are times when customers provide false data. This problem can occur when no online verification process is in place. So, a question we must investigate is: which registered customers are supplying accurate online information?
“User profiling techniques have widely been applied in various e-business applications, e.g., online customer segmentation, web user identification, adaptive web site, fraud/intrusion detection, personalization, e-market analysis, recommendation, as well as personalized information retrieval and filtering” [4].
User Profiling can be defined as the course of pinpointing the data about a user interest domain [5,6]. This data can be used by the system to grasp more about the user and be further utilized to better meet the user’s needs.
In this article, we present the implementation of an online profiling mechanism in a virtual e-shop, report its success rates, and discuss how neural networks could be used to predict the characteristics of new users. We also indicate the way our online profiles could be of benefit both to customers and stores through real time “personalized” advertisements targeted at customers shopping in physical stores. This proposal is significant since it could redefine the way we shop at physical stores. If the real online profiles of the consumers are known, then we could use them to promote specific products to certain customers in real time while they are shopping. A lot of research has already been conducted both on the techniques of user profiling in online shops and on the techniques of user profiling in physical shopping, but separately. The main objective of this article is to fill this research gap by joining the two approaches in order to increase the profits of businesses and the affordability for customers through personalized price offers.
The rest of this paper is organized as follows. Section 2 reviews some related work and introduces the theoretical basis. Section 3 describes our proposed model, and Section 4 describes the experimental setup as well as the results. Finally, Section 5 concludes the paper.

2. Related Work

2.1. User Profiling

A user profile is a visual representation of the personal data associated with a particular user, or a customized interface [7]. That is, a profile is the digital representation of an individual’s identity. However, it can also be considered as the representation of a user model.
A profile stores the description and characteristics of the individual it represents. These facts can be utilized by various systems that take into account people’s attributes and preferences. This is why profiles are essential for a modern system, as the information found in the profile is personalized, thus enabling us to distinguish and group them.
There are two phases in acquiring the user profile. In the first phase, the user is asked explicitly to enter his/her initial profile, which he/she can also amend by hand. Because users may not be able to enumerate all their interests at once, their browsing history is used to update the profile. The second phase (user profile acquisition) monitors the browsing behavior of the user and, through content analysis, successively acquires data on the user’s interests.
The information contained in a profile can be either dynamic or static. In the first case, the profile is called dynamic, and this means that the information can change over time [8,9]. These changes usually occur depending on the actions that the user takes in the system, and the user usually cannot view or change this information directly. In contrast, in the second case, where the profile is called static, the information in the profile remains constant for a long period of time and rarely changes [8,9]. Such a profile will contain mainly demographic data about the user, such as name, age, height, etc. In many systems, the advantages of static and dynamic profiles are combined, thus making the profile hybrid [5,10]. Profiles can be found in operating systems, computer programs, recommendation systems, computer games, etc. [11].

2.2. Profile Structure

According to the previous description referring to the characteristics of the user’s profile, we can divide the profile into subcategories, namely, the basic and the extended profile, respectively [12]. The virtual identity is the first thing that the user selects, and it refers to the user’s ID. This identity is permanent and does not change, whereas it is the user’s choice whether he wants a pseudonym or his real identity. The basic profile is the one containing the user’s very basic information (demographic data) and can usually be altered, although rarely, in accordance with the user’s needs.
The extended profile contains information that changes over time and is not specified when the profile is created. The information can be changed, or new information can be entered, making the profile dynamic. Interaction with third-party profiles and policies requires settings related to data security and user privacy as to who can use this information. As all these features form the structure of an integrated profile, there are also different profile design patterns or often a mixture of these patterns.
Static models are the basic types of user profiles. In them, the main data are collected and will not change again, i.e., they are static. Changes in the user’s choices are not registered in the system and no algorithms are used to parameterize the profile.
Dynamic models allow a more up-to-date representation of users. Changes are often made to them over time and through the user’s interaction with the system. These profiles are particularly useful in adaptive hypermedia as they are updated to take into account the current needs and goals of the user.
Hybrid models are those that combine static and dynamic models according to the needs of the system.
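The three model types can be illustrated with a minimal sketch. The class and field names below are hypothetical and chosen only for illustration; they are not taken from any of the cited systems. The static part is frozen after registration, the dynamic part accumulates interaction counts, and the hybrid model simply holds both.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StaticProfile:
    """Demographic data captured once at registration (hypothetical fields)."""
    user_id: int
    name: str
    birth_year: int


@dataclass
class DynamicProfile:
    """Interest counts that change as the user interacts with the system."""
    category_views: Counter = field(default_factory=Counter)

    def record_view(self, category: str) -> None:
        self.category_views[category] += 1


@dataclass
class HybridProfile:
    """A hybrid model holds a static part and a dynamic part side by side."""
    static: StaticProfile
    dynamic: DynamicProfile = field(default_factory=DynamicProfile)

    def top_interests(self, n: int = 3) -> list:
        # Most frequently viewed categories first.
        return [cat for cat, _ in self.dynamic.category_views.most_common(n)]


profile = HybridProfile(StaticProfile(user_id=1, name="Alice", birth_year=1990))
for category in ["toys", "sports", "toys"]:
    profile.dynamic.record_view(category)
```

Freezing the static part makes accidental changes to registration data fail loudly, while the dynamic part stays freely updatable.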

2.2.1. Profile Monitoring

In order to analyze a profile, it must first be extensively monitored and all the user’s actions over time must be recorded [13]. Monitoring a profile consists of three processes:
- Direct monitoring of the use of the application by keeping a history of the usage pattern.
- Storing the history by the system to avoid failures.
- Immediate feedback on the performance of the service.
Of course, this information is particularly sensitive, as the risk of user privacy violation is high; monitoring therefore raises ethical and legal issues regarding privacy [14].

2.2.2. Data Collection

After having created a user profile, the next step is to collect information about the user so that it can eventually be analyzed. There are several ways to collect information about users, with some of them discussed below [15].
The easiest and quickest way to collect information is through direct user interaction with the system, where the user is asked to answer a series of questions that will help the system “learn” about him/her. This process usually takes place during registration with the system, at which point the user is asked to fill in forms or other interfaces that serve this purpose. Much of this information is usually optional, as users may not be willing to fill out lengthy forms, and it rarely changes over time. In general, it consists of demographic details, such as the user’s age, marital status or sex.
However, there are several problems with collecting information in the first way, as users may not want to provide much data. This has led to a second way, which learns the user’s preferences by observing the user interacting with the system. In this case, the system does not explicitly request information about preferences from the user. Instead, the information accrues as the user navigates through the system and implicitly makes decisions. Thus, the system learns dynamically from observing their interactions. For the system to learn a profile in this way, the user’s behavior should be repetitive, i.e., the user’s actions should be performed under similar conditions at different points in time.
There is also a third hybrid mode which is a combination of the two above [16,17]. That is, data are collected not only by asking the user to answer questions directly, but also during the user’s interaction with the system. This mode combines the advantages of the two previous ones, thus making it ideal for most profiling systems.
Each method has its advantages and disadvantages. The first method is usually the best when data need to be collected quickly, but there are several problems. First, it lacks the ability to adapt to changes and user preferences. Secondly, it is highly dependent on the user’s willingness to provide the information and it is likely to become invalid after a period of time. Third, users may not write true information on the forms and those who are willing to provide true information may not know how to express their interests. However, users have full control over the information collected and it is their decision what they want to share with the system.
In the second method, the information is gathered by observing the user’s movements in the system, so it takes more time to gather information, and this information cannot be changed or seen by the users. Moreover, if there is no repetition in the user’s actions, the pattern cannot be discovered. However, this information can be easily and automatically changed so that the system is always aware of and more accurate regarding the user’s preferences. This could be a simple case of using cookies to store and track visits from particular users, including the pages and products viewed, or it could be something more advanced such as eye movements, or even motion detection [18].
Cookies could be used to save some basic information and preferences about users, such as their individual login information or favorite sports or politics. They could also be used for personalization issues. As customers are browsing in e-shops and viewing certain items or parts of a site, cookies could be used to help build targeted ads. Finally, cookies could be used to track items users previously viewed, allowing the e-shop sites to suggest similar goods they might like and keep items in shopping carts for future reference.
However, we must keep in mind that cookies have some negative aspects as well. Many users regularly delete cookies from their browsers. Others will not allow cookies to be stored on their machines for security reasons. There are some privacy aspects to be taken into consideration too. Third-party cookies are generated by websites that are different from the web pages users are currently surfing. This is because they are linked to ads via that page. An e-shop with 20 banners/advertisements may generate 20 cookies, even if users never click on those ads. These cookies could let advertisers or analytics companies track and analyze an individual’s browsing history. Finally, as mentioned in the above paragraph, we cannot store advanced information in cookies about customers such as eye movements, or even motion detection. Consequently, it is better and more secure to store user’s profiling details in a server-recommendation system.
For all the above reasons, we chose for our implemented recommendation system to use cookies to store only some basic information about users such as their login data, and we keep all the important details and the analysis of the customers such as parenthood, gender, interests, etc., in our system.
The hybrid method attempts to combine the advantages of the first two methods by directly asking users to provide as much information as possible, and then the system, observing their interaction, adjusts the user’s profile according to their preferences. In Table 1, a comparative list of profile types in relation to the researched literature is presented.
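As a rough illustration of the hybrid method, the sketch below starts from explicitly declared interests and adds only those categories that the user has viewed repeatedly. The function name and the repetition threshold are assumptions made for this example, not part of the system described in this article.

```python
from collections import Counter

MIN_REPEATS = 3  # assumed threshold: how often an action must recur to count


def update_interests(declared, viewed_categories, min_repeats=MIN_REPEATS):
    """Merge explicitly declared interests with ones inferred from behavior.

    declared: set of interests the user entered in the registration form.
    viewed_categories: list of product categories the user browsed.
    """
    counts = Counter(viewed_categories)
    # Only recurring behavior is treated as evidence of an interest.
    inferred = {cat for cat, n in counts.items() if n >= min_repeats}
    return set(declared) | inferred


interests = update_interests(
    declared={"books"},
    viewed_categories=["toys", "toys", "garden", "toys", "books"],
)
```

Here a single visit to "garden" is ignored, which reflects the point made above that a pattern cannot be discovered without repetition.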

2.2.3. Data Analysis

Data analysis is a process for inspecting, cleaning, transforming and modeling data in order to discover information that is useful for decision making by users. Data analysis can be distinguished into several phases as shown below [19].
Data collection, as presented above, follows the definition of the requirements that guide the data analysis.
Data processing includes the phases where raw information is processed and converted into information which is ready to be analyzed. This may involve entering data into rows and columns in a tabular format, such as a spreadsheet or database.
Data modeling is the process wherein mathematical formulas or algorithms are applied to the data to display the relationships between variables so that the information can be ultimately visualized to be understood by the user.
However, all of the above depends on the initial phase of data analysis which consists of four questions. These questions have to do with the quality of the data, the quality of the measurements, data transformation and whether the collected information meets the requirements of the survey design [20].

2.3. User Modeling

User modeling is a part of human–computer interaction and describes the process of creating and modifying a user model [21]. The main goal of user modeling is to adapt systems to the specific needs of the user. The system must appear to be built for each individual user, while it is built for hundreds of millions of users. That is, it should say “the right thing, at the right time, in the right way” [22].
User modeling consists of two main categories. The first is the user model, which is the set of information that makes up the user profile, and the second is data collection. The set of information that makes up the profile is all the data that make the profile distinct from the rest. Data collection is also a separate chapter in itself, as through it we can extend the information we have about a user either by asking the user to provide it or by tracking the user’s actions in the system. The latter is extremely important for a system that can adapt to the user’s needs [23].
A very simple example of user modeling is e-commerce websites that use all the information about a user’s browsing and shopping and combine it with information from other users in order to better understand their shopping preferences. Thus, the system can easily suggest possible products that may be of interest to users.

Types of Data in User Models

User data includes data about users’ interaction with the system [24]. Each user model is built from these data, which is what distinguishes it from the rest. The following are the types of data that can be incorporated into user models.
Demographic data has information about the first name, last name, age, height, weight, gender, nationality, place of residence, etc. These data can be expanded and modified to a huge extent depending on the requirements of the application. Usually, they form the static part of the profiles as this information changes very rarely to never. By looking at these elements, we can group the users of the system according to their profile and look at their actions individually. This, again, could be useful in an e-shop system as, for example, we could look at the shopping preferences of the two genders separately.
Knowledge or background data is perhaps one of the most important types in user models. These data are usually subject to frequent changes and are determined in the short term, thus forcing systems to be dynamic. This means that the system should detect the changes in the knowledge acquired by the user by observing the user’s movement and choices in the system and adjust the data to make it more useful to the user.
Interest and preference data are the most important pieces of information in systems that filter information, such as recommendation systems. However, it is usually different from demographic information, as the user does not need to be asked about it. Instead, by observing the recurring patterns in users’ actions, an ideal system could infer the user’s interests on its own.
The user’s individual traits are the set of user characteristics (extrovert, reactive, etc.) that are not subject to any change or that change over a long period of time. That is why many such systems with this kind of information can be static. Examples of such systems are specially designed psychological tests. As before, this information differs from demographic information, as here too it is particularly important to observe recurring patterns in the actions of users.

2.4. Uses of User Model Data

We have analyzed the profiles and the information that populates them. A modern profile should have information that has been gathered either dynamically or statically and this information should form a personalized profile of the user. Once a system has gathered information about users, it can begin to present the data or even use it to its advantage. Profiling can be used, with many important benefits, in several applications, some of which are presented below.

2.4.1. Expert Systems

Expert systems are computer systems that can mimic human decision-making to help solve a problem in a particular area. These systems work by asking questions step by step to pin down the issues that come up and find solutions [25]. User models can be used to match the user’s current knowledge and differentiate between experienced and novice users. The system is able to conclude that skillful users are in a better position to understand more complex queries than someone who is new to the domain. Thus, it adapts its vocabulary and the queries it uses to find a solution.

2.4.2. Recommendation Systems

Recommendation systems are application tools and techniques that give suggestions for objects that a user might want to use. These recommendations may be decisions that the user wants to make, such as: which is the best purchase, what kind of music he/she would like to listen to, or what news to read [26].
The basic idea is to present a selection of items that best fits the user’s needs, which are determined based on analysis of the user’s profile during profile creation or while navigating the application.
Recommendation systems have become prevalent nowadays and are widely used in a variety of applications. The most popular applications are probably movies, music, news, books, research articles, search engine queries, products, etc. A typical example of a recommendation system is the www.stumbleupon.com (accessed on 5 April 2022) website system, which uses the web ratings gathered by a collaborative rating system that can match users with interesting websites based on their preferences.
For example, for two users with the same preferences, a recommendation system is capable of suggesting something that may be of interest to the second user, depending on the data provided from the first one. Figure 1 shows two people with the same preferences (they look almost the same, they have similar ages, they are of the same gender, they probably like similar clothes) and how a recommendation system is capable of suggesting something that may be of interest to User B based on the data provided from User A.
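The idea of suggesting items to User B from the data of a similar User A can be sketched with a very small user-based collaborative filter. This is a generic textbook technique shown under simplifying assumptions (hypothetical ratings, a single nearest neighbor, cosine similarity); it is not claimed to be the method behind Figure 1.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, others):
    """Suggest items the most similar user rated that the target has not seen."""
    neighbor = max(others.values(), key=lambda ratings: cosine(target, ratings))
    unseen = {item: r for item, r in neighbor.items() if item not in target}
    # Highest-rated unseen items first.
    return sorted(unseen, key=unseen.get, reverse=True)


# Hypothetical ratings on a 1-5 scale.
user_a = {"jacket": 5, "sneakers": 4, "scarf": 5}
user_c = {"blender": 5, "kettle": 4}
user_b = {"jacket": 5, "sneakers": 4}

suggestions = recommend(user_b, {"A": user_a, "C": user_c})
```

User B shares two rated items with User A and none with User C, so the only candidate suggestion comes from User A.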

2.4.3. User Simulation

Since modeling a user gives the system an internal representation of a particular user, user simulation allows us to perform usability testing. These tests evaluate a product by trying it out on simulated users, thereby providing a basic idea of how real users would use the system, and they focus on measuring how well the product satisfies its users [27]. Examples of products that benefit from these tests are websites, food, consumer products, computer interfaces, etc.

2.5. Knowledge Extraction

Knowledge mining in Computer Science (also called knowledge discovery in databases) is the process of detecting interesting and useful patterns and relationships in large volumes of data [28]. The field of knowledge mining combines artificial intelligence tools and techniques with database management and is widely used by businesses (insurance, banking, etc.), in scientific research (medicine, physics, etc.) and in government security systems (detection of crime and terrorism). Thus, using clustering or classification algorithms, knowledge is extracted to help humans make appropriate decisions.
Companies’ transactional data have grown significantly; thus, the demand for more sophisticated systems capable of discovering the knowledge contained within those data has come to the foreground. A successful application of data mining was the detection of credit card fraud. The system studied each consumer’s buying behavior and derived a pattern for it. Any purchase made outside this pattern led to an investigation.
The complete data mining process involves multiple stages. First comes information gathering and pre-processing, in which the set of information to be examined is assembled before any data mining algorithm is applied. Then, the data are processed by the mining algorithms, and the results are interpreted. The main techniques used in this process are discussed below.
Predictive modeling is used when we aim to estimate the value of a particular attribute and we already know some of its values. An example is data classification, which takes a set of data that have been sorted into predefined groups and looks for patterns in the data that differentiate these groups. The discovered patterns can then be reused to classify other data whose group label is unknown. For example, a manufacturer may develop predictive models to distinguish which parts fail in extremely hot or cold temperatures.
A second technique is descriptive modeling or clustering, which also subdivides items into groups. With clustering, the appropriate groups may not be known in advance; they are discovered by analyzing the data. For instance, an advertiser may analyze a general population in order to categorize potential consumers into several groups and then develop separate advertising campaigns [28]. Figure 2 shows the clustering into groups.
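To make the clustering idea concrete, here is a minimal one-dimensional k-means sketch. The spending figures and the starting centers are invented for illustration, and k-means is only one of many clustering algorithms; the source does not say which one Figure 2 depicts.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center, then re-average."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical monthly spend (in euros) of eight consumers.
spend = [12, 15, 14, 90, 95, 88, 13, 92]
centers, groups = kmeans_1d(spend, centers=[0, 100])
```

The two groups (low and high spenders) are not given in advance; they emerge from the data, which is what distinguishes clustering from classification.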
The next data mining technique worth mentioning is pattern mining. This technique focuses on identifying rules that describe specific patterns within the data. It is often used by stores trying to find out which products are commonly purchased together. Although such insights can be tested without the help of an application, data mining has facilitated the discovery of associations in less obvious datasets. Figure 3 illustrates in a simple way how the pattern mining technique is applied to the data.
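A simple form of pattern mining counts how often pairs of products appear in the same basket and derives the usual support and confidence measures. The baskets below are hypothetical, and this pair-counting sketch is a simplification of full association-rule algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
]

item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def support(pair):
    """Fraction of baskets that contain both items of the pair."""
    return pair_counts[tuple(sorted(pair))] / len(baskets)


def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    both = pair_counts[tuple(sorted((antecedent, consequent)))]
    return both / item_counts[antecedent]
```

Calling `confidence("butter", "bread")` asks how often bread was bought given that butter was bought; note that the measure is not symmetric, so `confidence("bread", "butter")` can differ.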

2.6. Similar Systems

2.6.1. The WEST System

When reviewing user modeling systems, it is important to refer to early systems that became pioneers in their field. One of these was the WEST system [22].
The WEST system was a tutorial for a game called How the West Was Won. In this game, players spin three spinners and have to create an arithmetic expression from the numbers spun, using +, −, ×, / and appropriate parentheses, to determine how far they move. So, if, for example, the player spun 2, 3 and 4, they could create the expression (2 + 3) × 4 = 20 and advance 20 places. If a player lands on a city (i.e., every 10 places), he automatically advances to the next city, and if he lands on an opponent, the opponent is sent back two cities. The optimal strategy therefore requires the player to calculate all possible moves and pick the one that puts him furthest ahead of his opponents. By analyzing the players’ moves, the system discovered that the most popular strategy was to add the two smallest numbers and multiply the sum by the largest.
Although the WEST system explored only some of the basic concepts of user modeling and its results were limited, it worked very well at analyzing player behavior in a way that users could understand.
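The move calculation described for the WEST game can be sketched by enumerating every expression that uses each spun number once. This is an illustrative reconstruction, not the original WEST code; it assumes that only whole-number results count as moves and uses exact fractions to avoid rounding problems with division.

```python
import operator
from fractions import Fraction
from itertools import permutations, product

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def possible_moves(spins):
    """Map each reachable whole-number value to one expression that yields it."""
    moves = {}
    for a, b, c in permutations([Fraction(n) for n in spins]):
        for (s1, f1), (s2, f2) in product(OPS.items(), repeat=2):
            # The two ways of parenthesizing three operands.
            candidates = [
                (f"({a} {s1} {b}) {s2} {c}", lambda: f2(f1(a, b), c)),
                (f"{a} {s1} ({b} {s2} {c})", lambda: f1(a, f2(b, c))),
            ]
            for text, evaluate in candidates:
                try:
                    value = evaluate()
                except ZeroDivisionError:
                    continue
                if value.denominator == 1:  # assumed rule: whole numbers only
                    moves.setdefault(int(value), text)
    return moves


moves = possible_moves([2, 3, 4])
best = max(moves)
```

For the spin 2, 3 and 4, the popular "add the two smallest and multiply by the largest" move gives 20, whereas multiplying all three numbers gives 24, so the popular strategy is not always the largest move. The largest move is not necessarily the best one either, because cities and opponents on the board change the outcome.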

2.6.2. The Gumsaws System

The Gumsaws system was created to support the construction of adaptive web pages [29]. This system was able to meet the scalability, replaceability and adaptability needs of a website by modeling users. It did this by using knowledge mining techniques to learn the user’s navigation history.
The Gumsaws system had features to create a profile or group of profiles and to store, retrieve, update and delete entries. These functions were performed by the system using various sources of information, such as direct information which came directly from users, group information which came from users’ navigation history and correlations between them. Thus, the system could be used by news systems and served its users according to their preferences.

2.6.3. The CATS System

The Collaborative Advisory Travel System (CATS) was proposed as a way to plan ski holidays for a group of friends [30]. It allowed a group of users to work together at the same time in order to choose a ski vacation package that satisfied the whole group. The system was built around the interactive DiamondTouch tabletop, which made it possible to develop group recommendations that could be shared virtually among up to four users. The proposals relied on a group profile that combined the members’ personal preferences.

2.6.4. The PCAHTRS System

The PCAHTRS system is a Personalized Context-Aware Hybrid Travel Recommender System proposed by R. Logesh and V. Subramaniyaswamy [31]. With this system, they tried to propose a way to achieve better personalized recommendations in the e-tourism domain. The main purpose of this model was to design a hybrid collaborative filtering travel recommender system that provides personalized tourist venues based on ratings and desires. It is shown that combining users’ implicit and explicit preferences with semantic models is the key to resolving the uncertainty issues that come up in the recommendation process. PCAHTRS relied on user contextual information and opinion mining techniques to improve prediction accuracy.

2.6.5. The Hootle System

Hootle was a group recommender system (GRS) proposed by J. O. Álvarez Márquez and J. Ziegler [32]. In this system, user preferences and needs were refined in group discussions, and users could interact with the desired features of the items. All group members could therefore accept or reject the proposed features and manage group choices according to their importance.

3. Our Proposed Implementation

Artificial intelligence is radically changing our lives and has been around for a long time. Through the COVID-19 pandemic, it has been given a new impetus, since public and private lives are now largely played out online. Any registration system primarily aims at collecting information on site visitors, not only to determine who is coming to the site, but also to facilitate informed decisions concerning the site design and content.
Marketers pay close attention to customer profile data, which are used to better understand their audience: how customers use the website, what products they like, their offline interests, and who is in their social media circles. The value of the database depends on the quality of the data it contains, and 88% of customers admit to providing incomplete or incorrect information on traditional registration forms, so the database does not contain data of the required quality. Poor data quality can result in lost sales, ineffective direct marketing, administrative costs and a loss of 10–20% of annual revenue in avoidable distribution errors [33].
Users need a platform that checks and verifies the data provided upon signing up. This will boost the profitability of the business and give consumers a sense of uniqueness through targeted advertising, discounts and recommended products on the site’s specially designed “personal” page. Generally, users who are already registered do not bother to update their profile, since they have already received access to the platform. Additionally, many users who are concerned about their personal information do not enter their real personal data online; in most cases, they give incorrect information intentionally. These fake profiles can be corrected or enriched with more data using the methods applied to unregistered users. Given the above, we created a “user profile extraction engine” called Profiler for a virtual web shop. Through this implementation, we can track users’ movements and create their profiles accordingly. Our primary goal was to create and edit a profile for e-commerce purposes.

3.1. The Database

The database stores the static data that users enter during registration, the dynamic data recorded during their navigation, and the product data. It consists of four tables: members (users), products, tracking and items bought (purchases).
The users table consists of only three elements: the username, password and an ID for each user. This ID is unique for each user and is the key that connects this table to the tracking table.
The tracking table contains data that attempt to determine whether the user is male or female, whether they have children and what their hobbies are. It also keeps a record of when they last logged in, how many times they have shopped at the store, how much money they have spent and other personal information, if any.
The product table contains information and images for each individual product, as well as information that helps the system to categorize the products and answer the queries received from the user during the shopping process.
Finally, the shopping table (items bought) contains information about the purchases made by each user. Figure 4 shows the tables and some of the elements and keys that make up the system’s database.
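The four tables described above could be sketched as follows. This is only a minimal illustration that uses Python’s built-in sqlite3 module and an in-memory database. Apart from the username, password and user ID, every column name below is an assumption made for the example and is not taken from the schema in Figure 4.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the authors' actual definitions.
SCHEMA = """
CREATE TABLE members (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL          -- a real system should store a hash
);
CREATE TABLE tracking (
    member_id     INTEGER PRIMARY KEY REFERENCES members(id),
    male_visits   INTEGER DEFAULT 0,
    female_visits INTEGER DEFAULT 0,
    child_visits  INTEGER DEFAULT 0,
    last_login    TEXT,
    purchases     INTEGER DEFAULT 0,
    total_spent   REAL DEFAULT 0
);
CREATE TABLE products (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    category TEXT NOT NULL,
    image    TEXT
);
CREATE TABLE items_bought (
    id         INTEGER PRIMARY KEY,
    member_id  INTEGER REFERENCES members(id),
    product_id INTEGER REFERENCES products(id),
    bought_at  TEXT
);
"""


def create_database():
    """Create an in-memory database containing the four tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
# List the tables that were created, in alphabetical order.
table_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

The user ID acts as the key of the tracking table as well, which mirrors the one-to-one link between a user and his/her tracked profile described in the text.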

3.2. User Tracking Technique

The process of user tracking is also the point where profiles are dynamically ‘built’. Every time a user makes a query in the database, the database displays the appropriate products and at the same time notes, by editing the user’s profile, the categories of interest.
PHP was used for server-side scripting and database communication. The dynamic editing of the profile is not visible to the ordinary user but only to the administrator of the website and cannot be edited unless the information in the database is ‘tampered with’.
We mentioned in Section 2.2.1 the ways in which it is possible to monitor profiles. In this application, the ideal way is the second one, i.e., monitoring through the user’s actions. In this way, by observing the recurring patterns of users, the system can adapt to changes in the user’s interests, likes, routines and targets. The only downside is that “building” a complete profile can take some time; if the user has not yet had enough time to establish recurring patterns, the data may appear incomplete.
More specifically, the way a profile is tracked has to do with the pages visited in the application. That is, if a user visits men’s products very often, the system will know this and will increase the number of times this user has visited men’s products. All this information is stored and tracked in our system’s databases and not in cookies for various reasons as we showed in Section 2.2.2. By observing the user for some time, the system will have enough information about him/her so that the administrator can distinguish him/her from the others. Similarly, if users are browsing and constantly searching for products or information on pages of our online store that contain items for infants or children, our system also classifies them as potential parents. Thus, our system creates a profile for each registered user, constantly updating it with information related to gender, age, and financial and family status.
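The counting idea described above can be sketched in a few lines of Python. The sketch keeps the counters in memory, whereas the article stores them in the tracking table of the database; the URL prefixes, the category names and the function record_visit are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical mapping from page sections to profile categories.
PAGE_CATEGORIES = {
    "/men": "male",
    "/women": "female",
    "/kids": "parent",
}


def record_visit(profile, page):
    """Increment the counter of the category that the visited page belongs to."""
    for prefix, category in PAGE_CATEGORIES.items():
        if page.startswith(prefix):
            profile[category] += 1
            break
    return profile


profile = Counter()
for page in ["/men/shoes", "/men/jackets", "/kids/balls", "/cart"]:
    record_visit(profile, page)
```

Pages that belong to no tracked category, such as the cart page in this example, leave the profile unchanged.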

3.3. Data Analysis and Display Technique

The final stage is to calculate and display statistics according to the preferences of each individual user. This option is only visible to the application administrator and allows the administrator to search for a user. The application, in turn, searches the database for the user and for all the data that make up the user’s profile. It then aggregates the data and displays them in a form that the administrator can understand. In the analysis, the system takes the recorded visits of the user to men’s, women’s or children’s products and their categories and converts them into percentages according to the user’s choices. The data are displayed in tables that list all the categories, so that the administrator can clearly see the demographics and interests of the user.
More specifically, as shown in Figure 5, the system administrator can see detailed information for each user, such as the username, statistical data on the user’s gender, his/her likes and much more personal information. For example, the user in this example is, based on the statistical analysis, 10% male and 90% female, so the user is probably female. There is also a prediction of whether or not this user has a child. Based on the user’s navigation and the share of traffic for each sport activity, the administrator can see, in percentages, whether he/she likes running, football, basketball, gymnastics, tennis, hiking, swimming or cycling. The system administrator also has access to additional information about each user, such as the date on which the account was created, when the user last logged in, how many times he/she has logged in to the online store since creating the account, how many times he/she has shopped in the store and how much money he/she has spent in total. The personal details of each user are also presented, for example, the city in which he/she lives, the address, e-mail address, telephone number and other address details. Additionally, the administrator can see whether there are any discount coupons in the user’s profile and a table of all the products he/she has bought in the past. The administrator thus has a complete overview of each user.
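The conversion of visit counts into percentages can be illustrated with a short Python sketch. The function name and the visit counts are assumptions chosen so that the result reproduces the 10%/90% split of the example; the actual counts behind Figure 5 are not given in the text.

```python
def category_percentages(counts):
    """Convert raw visit counts per category into percentages of all visits."""
    total = sum(counts.values())
    if total == 0:
        # No visits recorded yet: no category can be favoured.
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical counts: 1 visit to men's pages and 9 visits to women's pages.
gender_share = category_percentages({"male": 1, "female": 9})
```

With these counts, gender_share holds 10.0 for "male" and 90.0 for "female". The same function could be applied to the visit counts of the sport categories.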

4. Results and Discussion

4.1. Testing of the Application with Real Users, Analysis of the Results through Questionnaires and SPSS

As mentioned in Section 3, a profiler prototype has been designed and implemented that takes information and interprets it as logical clusters, which are capable of being interpreted by humans and other appropriate programs that will monitor them.
The application represents an online store (e-shop) of sporting goods. Users log into the system and make their purchases. As users navigate through the e-shop, the system tracks the users’ movements and records them individually. In this way, we are able to understand some preferences of each user and even some personal data, such as their age, their gender or even if they are parents.
At the end of their visit to the online shop, the users (potential buyers) are asked to fill in a questionnaire. The questionnaire contains the same questions for all users and helps us to verify and check the validity of the information and data extracted by the user analysis system.

4.2. European Data Protection Regulation

The information collected is very personal and there is a risk of violating the user’s privacy. There are legal and ethical issues regarding the surveillance of people’s private lives. The Data Protection Authority, which supervises the application of the General Data Protection Regulation (GDPR), is a constitutionally independent administrative authority. It was established by a law for the protection of individuals with regard to the processing of personal data, which incorporates a European Directive into Greek law [34]. This directive sets certain rules for the protection of personal data in all member states of the European Union. In our developed system, we respect and protect the privacy and the free development of the personality of each user, since this is a primary objective of any democratic society.
Any electronic application should maintain and establish a level of security and protection that is on a par with that of existing services, but at the same time capable of ensuring that personal data is used in a lawful and transparent manner in the interest of citizens–consumers. Due to the provision of electronic services, citizens who use them disclose personal data; thus, there is electronic collection and processing of important information about each citizen, which can be used to create an extensive profile or help unauthorized persons to access all the information. As Lopes et al. write in their article, “Data privacy has had a vast prominence in society. Several approaches are taken to realize the dream of one day. There could be a world in which there is a real state of privacy for the individual” [35].
All online applications of any institution must inspire security during transactions, as it is vital that citizens/business users have confidence in the systems used by the public. Trust is consolidated by the existence of appropriate mechanisms for user identification, security and protection of personal data. Users should be made aware of how their personal data are protected and how risks arising from malicious actions by third parties are addressed, such as in cases of hacking of personal data, unauthorized use of services, unauthorized access to data, etc.
Directly intertwined with the security of Public Websites is their reliability and their acceptance by visitors–users. They should provide satisfactory security and reliability, ensuring the following parameters:
- Integrity: which refers to ensuring that the information that is handled, published, stored and processed remains unchanged.
- Identification: which refers to establishing the user’s identity.
- Confidentiality: which refers to access to information only by those who have the appropriate authorization.
- Authentication: which refers to the specific action that ensures that the identity declared by the user actually corresponds to the user.
- Authorization: which refers to ensuring that each entity has access only to those system resources to which it has been granted access.
- Availability: which refers to the availability of information whenever an authorized user attempts to access it.
- Non-repudiation: which refers to the inability of a user to deny that he/she has performed an action related to accessing, entering and processing information.
The security of public websites consists of a complex set of guidelines and rules relating to the organization of the website operator and the hosting provider, the procedures it applies, the services it provides, the technical infrastructure at its disposal and, finally, the legal framework for the protection of personal data and the security of communications.
Unfortunately, however, the preceding analysis has shown that, from a legal point of view, there are many different issues that need to be addressed immediately and specifically. Among the most important issues are undoubtedly those relating to data security and, more specifically, the issues relating to the authentication of the identity of the communicating parties, the integrity of the data transmitted, the confidentiality of the data from possible unwanted disclosure to third parties and the non-derogability of the data.
In order for any public or private agency to process citizens’ personal data lawfully, it should, for example, have collected the data in a fair and lawful manner and for clear and defined purposes; the data should not exceed what is necessary and should be accurate and up to date. In conclusion, we must point out that if these challenges are overcome, data security and its legal framework will allow the World Wide Web to evolve into a Web with many new possibilities, which will greatly affect many of the activities of our daily lives.

4.3. Statistical Analysis of Data

For the purposes of this article, the statistical program SPSS was used to group, compare and draw conclusions about the quality and reliability of the information produced by the user analysis system.
In our sports e-shop, the adaptive profiling system that we created holds information and analyzes and makes predictions regarding the following categories:
- Hiking
- Swimming
- Running
- Cycling
- Football
- Basketball
- Gym
- Tennis
- Sex (Male or Female?)
- Parent (Is this user a parent?)
Accordingly, variables for the same categories were used for the “real” data provided to us through the questionnaires. One hundred adults from all educational levels completed the questionnaires after having made some virtual purchases in our online store. The questionnaire consists of 11 questions, and provides data about respondents from different points of view, such as sex, age, interests, parenthood, education, etc. The selection of these individuals was random. The purpose of this survey was to collect, per user, his/her personal data and his/her interests and to subsequently compare these data with those recorded and predicted by our online profiling system. The results of the survey were very encouraging and showed that our system in most cases worked extremely well. Detailed examples are presented below. More specifically, the questions they were asked to answer were:
  • Question: Which username did you use when you registered?
    This question was asked to know exactly which username he/she used when he/she created the account in our system so that we can compare our findings for that specific user.
  • Question: What is your gender?
    According to the replies to the questionnaires, 57 were males and 43 were females. Our online profiling system successfully predicted the gender for 84 of those users (47 males and 37 females). This means that the success rate of our system for the gender reached a percentage of 84%. In Table 2, the success rate of the gender prediction is presented.
  • Question: Are you a parent?
    Of the participants, 32 replied that they were parents and 68 replied that they were not. Our system predicted the correct parenthood status for 49 of those users. In Table 3, the success rate of the Parenthood prediction is presented.
  • Question: What are your interests? Choose the ones that interest you (Running, Football, Basketball, Gymnastics, Tennis, Hiking, Swimming, Cycling)
    In this question, users could pick any activities that they really like. For each of these activities and for every user, we analyzed the findings of our profiling system. It turned out that the system worked very well and made accurate predictions. The success rates for each activity are presented in the following tables.
Tables 4–11 present the success rate of the prediction for each activity: Running (Table 4), Football (Table 5), Basketball (Table 6), Gymnastics (Table 7), Tennis (Table 8), Hiking (Table 9), Swimming (Table 10) and Cycling (Table 11).
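The success rates reported in this section follow from a simple comparison between the value predicted by the profiling system and the answer given in the questionnaire. A minimal Python sketch of this calculation is given below; the function name and the five-user toy data are assumptions for illustration and are not the responses of the study.

```python
def success_rate(predicted, actual):
    """Percentage of users whose predicted value matches the questionnaire answer."""
    if len(predicted) != len(actual):
        raise ValueError("both lists must cover the same users")
    if not actual:
        raise ValueError("at least one user is required")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Toy data, not the study's responses: five users, four correct predictions.
actual = ["male", "female", "male", "male", "female"]
predicted = ["male", "female", "female", "male", "female"]
rate = success_rate(predicted, actual)
```

In the same way, the 84 correct gender predictions among the 100 respondents correspond to the reported success rate of 84%.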
The following are generic questions that we included in our questionnaire in order to analyze how concerned the users are about their online profiles. Important observations emerged from their answers.
  • Question: Would you be comfortable if you knew that an online store records your movements on it, and “creates” your shopping profile, in order to offer you in the future better services and special individual offers for your needs? e.g., to offer you a big discount on certain products that it “knows” you like?
    Of the respondents, 49% said that they would feel comfortable knowing that their movements are recorded in their online shopping profile and 37% replied that they might be. This means that about 85% of the respondents accept, at least tentatively, that their online transactions are recorded and stored in their profiles. It is very important that all this personal information be used for the right purposes. Nonetheless, it is the risk of a privacy violation that made the remaining 15% feel uncomfortable about the exposure of their online profiles.
  • Question: How much money are you willing to spend on an online store per visit?
    Of the users that replied, 32% said that they would spend more than 50 and less than 100 euro on their online purchases. Another 24% responded that they would spend more than 100 and less than 150 euro, and 23% responded that they would spend less than 50 euro. This suggests that online customers are reluctant to spend a lot of money online to buy their goods, probably because they are afraid that their personal data and their credit card details will be exposed.
  • Question: How often would you buy from an online store?
    In this case, 44% of the users replied that they often buy from online stores. Another 26% said very often, 28% not often and only 2% replied that they would never buy from an online store, which means that the majority of the people today are using the Internet to buy products.
  • Question: Do you have any concerns when shopping online?
    In response to this question, 56% said no, and 44% said yes. If the risk of users’ privacy violation is reduced, then it is likely that fewer customers will be concerned when shopping online.
  • Question: How many times have you purchased products online in the last year?
    In response to this question, 37% replied that they had made fewer than 10 purchases over the last year, 31% more than 10 and fewer than 20, and 15% responded that they had bought online more than 50 times. These numbers are expected to increase if profiling systems help users to find relevant products at better prices.
  • Question: Age in years?
    Among the users, 39% were in the 18–29 age group, 27% were between 30 and 39 years old, 16% were more than 40 and less than 50 and the rest were above 50. Younger people tend to use the Internet more often for all their transactions.
  • Question: Educational Profile?
    Of the users, 26% were high school graduates, 27% were university graduates, 16% were graduates of TEI (Technological Educational Institute), 12% possessed a master’s degree and 9% were PhD graduates. The remaining 10% possessed lower levels of education, such as high school or primary school. The majority of our users were adequately educated.

4.4. Using Neural Networks to Predict the Characteristics of Our Users

Predicting user preferences with neural networks is a new trend in e-commerce systems. With such predictions, we could check and verify the data that users provide during their registration in our systems and, where necessary, correct them. Marketers also find value in customer profile data, which they can use to interpret their buyers’ behavior: how they use the website, which goods they like, what their offline interests are, and who takes part in their social networks [36]. This would increase the revenue of the store, and individuals would receive targeted advertisements for discounts and recommended products on their personalized page, giving them a unique experience. Registered users often do not update their profiles, since they already have access to the platform and see no need to do so. In addition, many users who are concerned about their privacy do not sign up with their actual personal data; in most cases, they deliberately provide false information.
Thus, neural networks can be used to alter or even complement these fake profiles. “A neural network is a network or circuit of neurons, or in a modern sense, an artificial neural network, composed of artificial neurons or nodes. Thus a neural network is either a biological neural network, made up of biological neurons, or an artificial neural network, for solving artificial intelligence (AI) problems” [37].
“Computer scientists have long been inspired by the human brain. In 1943, Warren S. McCulloch, a neuroscientist, and Walter Pitts, a logician, developed the first conceptual model of an artificial neural network. In their paper, “A logical calculus of the ideas immanent in nervous activity”, they define the concept of a neuron as a single cell living in a network of cells that receives inputs, processes those inputs, and generates an output” [31].
“One of the key elements of a neural network is its ability to learn. A neural network is not just a complex system, but a complex adaptive system, meaning it can change its internal structure based on the information flowing through it. Typically, this is achieved through the adjusting of weights. Each connection has a weight, a number that controls the signal between two neurons. If the network generates good output, there is no need to adjust the weights. If the network generates poor output then the system adapts in order to improve subsequent results” [38].

4.4.1. Steps in Implementing a Neural Network

Training a neural network is carried out in two steps:
  • Feed forward:
In the feed-forward step, the input features are combined with a set of weights, which are initially random, to produce an estimated output. These random weights are subsequently optimized by back propagation.
  • Back propagation:
In back propagation, we compute the error between the estimated output and the target output, and then we update the weight values using an optimization algorithm (gradient descent).
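For a single neuron, the two steps can be written compactly as follows. The notation is an assumption made for illustration, since the article does not fix one: $x$ is the input vector, $w$ the weights, $b$ the bias, $y$ the target output, $\hat{y}$ the estimated output, $\eta$ the learning rate, $f$ the activation function and $L$ a squared-error loss.

```latex
\begin{align}
\hat{y} &= f(w \cdot x + b) && \text{(feed forward)} \\
L &= (y - \hat{y})^2 && \text{(error between target and estimate)} \\
w &\leftarrow w - \eta \, \frac{\partial L}{\partial w}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b} && \text{(gradient descent update)}
\end{align}
```

Repeating the three lines over the training data moves the weights from their random starting values towards values that reduce the error.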

4.4.2. Why Is Back Propagation Needed?

When designing a neural network, the model must first be trained so that a specific weight is assigned to each input. This weight determines how important the corresponding feature is to our predictions: the higher the weight, the greater the importance. Initially, however, we cannot know the specific weight required for each input. Therefore, we assign random weights to the inputs, run the model and measure the prediction error. We then revise the weight values accordingly and run the neural network again, repeating the process until the error is sufficiently small.

4.4.3. Sigmoid Function

The sigmoid function acts as the activation function during training of the neural network. Typically, we use neural networks for classification. In binary classification, there are two classes, 0 and 1. However, the raw value computed by the network can be any real number. We use the sigmoid function to solve this problem: it maps any real-valued input to an output between 0 and 1, which can then be read as leaning towards class 0 or class 1. The sigmoid function is a mathematical function with a characteristic S-shaped curve, also called the sigmoid curve.

4.4.4. Coding and Training a Neural Network

We used Python and NumPy, a popular and powerful computing library for Python, to do the math and write the required code. First of all, we pass the input data to our system. Training of the system comes next, and then we check and verify some scenarios. Based on mathematical and statistical operations, the system reports whether or not a prediction can be made. For the purposes of this article, we only tested for the parenthood of our users and their gender (male or female). Similarly, we could write code to test all the other characteristics of our users, e.g., which sports they like or what their age is.
The behavior of an artificial neural network depends on the weights and on the input/output (activation) functions assigned to its units. The output of a sigmoid unit varies continuously, but not linearly, with the input. Sigmoid units resemble real neurons more closely than linear or threshold units do, but all of these units should be considered rough approximations. The following procedure can be used to teach a three-layer network to perform a specific task:
  • Determine how closely the actual output of the network matches the desired output.
  • Change the weights of each connection so that the network produces a better approximation of the desired output.
  • Repeat these adjustments: to train a neural network to perform a specific function, it is necessary to adjust the weights of each unit so as to minimize the error between the desired output and the actual one.
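The procedure above can be sketched with NumPy for the simplest possible case, a single sigmoid neuron trained by gradient descent on a mean squared error. This is not the network used in the article, whose architecture and training data are not listed; the function train_neuron, the learning rate, the number of epochs and the toy data are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Sigmoid activation: maps any real number into the interval (0, 1)."""
    return 1 / (1 + np.exp(-x))


def train_neuron(inputs, targets, learning_rate=0.5, epochs=2000, seed=0):
    """Train one sigmoid neuron by gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=inputs.shape[1])  # random initial weights
    bias = 0.0
    losses = []
    for _ in range(epochs):
        # Feed forward: compute the actual output for every training example.
        outputs = sigmoid(inputs @ weights + bias)
        errors = outputs - targets
        losses.append(float(np.mean(errors ** 2)))
        # Back propagation: gradient of the loss with respect to the
        # pre-activation value, using f'(z) = f(z) * (1 - f(z)).
        grad_z = 2 * errors * outputs * (1 - outputs) / len(targets)
        # Gradient descent: move weights and bias against the gradient.
        weights -= learning_rate * (inputs.T @ grad_z)
        bias -= learning_rate * grad_z.sum()
    return weights, bias, losses


# Invented toy data: share of visits to women's and men's pages per user;
# target 1 stands for female and 0 for male.
inputs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
targets = np.array([1.0, 1.0, 0.0, 0.0])
weights, bias, losses = train_neuron(inputs, targets)
predictions = sigmoid(inputs @ weights + bias)
```

The list losses records the error of every cycle, so the effect of the weight adjustments can be inspected by comparing its first and last entries.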

4.4.5. Key Points of the Code and Screenshots of the Outcomes

The logistic sigmoid function, defined as $f(x) = 1/(1 + e^{-x})$, takes any real number x as input and returns an output value in the range between 0 and 1.
import numpy as np

def sigmoid(x):
    # Sigmoid activation function: f(x) = 1 / (1 + e^(-x))
    return 1 / (1 + np.exp(-x))
The derivative of the sigmoid function is:
def deriv_sigmoid(x):
    # Derivative of sigmoid: f'(x) = f(x) * (1 - f(x)); the function name is illustrative
    fx = sigmoid(x)
    return fx * (1 - fx)
After feeding the data forward through our neural network, we calculated the total loss in each training cycle and updated the weights and the biases accordingly. Once training was complete, predictions such as the following could be made:
Figure 6 shows how two users were tested for their gender, and Figure 7 shows whether any assumptions can be made regarding their parenthood.
To be more precise, for each new user that enters our system, we could, after having recorded some of his/her movements, determine whether this user is a man or a woman and whether he/she is a parent. Our system could then dynamically show pages that are relevant to that particular user, rather than the generic pages that we would show to an unknown user. In our code, we defined the following rule for interpreting the output of the trained neural network: for values less than 0.4, the user is assumed to be male; for values greater than 0.6, the user is assumed to be female; and for values in between, no prediction is made. The case of whether or not a user is a parent works similarly. The closer the output is to 0 or 1, the stronger the prediction. In the example shown in Figure 6, the prediction for user “Sofia” resulted in 0.945, which is very close to 1, so this user is almost certainly female. Similarly, for user “Akis”, the neural network returned 0.034, which is very close to 0, so this is also a strong and rather confident prediction. When the outcome is near the middle, as in Figure 7, where the parenthood value of user “Akis” is 0.485, the system cannot make a prediction from the movements and choices recorded for him/her so far. Since all this is done dynamically, we could predict in the future whether or not this user is a parent, once more data have been recorded for him/her.
To summarize this section, we presented the implementation and structure of our electronic application. We also showed how we tested the quality of its results by using concise questionnaires. Our analysis of these showed that the system works successfully in most cases and produces meaningful results for user profiling. Finally, we showed how neural networks could improve and automate the system’s user analysis processes. The answers to our questionnaires showed that about 85% of the users accept, at least tentatively, that their movements and purchases in online stores are recorded, which suggests that most users are now comfortable with it. In our question about whether they have concerns when they are shopping online, 56% said no and 44% said yes. If the risk of users’ privacy violation is reduced, then it is likely that fewer customers will be concerned, and this is something we should all strive for in our future surveys [39].
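The decision rule with the two thresholds, 0.4 and 0.6, can be expressed as a small Python function. The function name and its parameters are hypothetical. The text does not state which end of the scale stands for “parent”, so the label order used for parenthood below is an assumption.

```python
def interpret(score, low_label, high_label, low=0.4, high=0.6):
    """Map a network output between 0 and 1 to a label, or None if undecided."""
    if score < low:
        return low_label
    if score > high:
        return high_label
    return None  # value in the uncertain middle band: make no prediction


# Values taken from the examples in Figures 6 and 7.
sofia_gender = interpret(0.945, "male", "female")
akis_gender = interpret(0.034, "male", "female")
# Assumed label order for parenthood (not stated in the text).
akis_parent = interpret(0.485, "not parent", "parent")
```

Here sofia_gender is "female", akis_gender is "male", and akis_parent is None, because 0.485 lies between the two thresholds.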
Other scientists have shown in their research that user profiles can be used in e-commerce recommendation systems [9,40], for intelligent travel recommendation systems for individual and group users [41] and for inferring satisfaction in public information access services [42]. They could also be used in other platforms such as Recommendation System for E-Learning [43] and for the security of social networks [44], which have come to so dominate our lives [45,46]. User profiling could also be used to convert physical stores of smart cities into an open, geographically distributed mall by providing the logical consistency needed for conducting centralized searches over independent physical stores [47].

5. Conclusions and Future Work

In the near future, this profile may be expanded to include more features that will be used for personal purchases using new technologies such as mobile phones. After downloading the Store app on their mobile phone and entering the physical store, the user will continue to receive offer tips on the mobile screen according to their online profile [48].
As the GPS of mobile phones continually improves, it may soon be possible to determine the precise corridor that a customer is walking down. This means that we could send him/her new targeted personal discounts as he/she walks through the store’s corridors, based primarily on the nearby merchandise that matches his/her profile. iBeacon technology can already be used for this purpose. With an iBeacon network, a retailer’s app or platform can determine exactly where a customer is in a brick-and-mortar environment. The scenario would be as follows. The consumer goes to the store with a smartphone. The application installed on the user’s smartphone listens for iBeacons. When the application detects an iBeacon, it sends the relevant information to the server, which triggers the corresponding action. If the store knows the true online profile of a user and knows which aisle he/she is walking down in the store, it could promote targeted discounts on his/her mobile phone, exclusively for him/her, while he/she is standing next to a product. The user would then be more likely to buy that product, and product sales would increase. Moreover, the user would “feel” like he/she won, since he/she bought a product that he/she likes and that fits his/her profile at a better price without having to wander around the store to find it.
Furthermore, face recognition techniques could be applied to make these profiles more accurate in the future. Nowadays, most physical stores have cameras in their facilities. Through face recognition, the profiles that customers have built through their online activities could be verified [49]. Gender, age and race are some of the characteristics of the online profiles that could easily be verified during consumers’ physical shopping. By using the cameras in the physical stores, the technique proposed above could be implemented more easily and more economically, since it would not be necessary to use additional technologies such as iBeacons [50]. Using the store cameras to recognize which consumer is walking next to certain products, combined with knowledge of his/her online profile, could boost the stores’ sales. Targeted ads and offers could be delivered to each consumer individually, in real time, using automated techniques. User profiles could even be enriched with the stops consumers make next to specific products in the store, since even simple stops and checking out products would show their interests. In the future, we could thus integrate the profiles from online stores with the profiles we have as consumers in physical stores [51]. This would mean almost complete knowledge of shopping preferences and, hence, targeted results in our online searches and targeted individual advertisements and offers in the physical stores.
In this article, we have presented the main aspects of online user profiles. We showed what profiles are, how they are created and the benefits that consumers and citizens in general receive from them. For example, what is the point of doing a search on a search engine to receive exactly the same results as another user? After all, you are probably not the same, you do not have the same interests, you are not the same gender, age, weight, height, etc. So, would it not be better and more constructive to receive only information that is relevant to us? Why should you receive millions of generic results when you are looking to buy a product, and not a few hundred that are targeted and fully relevant to you?
We also showed how a user profile could be created and how to “fix” a user profile through our virtual online store. We also investigated the success rates of our own engine’s predictions by comparing them with user questionnaires. We also saw that, after knowing some things about our users, we could use neural networks to predict their next moves or wants [52]. Finally, we proposed a real-time user scenario for shopping in a physical store, in which real-time targeted “personalized” advertisements are delivered as a customer walks through the corridors.

Author Contributions

Conceptualization, K.G.G., N.D.T. and I.D.M.; Investigation, K.G.G. and N.D.T.; Methodology, K.G.G., N.D.T. and I.D.M.; Software, K.G.G.; Validation, K.G.G.; Writing—original draft, K.G.G., N.D.T. and I.D.M.; Writing—review & editing, K.G.G., N.D.T. and I.D.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable. The study does not report any data.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CATS: Collaborative Advisory Travel System
TEI: Technological Educational Institute
GRS: Group recommender system
GDPR: General Data Protection Regulation
GPS: Global Positioning System
ID: Identity document
PCAHTRS: Personalized Context-Aware Hybrid Travel Recommender System
PHP: Hypertext Preprocessor
SPSS: Statistical Package for the Social Sciences
PHD: Doctor of Philosophy

References

  1. Wagh, R.; Patil, J. Enhanced web personalization for improved browsing experience. Adv. Comput. Sci. Technol. 2017, 10, 1953–1968.
  2. Abri, S.; Abri, R.; Cetin, S. A classification on different aspects of user modelling in personalized web search. In Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, Seoul, Korea, 18–20 December 2020; pp. 194–199.
  3. Bakaev, M.A.; Pogorelova, A.O. Profiling of Website Visitors Based on Dimensions of User Experience. In Proceedings of the 2021 XV International Scientific-Technical Conference on Actual Problems Of Electronic Instrument Engineering (APEIE), Berdsk, Russia, 19–21 November 2021; pp. 1–6.
  4. Kanoje, S.; Girase, S.; Mukhopadhyay, D. User Profiling for Recommendation System. arXiv 2015, arXiv:1503.06555.
  5. Kanoje, S.; Girase, S.; Mukhopadhyay, D. User profiling trends, techniques and applications. arXiv 2015, arXiv:1503.07474.
  6. Li, J.; Zhang, X.; Wang, K.; Zheng, C.; Tong, S.; Eynard, B. A personalized requirement identifying model for design improvement based on user profiling. AI EDAM 2020, 34, 55–67.
  7. User Profile. Wikipedia, the Free Encyclopedia. 2022. Available online: https://en.wikipedia.org/wiki/User_profile (accessed on 5 April 2022).
  8. Farid, M.; Elgohary, R.; Moawad, I.; Roushdy, M. User Profiling Approaches, Modeling, and Personalization. In Proceedings of the 11th International Conference on Informatics & Systems (INFOS 2018), Cairo, Egypt, 10–12 December 2018; Available online: https://ssrn.com/abstract=3389811 (accessed on 5 April 2022).
  9. Gu, Y.; Ding, Z.; Wang, S.; Yin, D. Hierarchical user profiling for e-commerce recommender systems. In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; pp. 223–231.
  10. Hawashin, B.; Lafi, M.; Kanan, T.; Mansour, A. An efficient hybrid similarity measure based on user interests for recommender systems. Expert Syst. 2020, 37, e12471.
  11. Zhao, S.; Li, S.; Ramos, J.; Luo, Z.; Jiang, Z.; Dey, A.K.; Pan, G. User profiling from their use of smartphone applications: A survey. Pervasive Mob. Comput. 2019, 59, 101052.
  12. Kyriazanos, D.M.; Olesen, H.; Hammershøj, A.D.; Heinze, E.K.S.; Bessler, S.; Zeiss, J.; Patrikakis, C.Z.; Nikolakopoulos, G.; Amundsen, S.; Thuvesson, H.; et al. Specification of User Profile, Identity and Role Management for PNs and Integration to the PN Platform; IST Project MAGNET Beyond (My Personal Adaptive Global Net and Beyond) No. Deliverable D4.3.2 (D1.2.2) IST-027396; Aalborg Universitetsforlag: Aalborg, Denmark, 2007.
  13. Olesen, H.; Noll, J.; Hoffmann, M.; Hammershøj, A.; Sapuppo, A.; Iqbal, Z.; Elahi, N.; Chowdhury, M.; Heikkinen, S.; Sutterer, M.; et al. User Profiles, Personalization and Privacy: WWRF Outlook Series. 2009; pp. 22–23. Available online: https://vbn.aau.dk/en/publications/user-profiles-personalization-and-privacy-wwrf-outlook (accessed on 5 April 2022).
  14. Wachter, S. Normative challenges of identification in the Internet of Things: Privacy, profiling, discrimination, and the GDPR. Comput. Law Secur. Rev. 2018, 34, 436–449.
  15. Johnson, A.; Taatgen, N. User Modeling. In Handbook of Human Factors in Web Design; Lawrence Erlbaum Associates: Mahwah, NJ, USA, 2005; pp. 424–439. ISBN 9780805846119.
  16. Dhelim, S.; Aung, N.; Ning, H. Mining user interest based on personality-aware hybrid filtering in social networks. Knowl.-Based Syst. 2020, 206, 106227.
  17. Xing, L.; Song, Z.; Ma, Q. User interest model based on hybrid behaviors interest rate. Appl. Res. Comput. 2016, 3, 661–664.
  18. Peng, J.; Choo, K.K.R.; Ashman, H. User profiling in intrusion detection: A review. J. Netw. Comput. Appl. 2016, 72, 14–27.
  19. O’Neil, C.; Schutt, R. Statistical Inference, Exploratory Data Analysis, and the Data Science Process. In Doing Data Science; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2014; Chapter 2; pp. 17–50. ISBN 9781449358655.
  20. Adèr, H.J. Phases and initial steps in data analysis. In Advising on Research Methods: A Consultant’s Companion; Adèr, H.J., Mellenbergh, G.J., Hand, D.J., Eds.; Johannes van Kessel Pub.: Huizen, The Netherlands, 2008; Chapter 14; pp. 333–356. ISBN 9789079418015.
  21. User Modeling. Wikipedia, the Free Encyclopedia. 2022. Available online: https://en.wikipedia.org/wiki/User_modeling (accessed on 5 April 2022).
  22. Fischer, G. User Modeling in Human–Computer Interaction. User Modeling User-Adapt. Interact. 2001, 11, 65–86.
  23. Kostolányová, K.; Klubal, L. Use of user modeling for personalization. In AIP Conference Proceedings; AIP Publishing LLC: Melville, NY, USA, 2018; Volume 1978, p. 060017.
  24. Brusilovsky, P. Adaptive hypermedia. User Modeling User-Adapt. Interact. 2001, 11, 95–97.
  25. Abu-Naser, S.S.; Alamawi, W.W.; Alfarra, M.F. Rule Based System for Diagnosing Wireless Connection Problems Using SL5 Object. 2016. Available online: https://philpapers.org/go.pl?aid=ABURBS (accessed on 5 April 2022).
  26. Ricci, F.; Rokach, L.; Shapira, B. Introduction to Recommender Systems Handbook. In Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Kantor, P., Eds.; Springer: Boston, MA, USA, 2011.
  27. Nielsen, J. Usability Engineering; Academic Press Inc.: Cambridge, MA, USA, 1994; p. 165. ISBN 978-0-12-518406-9.
  28. Clifton, C. Encyclopedia Britannica: Definition of Data Mining. Available online: http://www.britannica.com/technology/data-mining (accessed on 5 April 2022).
  29. Ghorbani, A.; Zhang, J. GUMSAWS: A Generic User Modeling Server for Adaptive Web Systems. In Proceedings of the Fifth Annual Conference on Communication Networks and Services Research (CNSR ’07), Fredericton, NB, Canada, 14–17 May 2007; pp. 117–124.
  30. McCarthy, K.; Salamó, M.; Coyle, L.; McGinty, L.; Smyth, B.; Nixon, P. Cats: A synchronous approach to collaborative group recommendation. In Proceedings of the Florida Artificial Intelligence Research Society Conference (FLAIRS), Melbourne Beach, FL, USA, 11–13 May 2006; pp. 86–91. Available online: https://www.aaai.org/Papers/FLAIRS/2006/Flairs06-015.pdf (accessed on 5 April 2022).
  31. Logesh, R.; Subramaniyaswamy, V. Exploring hybrid recommender systems for personalized travel applications. In Cognitive Informatics and Soft Computing; Springer: Singapore, 2019; pp. 535–544.
  32. Álvarez Márquez, J.O.; Ziegler, J. Hootle+: A group recommender system supporting preference negotiation. In Cyted-Ritos International Workshop on Groupware; Springer: Cham, Switzerland, 2016; pp. 151–166.
  33. To Login or to Social Login. Available online: https://www.linkedin.com/pulse/login-social-scout-stevenson (accessed on 5 April 2022).
  34. Directive 95/46/EC. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A31995L0046 (accessed on 5 April 2022).
  35. Lopes, H.; Pires, I.M.; Sánchez San Blas, H.; García-Ovejero, R.; Leithardt, V. PriADA: Management and Adaptation of Information Based on Data Privacy in Public Environments. Computers 2020, 9, 77.
  36. Trends in Online Shopping. A Global Nielsen Consumer Report. June 2010. Available online: https://www.nielsen.com/wp-content/uploads/sites/3/2019/04/Q1-2010-GOS-Online-Shopping-Trends-June-2010.pdf (accessed on 5 April 2022).
  37. Neural Network. Available online: https://en.wikipedia.org/wiki/Neural_network (accessed on 5 April 2022).
  38. Neural Networks. Chapter 10. Available online: https://natureofcode.com/book/chapter-10-neural-networks (accessed on 5 April 2022).
  39. Gatziolis, K.G.; Boucouvalas, A.C. Discovering the impact of user profiling in e-services. In Proceedings of the 2014 International Conference on Telecommunications and Multimedia (TEMU), Heraklion, Greece, 28–30 July 2014.
  40. Chaudhuri, A.; Samanta, D.; Sarma, M. Modeling user behaviour in research paper recommendation system. arXiv 2021, arXiv:2107.07831.
  41. Logesh, R.; Subramaniyaswamy, V.; Vijayakumar, V.; Li, X. Efficient user profiling based intelligent travel recommender system for individual and group of users. Mob. Netw. Appl. 2019, 24, 1018–1033.
  42. Flores, A.M.; Pavan, M.C.; Paraboni, I. User profiling and satisfaction inference in public information access services. J. Intell. Inf. Syst. 2022, 58, 67–89.
  43. Kulkarni, T.; Kabra, M.; Shankarmani, R. User Profiling Based Recommendation System for E-Learning. In Proceedings of the 2019 IEEE 16th India Council International Conference (Indicon), Rajkot, India, 13–15 December 2019; pp. 1–4.
  44. Eke, C.I.; Norman, A.A.; Shuib, L.; Nweke, H.F. A survey of user profiling: State-of-the-art, challenges, and solutions. IEEE Access 2019, 7, 144907–144924.
  45. Mamun, M.; Al-Digeil, M.; Ahmed, S.S. Profiling Online Users: Emerging Approaches and Challenges. In Securing Social Networks in Cyberspace; CRC Press: Boca Raton, FL, USA, 2021; pp. 221–240. ISBN 9781003134527.
  46. Utami, E.; Mihuandayani, M.; Raharjo, S.; Hartanto, A.D.; Adi, S. A Review on Social Media Based Profiling Analysis. In Proceedings of the 2020 International Seminar on Application for Technology of Information and Communication (iSemantic), Semarang, Indonesia, 19–20 September 2020; pp. 442–448.
  47. Bourg, L.; Chatzidimitris, T.; Chatzigiannakis, I.; Gavalas, D.; Giannakopoulou, K.; Kasapakis, V.; Konstantopoulos, C.; Kypriadis, D.; Pantziou, G.; Zaroliagis, C. Enhancing shopping experiences in smart retailing. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 1–19.
  48. Aivalis, C.J.; Gatziolis, K.G.; Boucouvalas, A.C. Innovations in E-Systems for E-Commerce. In Innovations in E-Systems for Business and Commerce; Apple Academic Press: Palm Bay, FL, USA, 2017; pp. 301–337. ISBN 9781771885645.
  49. Jung, S.G.; An, J.; Kwak, H.; Salminen, J.; Jansen, B.J. Assessing the accuracy of four popular face recognition tools for inferring gender, age, and race. In Proceedings of the Twelfth international AAAI conference on Web and Social Media, Palo Alto, CA, USA, 25–28 June 2018.
  50. Boucouvalas, A.C.; Aivalis, C.J.; Gatziolis, K.G. Integrating retail and e-commerce using Web Analytics and intelligent sensors. In Proceedings of the International Conference on E-Business and Telecommunications, Seoul, Korea, 3–5 August 2015; Springer: Cham, Switzerland, 2015.
  51. Aivalis, C.J.; Gatziolis, K.G.; Boucouvalas, A.C. Evolving analytics for e-commerce applications: Utilizing big data and social media extensions. In Proceedings of the 2016 International Conference on Telecommunications and Multimedia (TEMU), Heraklion, Greece, 25–27 July 2016.
  52. Chen, Y.; He, J.; Wei, W.; Zhu, N.; Yu, C. A Multi-Model Approach for User Portrait. Future Internet 2021, 13, 147.
Figure 1. Recommendation system.
Figure 2. Clustering.
Figure 3. Pattern mining. Available online: https://borgelt.net/teach/fpm/ (accessed on 5 April 2022).
Figure 4. Database tables.
Figure 5. Data analysis and display technique of a user.
Figure 6. Prediction for a user. This user is female and a parent.
Figure 7. Prediction for a user. This user is male; no prediction can be made about his parenthood, so he is probably not a parent.
Table 1. A comparative table of user profile types, in relation to the researched literature.
| User Profile Type | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Explicit user profile | Direct user interaction with the system. Users manually create and fill in main data. | Data are collected quickly. Data gathered are of high quality. Usually, users enter real information when they enroll. Users have full control over the information collected. Users decide what they want to share with the system. | Users may not want to provide much data. It lacks the ability to adapt to changes and user preferences. It is highly dependent on the user’s willingness to provide the information. Users may not write true information on the forms. Users who are willing to provide true information may not know how to express their interests. |
| Implicit user profile | The system learns dynamically from observing user interactions. | User’s information can be easily and automatically updated so that the system is always aware and more accurate about their preferences. Minimal user effort is required. | It takes more time to gather valuable information about users. If there is no repetition in the user’s actions the pattern cannot be discovered. The information cannot be changed or seen by the users. |
| Hybrid user profile | Combine the previous methods and adjust the user’s profile according to their preferences. | Advantages of both techniques. | Disadvantages of both techniques. |
Table 2. Gender analysis.
| Gender | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| Male | 57 | 47 | 82% |
| Female | 43 | 37 | 86% |
| Total | 100 | 84 | 84% |
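The success rates in Table 2 can be reproduced from the two count columns. The short Python sketch below assumes that the success rate is the number of accurate predictions divided by the number of questionnaire answers in the same row, which is consistent with the reported figures. The function and variable names are illustrative and are not taken from the profiling system itself.

```python
def success_rate(accurate, actual):
    """Percentage of questionnaire answers that the profiling system predicted correctly."""
    return 100.0 * accurate / actual


# (real count from questionnaires, accurate predictions), copied from Table 2
gender = {"Male": (57, 47), "Female": (43, 37)}

rates = {label: success_rate(hit, real) for label, (real, hit) in gender.items()}

# The total row pools the counts of both classes before dividing.
total_real = sum(real for real, _ in gender.values())
total_hit = sum(hit for _, hit in gender.values())
overall = success_rate(total_hit, total_real)

for label, rate in rates.items():
    print(f"{label}: {rate:.0f}%")
print(f"Total: {overall:.0f}%")
```

Rounded to whole percentages, this gives 82% for male users, 86% for female users and 84% overall, matching the table. The same calculation applies to the parenthood and activity tables.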
Table 3. Parenthood analysis.
| Parent | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| Yes | 32 | 12 | 37.5% |
| No | 68 | 37 | 54.4% |
| Total | 100 | 49 | 49% |
Table 4. Running activity.
| Running | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 78 | 63 | 81% |
| Yes | 22 | 11 | 50% |
| Total | 100 | 74 | 74% ¹ |
¹ The success rate of our system for the running activity is 74%.
Table 5. Football activity.
| Football | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 76 | 65 | 86% |
| Yes | 24 | 8 | 33% |
| Total | 100 | 73 | 73% ¹ |
¹ The success rate of our system for the football activity is 73%.
Table 6. Basketball activity.
| Basketball | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 82 | 72 | 88% |
| Yes | 18 | 9 | 50% |
| Total | 100 | 81 | 81% ¹ |
¹ The success rate of our system for the basketball activity is 81%.
Table 7. Gymnastics activity.
| Gymnastics | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 82 | 61 | 74% |
| Yes | 18 | 7 | 39% |
| Total | 100 | 68 | 68% ¹ |
¹ The success rate of our system for the gymnastics activity is 68%.
Table 8. Tennis activity.
| Tennis | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 86 | 76 | 88% |
| Yes | 14 | 7 | 50% |
| Total | 100 | 83 | 83% ¹ |
¹ The success rate of our system for the tennis activity is 83%.
Table 9. Hiking activity.
| Hiking | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 80 | 67 | 84% |
| Yes | 20 | 4 | 20% |
| Total | 100 | 71 | 71% ¹ |
¹ The success rate of our system for the hiking activity is 71%.
Table 10. Swimming activity.
| Swimming | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 79 | 64 | 81% |
| Yes | 21 | 9 | 43% |
| Total | 100 | 73 | 73% ¹ |
¹ The success rate of our system for the swimming activity is 73%.
Table 11. Cycling activity.
| Cycling | Real Data from Questionnaires | Profiling System Accurate Predictions | Success Rate |
| --- | --- | --- | --- |
| No | 82 | 64 | 78% |
| Yes | 18 | 7 | 39% |
| Total | 100 | 71 | 71% ¹ |
¹ The success rate of our system for the cycling activity is 71%.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Gatziolis, K.G.; Tselikas, N.D.; Moscholios, I.D. Adaptive User Profiling in E-Commerce and Administration of Public Services. Future Internet 2022, 14, 144. https://doi.org/10.3390/fi14050144
