GoSS-Rec: Group-Oriented Segment Sequence Recommendation

Aguirre, Marco; Recalde, Lorena; Loza-Aguirre, Edison

doi:10.3390/info16080668


by Marco Aguirre ¹,*, Lorena Recalde ¹ and Edison Loza-Aguirre ²

¹ Department of Informatics and Computer Science, Escuela Politécnica Nacional, Quito 170525, Ecuador

² Colegio de Administración de Empresas, Universidad San Francisco de Quito, Quito 170901, Ecuador

* Author to whom correspondence should be addressed.

Information 2025, 16(8), 668; https://doi.org/10.3390/info16080668

Submission received: 2 June 2025 / Revised: 5 July 2025 / Accepted: 8 July 2025 / Published: 6 August 2025


Abstract

In recent years, the advancement of various applications, data mining, technologies, and socio-technical systems has led to the development of interactive platforms that enhance user experiences through personalization. In the sports domain, users can access training plans, routes and healthy habits, all in a personalized way thanks to sports recommender systems. These recommendation engines are fueled by rich datasets that are collected through continuous monitoring of users’ activities. However, their potential to address user profiling is limited to single users and not to the dynamics of groups of sportsmen. This paper introduces GoSS-Rec, a Group-oriented Segment Sequence Recommender System, which is designed for groups of cyclists who participate in fitness activities. The system analyzes collective preferences and activity records to provide personalized route recommendations that encourage exploration of diverse cycling paths and also enhance group activities. Our experiments show that GoSS-Rec, which is based on Prod2vec, consistently outperforms other models on diversity and novelty, regardless of the group size. This indicates the potential of our model to provide unique and customized suggestions, making GoSS-Rec a remarkable innovation in the field of sports recommender systems. It also expands the possibilities of personalized experiences beyond traditional areas.

Keywords:

data mining; group recommender systems; Prod2vec; recommender systems; sequence-aware recommender systems; sports


1. Introduction

Today’s technological era has seen the emergence of numerous apps, technologies, and interactive social networks aimed at enhancing fitness and sports experiences at all levels. Platforms like Strava, Garmin Connect, Endomondo, Fitocracy, Runtastic, Map My Ride, My Fitness Pal, and Zwift have gained significant popularity, reflecting the growing interest in fitness and the use of cutting-edge technology to boost training and social engagement [1].

Users often find motivation, support, and a sense of community by connecting with like-minded individuals on these platforms [2]. The convergence of fitness and sports platforms with digital technology, often referred to as Online Social Fitness Networks (OSFNs), not only enables the monitoring and measurement of physical activity but also fosters a lively community for users to connect and engage. Therefore, by offering personalized experiences, these tools have become valuable assets for individuals striving to lead healthier and more active lives [3].

The integration of OSFNs has given rise to sports recommendation applications, which are powerful tools for monitoring and analyzing physical activities such as running, cycling, and swimming. As a result, a wide range of “recommender”-style apps is available to athletes, including those that suggest nutritious foods [4], electronic sports playgrounds [5], training plans [6], and routes for cyclists based on estimates of their heart rate profiles [7]. In essence, these systems personalize fitness experiences by offering tailored training activities, nutrition plans, rehabilitation routines, and product promotions.

Despite the widespread use of OSFNs, their vast datasets remain underutilized in the development of sophisticated sports recommender systems (RSs), leaving significant opportunities unexplored. One such opportunity lies in group cycling, where existing RSs often fail to adequately consider the collective preferences and activity records of group members, leading to suboptimal and generic suggestions.

This research addresses the challenge of providing route recommendations for groups of cyclists engaged in fitness activities. Our proposal, GoSS-Rec, is a novel sequence-aware, group-oriented recommender system specifically designed to provide personalized suggestions that align with the preferences and dynamics of a group of cyclists. We evaluate our approach against multiple state-of-the-art recommendation algorithms using standard ranking metrics such as Normalized Discounted Cumulative Gain (NDCG), diversity, and novelty, demonstrating its effectiveness on a real-world dataset of groups of cyclists. Through this innovative design, GoSS-Rec not only encourages the discovery of new paths but also enriches the overall experience by fostering both the social and exploratory aspects of group cycling.

The remainder of this paper is structured as follows. Section 2 presents an overview of related work in the field of sports recommender systems. Section 3 introduces the formulation of the route recommendation problem and the proposed approach. Section 4 presents the dataset and experimental setup. Section 5 presents the tests and results of the assessed group recommender systems (GRS). Finally, conclusions and future lines of work are presented in Section 6.

2. Related Works

While recommender systems (RSs) have undergone extensive development in various domains, their integration into the sports field, particularly in group training, is still evolving. In fact, sports recommender systems are receiving increasing attention due to their potential to foster healthy living, improve personal well-being, and enhance sports performance. This section presents the foundations for understanding the evolution and current status of RSs, with a particular focus on their application in sports, especially those where group activities are central.

Sports Recommender Systems

As societies age and awareness of the importance of maintaining an active lifestyle grows, sports RSs have become essential tools to inspire and motivate people, helping them discover new opportunities to improve their health and enhance their physical capabilities. From promoting healthy habits to enhancing sports participation, sports-related RSs serve a variety of user needs. Their applications are diverse and range from recommending nutritious food [4] and personalized training plans [6] to suggesting innovative options such as adventure playgrounds in e-sports [5]. This range demonstrates their versatility and value in improving the quality of sports experiences. Ultimately, these systems aim to support athletes, coaches, and users in general by helping them improve and optimize their sports capabilities and experiences [8,9].

Given the ability of these systems to provide personalized and data-driven recommendations, numerous research efforts have explored specific techniques. For instance, Boratto et al. [10] applied collaborative filtering to analyze changes in user behavior and predict whether users would abandon their training plans. In related work, Santos-Gago et al. [11] propose a constraint-based recommendation approach that considers the athletes’ current physical condition and its influence on physical activities within the context of suggested training plans. In the same line of research, Roanes-Lozano et al. [12] introduce a constraint-based recommendation approach aimed at improving tennis serving techniques.

Sports recommendations that are focused on routes and destinations are featured in various types of sports-related applications. For example, Avesani et al. [13] introduce a collaborative filtering-based approach to the recommendation of ski routes. Another related example is the FitRec system [7], which employs an LSTM-based model to capture personalized and temporal patterns of fitness data, enabling workout route recommendations and short-term heart rate predictions. Following the concepts of GRS, McCarthy et al. [14] propose a method for recommending ski resorts based on critiquing. This proposal is designed for groups of friends who need to decide which resort to visit. Similarly, Wirz et al. [15] present an approach to capture and aggregate individual paragliding behaviors to obtain a collective or group behavior that is then used to recommend thermal hot-spots that favor better paragliding experiences.

However, it is worth noting that most research focuses on individual recommendations, with limited exploration of group activities in sports such as cycling. Addressing this gap, the introduction of GoSS-Rec represents a significant advancement in applying GRS to sports. Unlike existing models, GoSS-Rec integrates sequence-aware algorithms to accommodate the dynamic nature of group preferences in real-time cycling contexts. By considering interactions within the group, this system enhances the overall group experience, dynamically adjusting recommendations to reflect evolving group dynamics. This adaptability marks a notable innovation, enabling the system to provide more precise and satisfactory recommendations aligned with both current and anticipated group preferences.

3. Proposed Approach

To address the challenge of personalized route recommendations for groups of cyclists, we propose GoSS-Rec. This is a system designed to analyze collective preferences and activity history to generate recommendations that encourage the exploration of diverse cycling routes while enhancing group dynamics. In this section, we begin by presenting key preliminary definitions and detail the problem statement. Next, we describe different variants of the sequence model, which represent sequentially ordered user–item interaction records within the recommendation process. Finally, in the following sections, we detail each component of our proposed model.

3.1. System Design and Architecture

Figure 1 details the GoSS-Rec architecture and its integration with the Strava Cycling API. Cyclists record their activities on Strava using devices such as smartwatches, cycling computers, gadgets, or smartphones. These devices capture detailed ride data such as distance, speed, elevation, and GPS coordinates. This data is then synced and stored in the Strava app. GoSS-Rec is designed to consume data from the Strava API. Once connected to the API, GoSS-Rec can manage and process both historical and real-time cycling data, offering dynamic recommendations tailored to the group’s preferences.

(1) Data Extraction. This component is responsible for gathering historical and live cycling data from the Strava API.

(1.1): Activity Capture. Strava uses devices such as GPS-enabled cycling computers, smartphones, and smartwatches to track users’ rides, which generates activity records on the Strava platform.
(1.2): Data Extraction. This component employs the Strava API v3 to access detailed activity records, segments, athlete profiles, and club memberships, ensuring comprehensive data retrieval.

(2) Processing and Load. This phase utilizes advanced analytics to discern patterns and preferences from the collected data, which informs sequence predictions.

(2.1): Cleaning and Restructuring. The raw data is cleaned to remove any anomalies and restructured into a suitable format that serves as input to the training models.
(2.2): Grouping. Cyclists are grouped based on their Strava club membership, allowing the system to identify common preferences within each group.
(2.3): Sequence. The sequences of past rides are analyzed to determine popular routes and segments frequently chosen by the group.

(3) GoSS-Rec. The recommender system implements the GoSS-Rec algorithm and dynamically generates route suggestions that resonate with the group’s historical and current activity data.

(3.1): Group Sequences. The model uses the group’s past ride sequences to predict future preferences.
(3.2): Segment Recommendation. Based on the analyzed sequences, the engine recommends specific route segments.
(3.3): Route. Cyclists can view recommended routes on a map interface, including GPS-tracked routes.

Figure 1 also illustrates the flow from individual user data through the processing stages to the final group-oriented route recommendation.
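As an illustration of the grouping step (2.2), the following minimal sketch builds groups from club memberships. The `(user_id, club_id)` record layout and the function name `group_by_club` are assumptions made for this example; they are not taken from the Strava API or from the GoSS-Rec implementation.

```python
from collections import defaultdict


def group_by_club(memberships, min_size=2):
    """Map each club id to the sorted list of its members.

    `memberships` is an iterable of (user_id, club_id) pairs (hypothetical layout).
    Clubs with fewer than `min_size` members are dropped.
    """
    clubs = defaultdict(set)
    for user_id, club_id in memberships:
        clubs[club_id].add(user_id)
    return {club: sorted(users) for club, users in clubs.items() if len(users) >= min_size}


# Toy data: user 2 belongs to two clubs, club "c" has a single member.
memberships = [(1, "a"), (2, "a"), (3, "b"), (2, "b"), (4, "c")]
groups = group_by_club(memberships)
```

A user who belongs to several clubs appears in several groups, which is consistent with groups being formed per club rather than per user.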

3.2. Preliminaries and Problem Statement

In this section, we present the definitions and the problem statement, which are essential for addressing the challenge of group route recommendation. Specifically, we formally define the problem addressed by the Group-oriented Segment Sequence (GoSS-Rec) algorithm.

Definition 1.

Group of users. A group is denoted as $g_{k}$. Given a set of users $U = \{u_{1}, u_{2}, u_{3}, \dots, u_{m}\}$, each group satisfies $g_{k} \subseteq U$, where $k \geq 2$ is the group size, defined by the number of members in the group. All users in $g_{k} = \{u_{1}, u_{2}, u_{3}, \dots, u_{k}\}$ receive the same recommendations.

Definition 2.

Segment. It is denoted as s and is defined as a uniquely identified location that represents a specific part of a route used by cyclists. A sequence of segments represents a complete route, denoted as P. In other words, P is an ordered set of segments, $P = \{s_{1}, s_{2}, s_{3}, \dots, s_{p}\}$, that the user travels sequentially.

Definition 3.

Riding Segment. The data about a user traversing a segment is represented as $s_{u}^{t} = (s, t, u)$, which means that a user u rides segment s at timestamp t.

Definition 4.

Route activity. A route activity is the ordered sequence of segments visited by a given user u. It is represented by $P_{u} = \{s_{u}^{t_{1}}, s_{u}^{t_{2}}, s_{u}^{t_{3}}, \dots, s_{u}^{t_{n}}\}$, with $t_{1} < t_{n}$. The historical routes of all users, which make up the dataset employed in this study, are represented as $D = \{P^{1}_{u_{1}}, P^{2}_{u_{1}}, \dots, P^{x}_{u_{1}}, P^{1}_{u_{2}}, P^{2}_{u_{2}}, \dots, P^{x}_{u_{2}}, \dots, P^{1}_{u_{m}}, P^{2}_{u_{m}}, \dots, P^{x}_{u_{m}}\}$, where m signifies the total count of users.
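Definitions 3 and 4 can be read as a small data structure. The sketch below is one possible representation; the class name `RidingSegment`, the helper `build_route`, and the simplification that all of a user's events belong to a single activity are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RidingSegment:
    """One riding-segment record (s, t, u) as in Definition 3."""
    segment_id: int  # s: the segment that was ridden
    timestamp: int   # t: when it was ridden
    user_id: int     # u: who rode it


def build_route(events: List[RidingSegment], user_id: int) -> List[RidingSegment]:
    """Return the route P_u: the user's riding segments ordered by timestamp."""
    own = (e for e in events if e.user_id == user_id)
    return sorted(own, key=lambda e: e.timestamp)


# Toy events, deliberately out of order and mixed between two users.
events = [
    RidingSegment(30, 3, 1),
    RidingSegment(10, 1, 1),
    RidingSegment(99, 2, 2),
    RidingSegment(20, 2, 1),
]
route_u1 = build_route(events, user_id=1)
route_ids = [e.segment_id for e in route_u1]
```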

Definition 5.

User segment interests. Given the set of segments, denoted as $S = \{s_{1}, s_{2}, s_{3}, \dots, s_{p}\}$, each segment is characterized by attributes of interest observed in the segment dataset, such as $s_{1} = (\mathit{star\_count}, \mathit{effort\_count})$, where:

star_count: The count of stars or ratings given to this segment by users, which often indicates its popularity or quality.
effort_count: The total number of efforts made on this segment by all athletes.

Definition 6.

Group preferences. They are defined based on the set of group routes $D(g_{k})$. The average number of segments per route for the group, $n(g_{k})$, is shown in Equation (1).

n (g_{k}) = \frac{1}{|D (g_{k})|} \sum_{i = 1}^{k} |P_{u_{i}}|

(1)

Equation (1) gives the average number of segments in the group routes, where $|D(g_{k})|$ represents the total number of routes in the group, and $|P_{u_{i}}|$ denotes the number of segments in each route.
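The following sketch illustrates Equation (1) under the assumption that the sum runs over every route in the group's route set, which matches the textual description of an average number of segments per route. The function name and the list-of-lists layout are hypothetical.

```python
def average_segments_per_route(group_routes):
    """Average number of segments per route for one group.

    `group_routes` is a list of routes, each route being a list of segment ids
    (assumed layout). It plays the role of D(g_k) in Equation (1).
    """
    if not group_routes:
        raise ValueError("the group has no routes")
    total_segments = sum(len(route) for route in group_routes)
    return total_segments / len(group_routes)


# Three routes with 3, 2, and 4 segments: (3 + 2 + 4) / 3
group_routes = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
n_gk = average_segments_per_route(group_routes)
```

In the recommendation step this value would typically be rounded to an integer before being used as the length of the recommended sequence; that rounding is not specified in the text and is left out here.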

Problem Statement: Given the routes P of all Strava users and the group preferences $n(g_{k})$, our proposed model recommends a sequence of segments to the group, defined as $R(g_{k}) = \{s_{1}, s_{2}, s_{3}, \dots, s_{n}\}$, where that sequence forms the route $R(g_{k})$ to be performed by the group $g_{k}$. The overall problem is illustrated in Figure 2, and the notation is summarized in Table 1.

3.3. Group-Oriented Segment Sequence Algorithm

The Group-oriented Segment Sequence (GoSS-Rec) algorithm is designed to recommend a sequence of segments to a user group, considering historical routes (segment sequences) and a pre-trained recommendation model. Algorithm 1 starts by defining the input parameters, such as the historical route sequences for the user group $g_{k}$ and the recommendation model $Model$. The algorithm initializes key variables and calculates an initial recommendation $Rs_{0}$, which corresponds to the first segment to visit. Then, it iterates through the segments, generating recommendations and forming a cycling route path $Cp$ by joining the segments. The final recommended sequence of segments $R(g_{k})$ is returned, along with the final path $Cp$. This algorithm offers a formal and structured approach to leverage historical data and a recommendation model to provide a tailored sequence of segments for the user group.

Algorithm 1 Group-oriented Segment Sequence Recommendation (GoSS-Rec)

Require: $V_{U}$: historical route sequences of all users.
Require: $g_{k}$: group of users to receive the recommendation.
Require: $Model\{n, v_{u}^{t_{1}}\}$: pre-trained sequence recommendation model.
1: Define $n$ based on the historical routes of the users in $g_{k}$.
2: Initialize $Rs = \{\mathit{selected\_segments}\}$, the sequence of segments for $g_{k}$.
3: Initialize $Cp = \{\text{final route recommendation}\}$.
4: $Rs_{0} = \mathit{segments\_end\_index} = v_{u}^{t_{1}} \leftarrow$ the end_index of the first segment.
5: $Rs_{n} = Model\{n, Rs_{0}\}$, generate the sequence recommendation.
6: for $i$ from 1 to $n - 1$ do
7: Append segment $Rs_{i}$ to $Cp$
8: Join segment $Rs_{i}$ to $Rs_{i + 1}$ and append the joining path to $Cp$
9: end for
10: Return $Rs_{n}$ as the recommended sequence of segments for group $g_{k}$, and $Cp$ as the cycling route path.
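The control flow of Algorithm 1 can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: the `model` callable, the tuple encoding of path elements, and the decision to append the last segment after the loop (so that the path ends on a segment) are all assumptions.

```python
from typing import Callable, List, Tuple


def goss_rec(n: int, first_segment: int,
             model: Callable[[int, int], List[int]]) -> Tuple[List[int], List[tuple]]:
    """Sketch of Algorithm 1: recommend n segments and build a connected path.

    `model(n, first_segment)` stands in for the pre-trained sequence model and
    is expected to return a list of segment ids (hypothetical interface).
    """
    rs = model(n, first_segment)  # step 5: generate the sequence recommendation
    cp: List[tuple] = []          # step 3: the cycling route path
    for i in range(len(rs) - 1):  # steps 6 to 9
        cp.append(("segment", rs[i]))          # step 7: append the segment
        cp.append(("join", rs[i], rs[i + 1]))  # step 8: connect it to the next one
    if rs:
        cp.append(("segment", rs[-1]))  # assumption: close the path with the last segment
    return rs, cp


def toy_model(n: int, start: int) -> List[int]:
    """Placeholder model that returns n consecutive segment ids."""
    return [start + i for i in range(n)]


rs, cp = goss_rec(3, 100, toy_model)
```

In a real system the "join" elements would be replaced by the road geometry that connects the end of one segment to the start of the next; the sketch only records which pairs need to be connected.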

3.4. Sequence Model Recommendation

PopRec. It is the simplest baseline that ranks items according to their popularity judged by the number of interactions (i.e., number of associated actions, views, purchases, or interactions that the items have received).
MDP (Markov Decision Processes). MDP-based RSs can adapt to changing user preferences and evolving item catalogs by continuously updating their policies based on user interactions [16].
FPMC (Factorized Personalized Markov Chains). It is a hybrid RS that combines matrix factorization with first-order Markov chains, which capture long-term preferences and dynamic transitions, respectively [17].
Prod2vec. Recommendations are created by returning the k-nearest neighbors of the last items in the user profile, whose relevance is weighted using a simple exponential decay. That is, the last item in the user profile is the most relevant, and the first item is the least relevant [18].
KNN (K-Nearest Neighbors). The method considers the last item of a given session and then returns, as recommendations, those items that most resemble it in terms of their co-occurrence in other sessions [18].
SASRec (Self-Attentive Sequential Recommendation). It is a sequential next-item recommendation method based on the left-to-right transformer architecture. This strategy employs the multi-head self-attention mechanism to capture users’ sequential behaviors and interactions [19].
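To make the Prod2vec scoring rule concrete, the sketch below ranks candidate segments by their similarity to the items in a profile, weighting each profile item with an exponential decay so that the last item counts most. It is a simplified variant: it sums decayed cosine similarities over all candidates instead of first retrieving the k nearest neighbors of each item, and the toy two-dimensional embeddings are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profile, embeddings, top_k=2, decay=0.9):
    """Rank unseen items by decay-weighted similarity to the profile items."""
    scores = {}
    for age, item in enumerate(reversed(profile)):  # age 0 is the last item
        if item not in embeddings:
            continue
        weight = decay ** age
        for candidate, vector in embeddings.items():
            if candidate in profile:
                continue  # do not recommend items already in the profile
            similarity = cosine(embeddings[item], vector)
            scores[candidate] = scores.get(candidate, 0.0) + weight * similarity
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]


# Toy embeddings: s2 is close to s1, s4 is close to s3.
embeddings = {"s1": [1.0, 0.0], "s2": [0.9, 0.1], "s3": [0.0, 1.0], "s4": [0.1, 0.9]}
recs = recommend(["s1", "s3"], embeddings)
```

Because "s3" is the most recent item in the profile, its neighbor "s4" is ranked ahead of "s2", the neighbor of the older item "s1".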

3.5. Evaluation Metrics

To assess the performance of the proposed GoSS-Rec system, we adopt a set of well-established evaluation metrics commonly used in the recommender systems literature. These metrics aim to capture different aspects of recommendation quality, including accuracy, novelty, and diversity. Given that the system has not yet been deployed in a live environment, all evaluations are performed in an offline setting using historical interaction data. Offline evaluation is appropriate for our case, as it allows us to simulate user behavior and systematically compare the effectiveness of different models before real-world deployment [20].

This preliminary evaluation provides several advantages. In fact, it facilitates algorithm tuning, correction of potential biases, optimization for utility-enhancing factors such as novelty and diversity, and comparative testing. Moreover, without the need for real-time interaction, it reduces complexity and associated costs.

3.5.1. Sequence Metric

To measure the ranking quality of the system in a sequential recommendation context, we adopt Normalized Discounted Cumulative Gain (NDCG) at rank K (NDCG@K). This metric is widely recognized for its ability to account for both the relevance and the position of recommended items, thus favoring correct recommendations in higher ranks [21]. The metric assumes that user interest decreases logarithmically with rank, so relevant items in the middle of the list still receive some weight. Formally, NDCG@K is computed as shown in Equation (2).

NDCG @ K = \frac{DCG @ K}{IDCG @ K}

(2)

where Discounted Cumulative Gain (DCG@K, Equation (3)) accomplishes this by discounting the reward for items in lower ranks:

DCG @ K = \sum_{i \in \{i \mid \mathrm{rank}_{u} (i) \leq K\}} \frac{y_{u, i}}{\log_{2} (\mathrm{rank}_{u} (i) + 1)}

(3)

Here, $y_{u, i}$ denotes the relevance score for user u and item i, typically modeled as a binary label. The Ideal DCG or IDCG@K corresponds to the best possible ordering of items; that is, the discounted cumulative gain that would have been achieved via an optimal ranking function $\mathrm{rank}_{u}^{opt} (i)$. This metric is particularly well-suited for sequential recommendation tasks where the order of presented segments affects user satisfaction [22].
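Equations (2) and (3) translate directly into code when relevance is binary. The sketch below is a minimal illustration; the function names and the toy ranking are assumptions, and the ideal DCG is computed by placing all relevant items at the top of the list.

```python
import math


def dcg_at_k(ranked, relevant, k):
    """DCG@K with binary relevance: each hit at rank r contributes 1 / log2(r + 1)."""
    return sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )


def ndcg_at_k(ranked, relevant, k):
    """NDCG@K = DCG@K / IDCG@K, where IDCG@K places all relevant items first."""
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg_at_k(ranked, relevant, k) / idcg if idcg > 0 else 0.0


# Relevant items sit at ranks 1 and 3 of the recommended list.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
score = ndcg_at_k(ranked, relevant, k=3)
```

For this toy list, DCG@3 is 1 + 1/log2(4) = 1.5 and IDCG@3 is 1 + 1/log2(3), so the score is slightly below 1 because the second relevant item appears at rank 3 instead of rank 2.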

3.5.2. Novelty and Diversity

Beyond ranking accuracy, we consider novelty and diversity, two dimensions that contribute to long-term engagement and user satisfaction in recommender systems [23]. Novelty can be broadly defined as the distinction between current and previous experiences, while diversity refers to the internal variations within components of an experience. Introducing novelty and diversity as key elements in achieving the desired outcome implies adopting a broader viewpoint in addressing the recommendation problem. This perspective focuses on the ultimate utility of recommendations, rather than solely emphasizing a single aspect such as accuracy.

Novelty. It is typically defined as the complement of an item's popularity; in a recommendation, it reflects items that the user did not previously know. A variant defines novelty as $-\log_{2} p(i)$, which gives the self-information of an item i [24]. Compared with a simple complement of popularity, this approach places more importance on very rare or unfamiliar items. In this context, as presented in Equation (4), the novelty of a recommendation list is given by averaging the individual novelties of the items in the list [25].

Novelty (R) = \frac{\sum_{i \in R} - \log_{2} p (i)}{| R |}

(4)

where $p(i) = \frac{|\{u \in U : r_{u, i} \neq \emptyset\}|}{|U|}$ is the number of users who have voted for item i divided by the total number of users $|U|$. This formulation rewards the recommendation of less popular segments, thus promoting route exploration in the context of sport applications.
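A short sketch of Equation (4) follows. It assumes that interactions are available as a mapping from each item to the set of users who rated it, and that every recommended item has at least one interaction (otherwise the logarithm is undefined); both the layout and the function name are hypothetical.

```python
import math


def novelty(recommended, interactions, n_users):
    """Mean self-information, -log2 p(i), over the recommended items.

    `interactions[item]` is the set of users who rated `item` (assumed layout),
    and p(i) is that count divided by the total number of users.
    """
    if not recommended:
        raise ValueError("empty recommendation list")
    total = 0.0
    for item in recommended:
        raters = interactions.get(item, set())
        if not raters:
            raise ValueError(f"item {item!r} has no interactions, p(i) would be 0")
        total += -math.log2(len(raters) / n_users)
    return total / len(recommended)


# 8 users in total: s1 was rated by half of them, s2 by a single user.
interactions = {"s1": {1, 2, 3, 4}, "s2": {5}}
value = novelty(["s1", "s2"], interactions, n_users=8)
```

Here the popular segment "s1" contributes 1 bit and the rare segment "s2" contributes 3 bits, so rarely ridden segments raise the novelty of the list.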

Diversity. It is defined as the opposite of similarity. In other words, diversity captures the heterogeneity of the recommended items, measured as the average pairwise dissimilarity within the recommendation list. The most frequently considered diversity metric, and the first to be proposed in the area, is the so-called average intra-list distance, or intra-list diversity (ILD) [26]. Equation (5) measures the diversity of a recommendation list R, with $|R| > 1$, as the average distance between pairs of items in the list.

ILD = \frac{1}{| R | (| R | - 1)} \sum_{i \in R} \sum_{j \in R} d (i, j)

(5)

The function d(i, j) denotes the dissimilarity between items i and j, which can be instantiated using cosine distance, Jaccard dissimilarity, or feature-based distance measures, depending on the available item data. In the context of GoSS-Rec, diversity encourages the recommendation of distinct route segments, thus avoiding repetitive or overly similar sequences for cycling groups.
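The sketch below instantiates Equation (5) with Jaccard dissimilarity as d(i, j). The per-segment feature sets are invented for the example, and the choice of Jaccard is only one of the options named above; the text does not state which distance GoSS-Rec uses.

```python
from itertools import permutations


def jaccard_distance(a, b):
    """Jaccard dissimilarity between two sets: 1 - |a & b| / |a | b|."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def intra_list_diversity(items, features):
    """Average pairwise distance over all ordered pairs of distinct list positions."""
    n = len(items)
    if n < 2:
        raise ValueError("ILD needs at least two items")
    total = sum(
        jaccard_distance(features[i], features[j]) for i, j in permutations(items, 2)
    )
    return total / (n * (n - 1))


# Hypothetical categorical features per segment.
features = {
    "s1": {"flat", "urban"},
    "s2": {"flat", "rural"},
    "s3": {"climb", "rural"},
}
ild = intra_list_diversity(["s1", "s2", "s3"], features)
```

The pairwise distances are 2/3, 1, and 2/3, so the list diversity is 7/9. Summing over ordered pairs counts each distance twice, which the normalization by |R|(|R| - 1) accounts for.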

4. Experiments

This section describes the datasets, experimental setup, and configuration used for the model’s assessment. The evaluation was conducted using real-world cycling activity data from Strava. Moreover, we focused on route and group characteristics within the metropolitan district of Quito, Ecuador.

4.1. Dataset

We employed multiple datasets collected from Strava, including records of individual cycling activities, segment characteristics, and club-level information. The analysis covers the diverse geographic area of Quito and provides valuable contextual information for developing and evaluating group recommendations in a real-world sports environment.

4.2. Activity Records

The Activity Records dataset contains 18,441 activity sequences. Each record corresponds to a cyclist’s activity and includes four key features: an identifier (id, integer data type), a user identifier (user_id, integer data type), the start date (start_date, integer data type), and the sequence of visited segments represented as a dictionary with start_index and end_index entries. These data provide the foundation for understanding user behaviors and historical interaction patterns with route segments.

4.3. Segment Information

The Segment dataset comprises 5817 unique cycling segments. Each segment is described by numerical and geographical characteristics, such as distance traveled, average gradient, elevation metric, climb category, total elevation gain, and various engagement metrics related to the participation and recognition of athletes (effort count, participant count, and star ratings). Geographical coordinates of start and end points allow for potential spatial analyses. Visualizations, such as histograms and maps, are powerful aids in uncovering numerical and spatial trends.

Figure 3 shows the spatial distribution of segments. The visual representation highlights a significant concentration of segments within the metropolitan district of Quito (urban area) and its environs, reflecting popular cycling zones. This geographic focus adds crucial contextual information for the interpretation of the data presented. The criteria for categorizing climbs on Strava include a minimum average gradient of 3.0%, a segment distance of at least 300 m, and a calculated value derived from the climb’s length and grade exceeding 8000. Then, for example, segments in Category 0, shown in red, correspond to flat or descending segments.
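The three climb criteria listed above can be expressed as a simple predicate. The text only states that the third criterion is a value derived from the climb's length and grade; the sketch assumes this value is the distance in meters multiplied by the average grade in percent, and the function name is hypothetical.

```python
def is_categorized_climb(distance_m, avg_grade_pct):
    """Check the three criteria for a categorized climb described in the text.

    Assumption: the "calculated value" is distance (m) times average grade (%).
    """
    climb_score = distance_m * avg_grade_pct
    return avg_grade_pct >= 3.0 and distance_m >= 300 and climb_score > 8000


steep_long = is_categorized_climb(2000, 5.0)    # score 10000, all criteria met
too_short = is_categorized_climb(250, 10.0)     # shorter than 300 m
low_score = is_categorized_climb(2000, 3.5)     # score 7000, below the threshold
too_gentle = is_categorized_climb(5000, 2.0)    # gradient below 3.0%
```

Segments that fail any of these checks would fall into Category 0 under this reading.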

4.4. Clubs

The Clubs dataset comprises 1431 entries with five columns: id, sport_type, member_count, id_user, and a normalized identifier new_id. The member_count column denotes the number of members or group size, while sport_type categorizes activities into cycling, running, and triathlon. A total of 152 groups were identified based on shared club memberships. These groups are characterized as occasional groups since the members share a common objective during specific moments of engaging in sports. These groups represent gatherings driven by a common purpose, such as participating in a certain sporting activity. The diversity in their sizes and compositions underscores the dynamic nature of these cohorts, highlighting the multifaceted nature of collective engagement in sports at different points in time.

In Figure 4, the groups are visually depicted, showcasing their different sizes ranging from 2 to 23 users. This variation provides a clear overview of the organic distribution and purpose-driven formation of groups within fitness-oriented social networks.

4.5. Experimental Setup

To evaluate the performance of GoSS-Rec, we conducted an offline evaluation using information from the club dataset (Section 4.4). Groups were formed by aggregating users who belonged to the same club, with a minimum group size of two members. This approach simulates realistic social groupings based on shared fitness goals. Users overlapping multiple clubs were more likely to cluster together. A total of 152 clusters were created, with each route containing an average of 10 segments, matching the average number of segments per user activity recorded in the dataset.

We employed the Average Strategy as the aggregation approach to merge individual user ratings into a single group rating, representing collective group preferences. To conduct the experiments, historical activity sequences were divided into training and testing sets. The results demonstrate the ability of the system to generate relevant suggestions based on historical data. This offline evaluation highlights the effectiveness of the proposed models and algorithms in delivering personalized group recommendations.
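The Average Strategy can be illustrated as follows. The sketch assumes that each member's ratings are stored as an item-to-rating mapping and that an item's group rating is the mean over the members who rated it; whether unrated items should instead count as zero is not specified in the text.

```python
def average_strategy(member_ratings):
    """Merge individual ratings into one group rating per item.

    `member_ratings` maps user id -> {item: rating} (assumed layout). Each item's
    group rating is the mean of the ratings given by members who rated it.
    """
    totals, counts = {}, {}
    for ratings in member_ratings.values():
        for item, rating in ratings.items():
            totals[item] = totals.get(item, 0.0) + rating
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}


# Two members: both rated s1, only u1 rated s2.
member_ratings = {"u1": {"s1": 4, "s2": 2}, "u2": {"s1": 2}}
group_rating = average_strategy(member_ratings)
```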

Hyperparameter configurations were determined for each recommendation model based on a combination of prior literature and limited empirical tuning. For models that are sensitive to parameter choices, such as Prod2vec, we adopted standard configurations commonly used in related work [27] and applied light adjustments to better fit the characteristics of our dataset. The parameters used in the experiments are summarized as follows. PopRec does not involve any hyperparameters. For MDP, both the minimum and the maximum order are set to 1. FPMC employs 2 latent factors and is trained for 5 epochs. Prod2vec uses a minimum item frequency of 2, an embedding size of 5, a window size of 5, and an exponential decay of 0.9. KNN uses k = 10 nearest neighbors and employs Jaccard similarity. SASRec is configured with two self-attention blocks, a learning rate of 0.001, a batch size of 128, and a dropout of 0.2.
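For reference, the reported settings can be collected in a single configuration mapping. The values come from the paragraph above; the key names are illustrative and do not correspond to the argument names of any particular library.

```python
# Hyperparameters as reported in the text; key names are hypothetical.
HYPERPARAMS = {
    "PopRec": {},  # no hyperparameters
    "MDP": {"min_order": 1, "max_order": 1},
    "FPMC": {"n_factors": 2, "n_epochs": 5},
    "Prod2vec": {
        "min_item_freq": 2,
        "embedding_size": 5,
        "window_size": 5,
        "decay": 0.9,
    },
    "KNN": {"k": 10, "similarity": "jaccard"},
    "SASRec": {
        "n_blocks": 2,
        "learning_rate": 0.001,
        "batch_size": 128,
        "dropout": 0.2,
    },
}

model_names = sorted(HYPERPARAMS)
```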

We evaluate these methods using the previously defined quality measures (Section 3.5). The experiments were conducted for the following scenarios, defined by group size and by the mean number of segments per group route:

(a): Groups with 2 to 5 users, to test the performance of each recommendation method on small groups of users.
(b): Groups with 6 to 23 users, to test the performance of each recommendation method on large groups of users.
(c): All groups with an average of 10 segments per activity, representing the average number of segments per user activity in the dataset.

These scenarios were designed to investigate how group size and route complexity influence the recommendation performance of the models.
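A small helper makes the size boundaries of scenarios (a) and (b) explicit. The labels "small" and "large" and the function name are chosen for this illustration; scenario (c) simply uses all groups and needs no bucketing.

```python
def size_scenario(group_size):
    """Return the size-based scenario for a group: 2 to 5 is small, 6 to 23 is large."""
    if 2 <= group_size <= 5:
        return "small"
    if 6 <= group_size <= 23:
        return "large"
    raise ValueError(f"group size {group_size} is outside the evaluated range 2 to 23")


# Bucket a few example group sizes.
sizes = [2, 5, 6, 23]
labels = [size_scenario(size) for size in sizes]
```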

5. Results

Table 2 summarizes the performance metrics obtained from experiments on three group configurations: i. small groups, consisting of 2 to 5 users, ii. large groups, consisting of 6 to 23 users, and iii. aggregated results, which consider all groups combined. These results are based on activities with an average of 10 segments per activity (route). The most significant findings from this analysis are as follows.

As shown in Figure 5, larger groups generally achieve higher NDCG@10 scores. This trend reflects the algorithm’s improved capacity to identify shared preferences as group size increases, while smaller groups introduce greater heterogeneity, which can challenge convergence in consensus-based recommendation.
Across small, large, and all group categories, Prod2vec achieves the highest novelty scores (Figure 6). This indicates that Prod2vec effectively recommends segments that users have not previously encountered, enhancing exploratory behavior and reducing the popularity bias typical of many recommendation algorithms. Its embedding-based architecture captures fine-grained relationships between segments, which enables it to diversify recommendations beyond commonly visited paths. This is particularly valuable in sports scenarios, where maintaining engagement through the discovery of new routes is critical to long-term user satisfaction (available online: Strava Segments Prod2Vec, accessed on 9 July 2025).
SASRec and KNN achieve top performance in NDCG@10, indicating their strength in optimizing short-term ranking accuracy. In detail, for small groups, KNN leads with NDCG@10 = 0.7816. For large groups, Prod2vec outperforms SASRec (0.8375 vs. 0.7818), but SASRec maintains consistently high scores across all groups. For all groups, KNN still leads with 0.7805, followed by Prod2vec (0.7755) and SASRec (0.7246). This suggests that KNN and SASRec excel at placing relevant segments at the top of recommendation lists, making them suitable when immediate relevance and short-term engagement are primary goals. While they may not offer the diversity or novelty of Prod2vec, their ranking optimization benefits applications where accuracy and user alignment are more important than exploration.
MDP outperforms others in diversity for small and all groups, indicating the impact of probabilistic sequence modeling in capturing behavior variety. The findings report that for small groups, MDP achieves the highest diversity score of 0.2123, for all groups, MDP again leads with diversity of 0.2082, finally, for large groups, FPMC slightly outperforms MDP (0.2304 vs. 0.1839). The use of first-order Markov decision processes in MDP allows it to recommend items that are less frequently co-occurring but contextually relevant, resulting in greater intra-list variation. This makes MDP particularly effective for small or moderately diverse groups where user preferences diverge, and where static embedding methods may overfit to dominant behaviors. While its ranking accuracy is moderate, the model’s capability to foster diversity supports broader engagement and satisfaction in group settings with variable dynamics.
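
The per-metric leaders quoted above can be re-derived directly from Table 2. The following is a minimal Python sketch in which the table values are transcribed by hand; the names `MODELS`, `TABLE_2`, and `best_model` are illustrative and are not part of the authors' code.

```python
# Column order of Table 2.
MODELS = ["PopRec", "MDP", "FPMC", "Prod2vec", "KNN", "SASRec"]

# Values transcribed from Table 2: group -> metric -> scores in MODELS order.
TABLE_2 = {
    "small": {
        "NDCG@10":   [0.6324, 0.6399, 0.5838, 0.7650, 0.7816, 0.6962],
        "Novelty":   [0.6598, 1.3951, 1.1516, 2.8971, 1.3048, 0.7459],
        "Diversity": [0.0976, 0.2123, 0.1960, 0.1694, 0.1785, 0.1785],
    },
    "large": {
        "NDCG@10":   [0.7802, 0.7626, 0.7112, 0.8375, 0.7743, 0.7818],
        "Novelty":   [0.6756, 1.1405, 0.9855, 2.3056, 1.1636, 0.7014],
        "Diversity": [0.0928, 0.1839, 0.2304, 0.1491, 0.1671, 0.1671],
    },
    "all": {
        "NDCG@10":   [0.6538, 0.6576, 0.6022, 0.7755, 0.7805, 0.7246],
        "Novelty":   [0.6621, 1.3582, 1.1275, 2.8115, 1.2844, 0.7394],
        "Diversity": [0.0969, 0.2082, 0.2010, 0.1664, 0.1768, 0.1768],
    },
}


def best_model(group: str, metric: str) -> tuple[str, float]:
    """Return the (model, score) pair with the highest score for one table row."""
    scores = TABLE_2[group][metric]
    return max(zip(MODELS, scores), key=lambda pair: pair[1])


# Print the leading model for every group and metric.
for group in TABLE_2:
    for metric in TABLE_2[group]:
        model, score = best_model(group, metric)
        print(f"{group:5s} {metric:9s} -> {model} ({score:.4f})")
```

Laying the table out this way makes it easy to see that no single model leads on all three metrics, which is the trade-off the findings describe.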

Figure 5. NDCG@10 for different experiments and models.

Figure 6. Novelty and diversity for different experiments and models.

5.1. Discussion

In this study, we developed and evaluated the GoSS-Rec algorithm, which generates group-specific route recommendations for cycling. According to the experimental results, Prod2vec outperforms the alternatives on exploratory metrics, specifically novelty, for groups of different sizes. KNN, on the other hand, achieves strong NDCG performance across group sizes. This supports the distinction between models designed to optimize immediate relevance and those promoting discovery and engagement. Furthermore, the algorithm ultimately generates a GPX route where the segments are fully connected ($C p$), as illustrated in Figure 7. This connectivity ensures seamless and practical route recommendations, enhancing the group’s experience by providing ready-to-use cycling paths.
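
How connectivity is verified is not detailed in this section. As one plausible illustration, the sketch below treats a route as connected when each segment starts within a small distance of where the previous one ends. The data layout, the 50 m tolerance, and the function names are assumptions made for this example, and the coordinates are invented.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6_371_000.0  # mean Earth radius in metres
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def is_connected(route, tolerance_m=50.0):
    """True if every segment starts within `tolerance_m` of the previous segment's end."""
    return all(
        haversine_m(prev["end"], nxt["start"]) <= tolerance_m
        for prev, nxt in zip(route, route[1:])
    )


# Hypothetical segments; coordinates are illustrative only.
route = [
    {"id": "a", "start": (-0.1800, -78.4800), "end": (-0.1750, -78.4780)},
    {"id": "b", "start": (-0.1750, -78.4780), "end": (-0.1700, -78.4750)},
    {"id": "c", "start": (-0.1400, -78.4600), "end": (-0.1350, -78.4550)},
]

print(is_connected(route[:2]))  # segments a and b share an endpoint
print(is_connected(route))      # segment c starts several kilometres from the end of b
```

A check of this kind could run as a final validation step before a candidate sequence is exported as a GPX file.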

5.1.1. Novelty and Diversity in Groups

One notable finding from the results is the increased novelty and diversity observed in recommendations when the group exhibits greater variation in preferences and activity history. This effect can be attributed to several factors as explained next.

Increased Heterogeneity of Preferences. The diversity of preferences and interests among group members enhances the recommendation system’s ability to introduce novel items that cater to different tastes. Consequently, the system is more likely to suggest routes or segments that some members might not have previously considered, increasing novelty. Exploring new segments offers multiple benefits, such as diversifying route choices, enhancing the user experience by tailoring recommendations to group dynamics and past activities, and encouraging exploration and discovery of new paths. This, in turn, keeps members engaged and motivated to participate actively.
Rich Group Activities. A broader collective interaction history, covering a wide range of activities and routes, allows the recommendation algorithm to identify and suggest a more diverse set of segments. This contributes to greater variety in recommendations, enhancing the overall experience by promoting dynamic, less redundant and engaging route selection. A more extensive interaction history enables the system to capture nuanced patterns in group behavior, ensuring that suggestions remain relevant while also fostering exploration of new areas. These dynamics emphasize the importance of designing group-aware algorithms that not only maximize consensus but also support exploration and personalization at the group level.
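
To make these two beyond-accuracy notions concrete, the sketch below uses definitions that are common in the literature: novelty as the mean self-information of the recommended items, and diversity as the mean pairwise cosine distance within the list. These definitions are assumptions for illustration and may differ from the formulations used in the experiments; the segment IDs, visit counts, and feature vectors are made up.

```python
import math
from itertools import combinations


def novelty(recommended, visit_counts, n_users):
    """Mean self-information -log2(p) of the recommended items,
    where p is the fraction of users who have visited the item."""
    total = sum(-math.log2(visit_counts[s] / n_users) for s in recommended)
    return total / len(recommended)


def cosine_distance(a, b):
    """One minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def intra_list_diversity(recommended, features):
    """Mean pairwise cosine distance between the recommended items."""
    pairs = list(combinations(recommended, 2))
    return sum(cosine_distance(features[a], features[b]) for a, b in pairs) / len(pairs)


# Hypothetical data: how many of 100 users rode each segment, plus toy feature vectors.
visit_counts = {"s1": 80, "s2": 50, "s3": 5, "s4": 2}
features = {"s1": [1.0, 0.0], "s2": [0.9, 0.1], "s3": [0.0, 1.0], "s4": [0.5, 0.5]}

popular_list = ["s1", "s2"]      # frequently visited, similar segments
exploratory_list = ["s3", "s4"]  # rarely visited, dissimilar segments

print(novelty(popular_list, visit_counts, 100), novelty(exploratory_list, visit_counts, 100))
print(intra_list_diversity(popular_list, features), intra_list_diversity(exploratory_list, features))
```

Under these definitions, a list built from rarely visited and mutually dissimilar segments scores higher on both measures, which matches the intuition that heterogeneous groups with rich histories give the recommender more room to explore.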

5.1.2. Contributions of GoSS-Rec Algorithm on Sports Recommender Systems

The GoSS-Rec algorithm has the potential to make significant advancements in the field of intelligent sports recommender systems, particularly in the context of destination and route recommendations. By leveraging group-oriented recommendations, it enhances the experience of cyclist groups by offering novel and diverse route suggestions, as illustrated in Figure 6. These recommendations not only encourage the exploration of new routes and increase user engagement but also contribute to improved performance and overall well-being. Additionally, they enhance the overall user experience by fostering a sense of community and shared adventure among group members. GoSS-Rec effectively integrates sequence-awareness, group dynamics, and segment-level semantics to deliver personalized, diverse, and executable route recommendations.

The findings underscore the potential of the GoSS-Rec algorithm to transcend conventional sports scenarios, where physical presence is typically necessary, by effectively generating personalized recommendations for groups of users. This capability is particularly relevant in today’s digital age, given that the methodology supports emerging trends in remote fitness and social engagement. This provides a foundation for future systems that accommodate group interaction, diversity, and shared objectives in both synchronous and asynchronous contexts.

Additionally, the principles and methodologies developed in this project can be extended to other sports and recreational activities, broadening the scope and impact of group-oriented recommender systems. These results demonstrate that GoSS-Rec, empowered by models like Prod2vec, can significantly enhance the quality of group recommendations in fitness applications, ultimately fostering user satisfaction, adherence to routines, and motivation through socially aware and exploratory routes.

6. Conclusions and Future Work

This study addresses the problem of recommending cycling routes to groups of athletes engaged in fitness activities, a critical yet underexplored dimension in the domain of Online Social Fitness Networks (OSFNs). GoSS-Rec, a group-oriented sequence-aware recommender system, represents a significant step forward in enhancing the user experience for these athletes, as it offers group-level personalized route recommendations that encourage the discovery of new cycling paths.

The development of GoSS-Rec was grounded in a comprehensive literature review, which was conducted to explore the intricacies of group recommender systems (GRSs) and sequence-aware recommender systems within the context of route recommendations. This theoretical foundation informed the design of an experimental framework, supported by the extraction and preprocessing of real-world cycling data collected via the Strava API. Beyond the extraction of relevant information, this process laid the foundation for user profiling and contextual modeling of route segments, which are essential components for producing personalized group-based recommendations that address the diverse preferences of cycling communities. GoSS-Rec introduces a novel contribution to intelligent recommender systems in the sports domain, offering a foundation for the development of socially aware, health-promoting technologies that enhance group activity planning and personalized engagement.

To benchmark the effectiveness of GoSS-Rec, we compared six recommendation models across small, large, and all-group categories using a multidimensional evaluation framework based on NDCG@10, novelty, and diversity (Table 2). Prod2Vec, when integrated into GoSS-Rec, consistently outperformed all other models in terms of novelty, achieving scores of 2.8971 for small groups, 2.3056 for large groups, and 2.8115 across all groups. Relative to MDP, these correspond to improvements of approximately +107.7%, +102.2%, and +107.0%. MDP is the second-best model for small groups and for all groups; for large groups, KNN (1.1636) edges out MDP (1.1405), so the gain over the actual runner-up there is approximately +98.1%. These margins confirm GoSS-Rec’s strong ability to recommend previously unexplored segments and foster route discovery. In contrast, KNN and SASRec demonstrated competitive performance in ranking quality, as measured by NDCG@10. Specifically, KNN obtained the highest score for small groups (0.7816), outperforming Prod2Vec (0.7650) by +2.2%, while Prod2Vec led for large groups (0.8375), exceeding SASRec (0.7818) by +7.1%. For all groups combined, KNN remained in the lead with 0.7805, followed by Prod2Vec (0.7755) and SASRec (0.7246). These results highlight the trade-offs between novelty and ranking precision in group-based recommendation: while Prod2Vec is more effective in enhancing novelty and exploration, KNN and SASRec offer advantages in short-term relevance and accuracy, making them suitable for different recommendation objectives.
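
The percentage gains quoted above are relative improvements, computed as (score − reference) / reference × 100. As a quick check, the sketch below recomputes them from the Table 2 values; the helper name `relative_gain` is illustrative. Small last-digit differences from quoted figures can arise from rounding.

```python
def relative_gain(score: float, reference: float) -> float:
    """Relative improvement of `score` over `reference`, in percent."""
    return (score - reference) / reference * 100.0


# Novelty: (Prod2vec, MDP) pairs taken from Table 2.
novelty = {
    "small": (2.8971, 1.3951),
    "large": (2.3056, 1.1405),
    "all":   (2.8115, 1.3582),
}
for group, (prod2vec, mdp) in novelty.items():
    print(f"novelty, {group}: {relative_gain(prod2vec, mdp):+.1f}%")

# Novelty for large groups against KNN, which scores slightly above MDP there.
print(f"novelty, large vs. KNN: {relative_gain(2.3056, 1.1636):+.1f}%")

# NDCG@10: KNN vs. Prod2vec (small groups) and Prod2vec vs. SASRec (large groups).
print(f"NDCG@10, small: {relative_gain(0.7816, 0.7650):+.1f}%")
print(f"NDCG@10, large: {relative_gain(0.8375, 0.7818):+.1f}%")
```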

The study contributes a scalable and deployable architecture for intelligent group routing, capable of producing coherent and connected GPX routes tailored to collective cycling behavior. Sequence-aware modeling and group dynamics in GoSS-Rec address both shared preference alignment and the need for route novelty, which are essential factors for enhancing user engagement and sustained participation in fitness activities. The successful execution of our approach underscores the project’s significance as a noteworthy contribution to the field.

Looking ahead, the deployment of a full-featured application integrated with platforms like Strava represents a natural next step. Such an application would enable live group planning, real-time feedback loops, and the longitudinal tracking of behavioral responses to recommendations. These capabilities open avenues for studying user engagement, satisfaction, and adoption patterns, ultimately validating the impact of group-oriented recommendation models in real-world settings.

Furthermore, the principles underlying GoSS-Rec are not limited to cycling. The core methodology is generalizable to other collaborative sports and recreational domains, such as hiking, running, or even virtual training sessions. Future work will explore cross-sport adaptation, context-aware enhancements (e.g., weather, terrain), and explainable recommendations, enriching the transparency and adaptability of intelligent fitness platforms. In addition, the concept of learning from a group’s historical behavior to guide future decisions extends beyond the recreational domain. For instance, Ref. [28] analyzed industrial accident records across EU countries and demonstrated that modeling group-level patterns can help prevent future incidents and improve overall system safety. This supports the broad relevance of group-oriented models. Looking ahead, systems like GoSS-Rec may also be applied in more critical scenarios, such as coordinated emergency response or robot-assisted logistics. As Morgan and Grabowski [29] showed, human-machine collaboration in mobile group systems can enhance both safety and efficiency. These examples suggest promising directions for adapting GoSS-Rec to other high-stakes group contexts.

As part of future work, we plan to address the challenges posed by cold-start scenarios in GoSS-Rec, particularly given the long-tailed nature of behavioral data, which can negatively impact the performance of models such as Prod2Vec. One promising direction is to adopt hybrid approaches that combine behavioral embeddings with content-based representations, such as those obtained from textual metadata using models like Sentence-BERT. In this context, we acknowledge the relevance of The Embeddings That Came in From the Cold [30], which presents an elegant and scalable method for enriching embeddings for rare or new items without requiring major changes to the underlying training pipeline. Integrating such techniques may offer a practical improvement in the robustness and generalization capability of our system.
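
As a rough illustration of the hybrid idea, the sketch below assigns a new segment a behavioral-style vector by averaging the behavioral embeddings of its nearest neighbours in content space. This is a simple stand-in chosen for illustration; it is not the method of [30] and is not part of GoSS-Rec. All vectors, segment IDs, and the function name `infer_cold_embedding` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def infer_cold_embedding(cold_content, warm_content, warm_behavioral, k=2):
    """Average the behavioral embeddings of the k warm items whose
    content vectors are most similar to the cold item's content vector."""
    nearest = sorted(
        warm_content,
        key=lambda item: cosine(cold_content, warm_content[item]),
        reverse=True,
    )[:k]
    dim = len(next(iter(warm_behavioral.values())))
    return [sum(warm_behavioral[i][d] for i in nearest) / len(nearest) for d in range(dim)]


# Hypothetical content vectors (e.g., from a text encoder) and behavioral embeddings.
warm_content = {"s1": [1.0, 0.0, 0.0], "s2": [0.9, 0.1, 0.0], "s3": [0.0, 0.0, 1.0]}
warm_behavioral = {"s1": [0.2, 0.8], "s2": [0.4, 0.6], "s3": [-0.9, 0.1]}

new_segment_content = [0.95, 0.05, 0.0]  # a segment with no ride history yet
print(infer_cold_embedding(new_segment_content, warm_content, warm_behavioral))
```

Because the inferred vector lives in the same space as the behavioral embeddings, a new segment could be scored by the existing model without retraining, which is the practical appeal of this family of techniques.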

Author Contributions

Conceptualization, M.A., L.R. and E.L.-A.; methodology, L.R. and E.L.-A.; software, M.A.; validation, L.R. and E.L.-A.; formal analysis, M.A. and L.R.; investigation, M.A., L.R. and E.L.-A.; resources, M.A.; data curation, M.A.; writing—original draft preparation, M.A.; writing—review and editing, M.A., L.R. and E.L.-A.; visualization, M.A.; supervision, L.R. and E.L.-A.; project administration, L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

OSFNs	Online Social Fitness Networks
RSs	Recommender Systems
GRSs	Group Recommender Systems
NDCG	Normalized Discounted Cumulative Gain

References

1. Rivers, D.J. Strava as a discursive field of practice: Technological affordances and mediated cycling motivations. Discourse Context Media 2020, 34, 100345.
2. Lupton, D.; Pink, S.; Heyes LaBond, C.; Sumartojo, S. Digital traces in context: Personal data contexts, data sense, and self-tracking cycling. Int. J. Commun. 2018, 12, 647–666.
3. Carter, S.; Green, J.; Speed, E. Digital technologies and the biomedicalisation of everyday activities: The case of walking and cycling. Sociol. Compass 2018, 12, e12572.
4. Alcaraz-Herrera, H.; Cartlidge, J.; Toumpakari, Z.; Western, M.; Palomares, I. EvoRecSys: Evolutionary framework for health and well-being recommender systems. User Model. User Adapt. Interact. 2022, 32, 883–921.
5. Wu, M.; Kolen, J.; Aghdaie, N.; Zaman, K.A. Recommendation applications and systems at electronic arts. In Proceedings of the Eleventh ACM Conference on Recommender Systems, Como, Italy, 27–31 August 2017; p. 338.
6. Smyth, B.; Lawlor, A.; Berndsen, J.; Feely, C. Recommendations for marathon runners: On the application of recommender systems and machine learning to support recreational marathon runners. User Model. User Adapt. Interact. 2022, 32, 787–838.
7. Ni, J.; Muhlstein, L.; McAuley, J. Modeling Heart Rate and Activity Data for Personalized Fitness Recommendation. In Proceedings of The World Wide Web Conference (WWW ’19), San Francisco, CA, USA, 13–17 May 2019; pp. 1343–1353.
8. Ivanova, I.; Wald, M. Recommender Systems for Outdoor Adventure Tourism Sports: Hiking, Running and Climbing. Hum. Centric Intell. Syst. 2023, 3, 344–365.
9. Felfernig, A.; Wundara, M.; Tran, T.N.T.; Le, V.M.; Lubos, S.; Polat-Erdeniz, S. Sports recommender systems: Overview and research directions. J. Intell. Inf. Syst. 2024, 62, 1125–1164.
10. Pilloni, P.; Piras, L.; Boratto, L.; Carta, S.; Fenu, G.; Mulas, F. Recommendation in persuasive eHealth systems: An effective strategy to spot users’ losing motivation to exercise. In Proceedings of the CEUR Workshop Proceedings, Como, Italy, 27–31 August 2017; Volume 1953, pp. 6–9.
11. Sanchez, F.; Alduan, M.; Alvarez, F.; Menéndez, J.M.; Baez, O. Recommender system for sport videos based on user audiovisual consumption. IEEE Trans. Multimed. 2012, 14, 1546–1557.
12. Roanes-Lozano, E.; Casella, E.A.; Sánchez, F.; Hernando, A. Diagnosis in tennis serving technique. Algorithms 2020, 13, 106.
13. Avesani, P.; Massa, P.; Tiella, R. A trust-enhanced recommender system application: Moleskiing. In Proceedings of the 2005 ACM Symposium on Applied Computing, Santa Fe, NM, USA, 13–17 March 2005; pp. 1589–1593.
14. McCarthy, K.; Salamó, M.; Coyle, L.; McGinty, L.; Smyth, B.; Nixon, P. Group recommender systems: A critiquing based approach. In Proceedings of the 11th International Conference on Intelligent User Interfaces, Sydney, Australia, 29 January–1 February 2006; pp. 267–269.
15. Wirz, M.; Strohrmann, C.; Patscheider, R.; Hilti, F.; Gahr, B.; Hess, F.; Roggen, D.; Tröster, G. Real-time detection and recommendation of thermal spots by sensing collective behaviors in paragliding. In Proceedings of the 1st International Symposium on from Digital Footprints to Social and Community Intelligence, Beijing, China, 18 September 2011; pp. 7–12.
16. Shani, G.; Heckerman, D.; Brafman, R.I.; Boutilier, C. An MDP-based recommender system. J. Mach. Learn. Res. 2005, 6, 1265–1295.
17. Rendle, S.; Freudenthaler, C.; Schmidt-Thieme, L. Factorizing Personalized Markov Chains for Next-Basket Recommendation. In Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010; pp. 811–820.
18. Quadrana, M.; Jannach, D.; Cremonesi, P. Tutorial: Sequence-Aware Recommender Systems. In Companion Proceedings of The 2019 World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; Association for Computing Machinery: New York, NY, USA, 2019; p. 1316.
19. Kang, W.C.; McAuley, J. Self-attentive sequential recommendation. In Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), Singapore, 17–20 November 2018; pp. 197–206.
20. Gunawardana, A.; Shani, G.; Yogev, S. Evaluating Recommender Systems. In Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Eds.; Springer: New York, NY, USA, 2022; pp. 547–601.
21. Sun, F.; Liu, J.; Wu, J.; Pei, C.; Lin, X.; Ou, W.; Jiang, P. BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 1441–1450.
22. McAuley, J. Personalized Machine Learning; Cambridge University Press: Cambridge, UK, 2022.
23. Castells, P.; Hurley, N.; Vargas, S. Novelty and Diversity in Recommender Systems. In Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Eds.; Springer: New York, NY, USA, 2022; pp. 603–646.
24. Zhou, T.; Kuscsik, Z.; Liu, J.G.; Medo, M.; Wakeling, J.R.; Zhang, Y.C. Solving the apparent diversity-accuracy dilemma of recommender systems. Proc. Natl. Acad. Sci. USA 2010, 107, 4511–4515.
25. Kaminskas, M.; Bridge, D. Diversity, Serendipity, Novelty, and Coverage: A Survey and Empirical Analysis of Beyond-Accuracy Objectives in Recommender Systems. ACM Trans. Interact. Intell. Syst. 2016, 7, 1–42.
26. Smyth, B.; McClave, P. Similarity vs. Diversity. In Case-Based Reasoning; Springer: Berlin/Heidelberg, Germany, 2001.
27. Grbovic, M.; Radosavljevic, V.; Djuric, N.; Bhamidipati, N.; Savla, J.; Bhagwan, V.; Sharp, D. E-commerce in your inbox: Product recommendations at scale. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia, 10–13 August 2015; pp. 1809–1818.
28. Katarina, H.; Samuel, K. Statistical survey on the prevention of major industrial accidents (MIAs) in the EU Member States in 2000–2020. J. Saf. Sustain. 2025, 2, 72–80.
29. Morgan, G.; Grabowski, M.R. Human Machine Teaming in Mobile Miniaturized Aviation Logistics Systems in Safety-Critical Settings. J. Saf. Sustain. 2025, 2, 22–31.
30. Tagliabue, J.; Yu, B.; Bianchi, F. The Embeddings That Came in From the Cold: Improving Vectors for New and Rare Products with Content-Based Inference. In Proceedings of the 14th ACM Conference on Recommender Systems, Virtual Event, Brazil, 22–26 September 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 577–578.

Figure 1. Conceptual definition of the group recommender system GoSS-Rec.

Figure 2. Conceptual group recommender system GoSS-Rec.

Figure 3. Segment climb categories.

Figure 4. Group size distribution.

Figure 7. Resulting GoSS-Rec algorithm.

Table 1. Notation.

Notation	Description
u	user
$g_{k}$	group of users
s	segment
t	time stamp
n	number of segments in the route
D	dataset of routes
U	dataset of users
S	dataset of segments
P	sequence of segments in route
$R (g_{k})$	list of recommended segments for the group (route)
$C p$	final route recommendation
$D (g_{k})$	set of group routes

Table 2. Results of recommendation to groups.

Group	Metric	PopRec	MDP	FPMC	Prod2vec	KNN	SASRec
Small groups	NDCG@10	0.6324	0.6399	0.5838	0.7650	0.7816	0.6962
Small groups	Novelty	0.6598	1.3951	1.1516	2.8971	1.3048	0.7459
Small groups	Diversity	0.0976	0.2123	0.1960	0.1694	0.1785	0.1785
Large groups	NDCG@10	0.7802	0.7626	0.7112	0.8375	0.7743	0.7818
Large groups	Novelty	0.6756	1.1405	0.9855	2.3056	1.1636	0.7014
Large groups	Diversity	0.0928	0.1839	0.2304	0.1491	0.1671	0.1671
All groups	NDCG@10	0.6538	0.6576	0.6022	0.7755	0.7805	0.7246
All groups	Novelty	0.6621	1.3582	1.1275	2.8115	1.2844	0.7394
All groups	Diversity	0.0969	0.2082	0.2010	0.1664	0.1768	0.1768

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

