A Study on the Optimisation of Tennis Players’ Match Strategies from the Perspective of Momentum

Wu, Shiqi; Diao, Mingguang; Wang, Jingwen; Song, Zihan; Zhang, Chuyan

doi:10.3390/app15105624


by Shiqi Wu, Mingguang Diao *, Jingwen Wang, Zihan Song and Chuyan Zhang *

School of Information Engineering, China University of Geosciences Beijing, Beijing 100083, China

* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(10), 5624; https://doi.org/10.3390/app15105624

Submission received: 15 April 2025 / Revised: 12 May 2025 / Accepted: 16 May 2025 / Published: 18 May 2025

(This article belongs to the Collection Computer Science in Sport)


Abstract

In tennis matches, “momentum” describes the situation where players are inspired by positive factors during the game. This paper focuses on the quantification of momentum, the impact of momentum on match trends, and the optimisation of players’ match strategies based on momentum. A Markov chain is used to quantify momentum and obtain a “momentum score” for each player. The Eta and Spearman correlation coefficients are used to study how momentum affects match trends. The results show that, at the 95% confidence level, momentum has a strong correlation with the result of the player’s current game, but a weak correlation with the result of each current point. In addition, we construct a match strategy optimisation model for players based on momentum scores. On the one hand, the entropy weight method and TOPSIS are combined to evaluate players’ performances. On the other hand, a BP neural network prediction model is established based on a multiple linear regression model with 13 indicators in 10 categories to predict match trends. According to the evaluation and prediction results, a series of strategy optimisation suggestions are put forward for players to cope with matches.

Keywords:

momentum; Markov chain; Eta correlation; multiple linear regression; BP neural network; match strategy optimisation

1. Introduction

In the context of tennis matches, the momentum of players has been shown to exert a substantial influence on the outcomes of these contests. This factor has been demonstrated to play a pivotal role in the shaping of players’ individual performances [1]. The influence of momentum is evident not only in the strategic decisions of players [2,3], but also in its substantial impact on their psychological state [4]. Specifically, the psychological benefits of momentum include enhanced confidence, heightened competitive motivation, and improved concentration, thereby exerting psychological pressure on opponents. From a match situation perspective, momentum has the potential to enhance the accuracy of pre-match prediction models. The gain or loss of a player’s previous point has been shown to significantly affect their mentality, tactical choices, and subsequent shot performance, thus influencing the result of the next point. The integration of player momentum with conventional statistical methodologies has been demonstrated to enhance the predictive capacity of tennis match analysis [5]. Furthermore, the serving side, break points, and winning streaks play a pivotal role in the fluctuations in momentum. The serving side has been shown to exert significant influence over the match tempo through the implementation of high-quality serves and in-game dominance [6]. The break-point moment is characterised by a significant shift in the match situation [7]. Finally, winning streaks have been shown to provide a psychological advantage through consecutive scoring, which is critical for changing momentum [8]. Consequently, a system for quantifying momentum can be developed based on the dynamic characteristics of tennis matches, serving situations, break points, and winning streaks.

This paper proposes an accurate and practical tennis match analysis model based on momentum, with a structure kept simple enough to run efficiently and to remain interpretable. Achieving this requires addressing three issues. First, players’ momentum during the match must be quantified; a unified paradigm for momentum quantification would make research results easier to compare and integrate. Second, the correlation between momentum and match trends must be determined. Existing studies have indicated an association between players’ momentum and match trends, but they lack systematic and comprehensive analysis [9]. Third, momentum must be used to propose optimisation recommendations for players’ match strategies: analysing players’ performance at different stages and predicting match trends makes it possible to give targeted suggestions based on each player’s specific situation.

In response to the aforementioned issues, this paper proposes a tennis match analysis system that is based on momentum, as illustrated in Figure 1, from the aspects of data processing, quantification of tennis players’ momentum, correlation analysis, construction of strategy optimisation models, and model testing, based on the data from the men’s singles matches at Wimbledon in 2023. The main research contents are as follows:

Development of a quantification system for tennis players’ momentum. The data are first cleaned to ensure accuracy and completeness. The Markov chain [10] is then introduced, using its state-transition characteristics to capture dynamic changes in the match state. Multi-dimensional indicators that are highly relevant to the match are selected to build a dynamic momentum quantification system that describes the changes in players’ momentum during the match;
Analysis of the mechanism by which players’ momentum influences match trends. The research data are first classified as continuous or categorical variables, and the Eta correlation coefficient [11] is used to analyse their correlation. The Spearman correlation coefficient [12] is then used to verify the correlation from a different angle and so check the robustness of the results;
Construction of a momentum-based player match strategy optimisation model. The entropy weight method [13] assigns weights to the quantified momentum and other indicators according to each indicator’s contribution to the player’s performance. The TOPSIS method [13] then calculates the player’s final performance score. In parallel, a multiple linear regression model [14] is used to identify the key indicators affecting match trends, and a BP neural network prediction model [15] is built on these indicators. Match trends are predicted through non-linear fitting, and the accuracy and stability of the model are evaluated. Finally, targeted suggestions are derived from these results to help players optimise their strategies and improve their performance.

2. Construction of the Tennis Players’ Momentum Quantification System

The player momentum quantification model is built on the actual structure of tennis matches, which are divided into three units: game, set, and match. A game is won by the first player to win at least four points with a margin of at least two points. A set consists of multiple games and ends when a player wins six games (or more). A match is decided as the best of three or five sets; men’s singles matches at Grand Slam tournaments such as Wimbledon are played as the best of five sets [16].

2.1. Data Feature Analysis

The original dataset is derived from the men’s singles matches at Wimbledon in 2023, subsequent to the second round [17]. The dataset is substantial in size, comprising 7284 rows and 46 columns. The data types are diverse and can be categorised into three groups: basic match information, player performance statistics, and match event details. The basic match information provides a macro-level view of the match’s progression, while the player performance statistics comprehensively reflect the players’ skill levels and technical characteristics. Furthermore, the match event details are useful for analysing key moments and players’ tactical choices (see Table 1).

A comparative analysis of the data from each match revealed the presence of anomalous time values in match No. 1303. Furthermore, there are numerous missing values for serving speed and number of strokes in some matches, such as matches No. 1310 and 1311. In order to mitigate the impact of missing values and outliers, this paper selects only the matches with complete data and no abnormalities for further research and analysis.

Furthermore, the scoring data of each game were converted to a common scale. The scores are standardised as the number of points won in each game: 15 is recorded as 1, 30 as 2, 40 as 3, and AD as 4.
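The standardisation above can be sketched as a simple lookup. This is an illustrative sketch rather than the authors’ code: the function name `standardise_score` and the mapping of a score of 0 to 0 points are assumptions, since the text only lists 15, 30, 40, and AD.

```python
# Score labels mapped to points won in the current game.
# The entry for "0" is an assumption; the text lists only 15, 30, 40 and AD.
SCORE_TO_POINTS = {"0": 0, "15": 1, "30": 2, "40": 3, "AD": 4}


def standardise_score(label):
    """Return the number of points won for a raw score label (hypothetical helper)."""
    return SCORE_TO_POINTS[str(label).strip().upper()]


print([standardise_score(s) for s in ["15", "30", "40", "AD"]])
```

Unknown labels raise a `KeyError`, which makes unexpected values in the raw data visible instead of silently passing them through.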

2.2. Construction of the Dynamic Momentum Quantification System

Because players differ in psychological attributes and personal ability, the transition probability matrix of the Markov chain is employed to dynamically quantify the momentum of tennis players. In the Markov chain model, the probability of the future state depends solely on the current state and is independent of past states. Formula (1) is as follows:

P(X_{n+1} = j \mid X_{n} = i, X_{n-1} = i_{n-1}, \dots, X_{0} = i_{0}) = P(X_{n+1} = j \mid X_{n} = i) = P_{ij},

(1)

where X_{n+1} represents the future state, i.e., the win or loss of the next point in a tennis match, X_{n} represents the current state, i.e., the win or loss of the current point, and P_{ij} is the transition probability, which satisfies \sum_{j=1}^{m} P_{ij} = 1 for all i.

For match No. 1701, the transition probabilities were estimated for both players. For player 1, the probability of winning the current point after losing the previous point (P_{01}) is 0.5060, and the probability of winning the current point after winning the previous point (P_{11}) is 0.5030. For player 2, the corresponding probabilities are P_{01} = 0.4970 and P_{11} = 0.4940.
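The text does not spell out how these probabilities were computed. One common approach, assumed here, is to count transitions between consecutive points and normalise each row. The function name `transition_matrix` and the short input sequence are hypothetical and are not taken from the match data.

```python
def transition_matrix(outcomes):
    """Estimate P[i][j] = P(next point = j | current point = i).

    outcomes: list of 1 (point won) or 0 (point lost) for one player.
    """
    counts = [[0, 0], [0, 0]]
    for prev, nxt in zip(outcomes, outcomes[1:]):
        counts[prev][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state never observed as a predecessor keeps a row of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


points = [1, 0, 1, 1, 0, 0, 1, 1]  # hypothetical sequence, not match data
P = transition_matrix(points)
print("P01 =", P[0][1], "P11 =", P[1][1])
```

Each observed row sums to 1, matching the constraint on P_{ij} stated with Formula (1).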

The calculation formulas for “momentum” are thus summarised in Formulas (2)–(4) by combining the calculation results from the above Markov chain with data on the serving side, break points, and winning streaks.

MS_{P1}(t) = \begin{cases} (MS_{P1}(t-1) + P_{\text{server\_win}} + T) \times S, & i = 1 \\ (\Delta CP_{12} + T) \times S, & i = 0 \end{cases},

(2)

T = \begin{cases} P_{\text{break\_point\_win}}, & j = 1 \\ 0, & j = 0 \end{cases},

(3)

S = \begin{cases} P_{01}, & q = 1 \\ P_{11}, & q = 0 \end{cases},

(4)

where the meaning of each parameter is presented in Table 2.
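A single update of Formulas (2)–(4) can be transcribed directly, as sketched below. The selectors i, j, and q are passed through as 0/1 flags without interpretation, because their meaning is defined in Table 2. The function name `momentum_step` and all numeric inputs other than P_{01} = 0.5060 and P_{11} = 0.5030 are hypothetical.

```python
def momentum_step(prev_ms, p_server_win, p_break_point_win, delta_cp12,
                  p01, p11, i, j, q):
    """One update of MS_P1(t), transcribed from Formulas (2)-(4).

    i, j and q are the 0/1 case selectors of the formulas; their meaning
    is defined in Table 2 and is not interpreted here.
    """
    t_term = p_break_point_win if j == 1 else 0.0   # Formula (3)
    s_term = p01 if q == 1 else p11                 # Formula (4)
    if i == 1:                                      # Formula (2), first case
        return (prev_ms + p_server_win + t_term) * s_term
    return (delta_cp12 + t_term) * s_term           # Formula (2), second case


# P01 and P11 are the player 1 values reported for match No. 1701;
# every other input is a made-up placeholder.
ms = momentum_step(prev_ms=1.0, p_server_win=0.6, p_break_point_win=0.4,
                   delta_cp12=0.2, p01=0.5060, p11=0.5030, i=1, j=0, q=0)
print(round(ms, 4))
```

Applying the function point by point, feeding each result back in as `prev_ms`, would produce a momentum score series of the kind plotted in Figure 2.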

The application of the aforementioned formulae to the substituted data from match No. 1701 yields the “momentum score”, a pivotal metric in the assessment of player performance. The results are illustrated in Figure 2.

As demonstrated in Figure 2, the momentum scores of both players exhibit fluctuations throughout the tennis match, thereby reflecting the intensity of the competition and the dynamic interplay between the two athletes during each point. The fluctuations in momentum scores thus serve to highlight significant moments and critical junctures within the match. Furthermore, when a player’s momentum score consistently exceeds that of their opponent, it can be deduced that said player may exert a greater degree of influence over the match’s rhythm during that particular period. The analysis of the results indicates a close alignment between the fluctuating characteristics of the momentum scores and the actual match events. For instance, the momentum score advantage of Player 1 at the commencement of the fourth game in the fifth set directly corresponds to their actual performance of winning four consecutive points, thereby validating the effectiveness of the momentum score in quantifying the ability to control the match rhythm. The balanced fluctuations in momentum scores during the fifth game of the third set accurately reflect the true intensity of the confrontation during the stalemate stage between the two players. This finding underscores the reliability of the momentum quantification model based on the Markov chain in capturing pivotal turning points in the match, thereby providing a robust dynamic foundation for strategic optimisation.

3. The Mechanism of How Player Momentum Affects Match Trends

The examination of the relationship between momentum scores and match trends has been shown to facilitate a more profound comprehension of the role of momentum in tennis matches. This section commences with an exploration of the strength of the association between continuous and categorical variables, employing Eta correlation analysis following the determination of the data types. Subsequently, the Spearman correlation coefficient is utilised to undertake a consistency test, thereby ensuring the robustness of the analysis outcomes.

3.1. The Association Strength Between Continuous and Categorical Variables

Momentum is a continuous variable, represented by momentum scores, while match trends are categorical variables, represented by the results of each match and point. In this study, one match without outliers from each round was selected for correlation analysis. Eta correlation analysis was used to examine the strength of the association between continuous and categorical variables. The calculation formula is given in Equation (5):

ε = \sqrt{\frac{\sum (m - \bar{m})^{2} - \sum (m - \bar{m}_{k})^{2}}{\sum (m - \bar{m})^{2}}},

(5)

where \bar{m} is the mean of the continuous variable, \bar{m}_{k} is the mean of the continuous variable within each group of the categorical variable, and ε is the Eta correlation coefficient. The results of substituting the data from each game into the calculation formula are presented in Table 3.

The higher the Eta coefficient value, the stronger the correlation. As demonstrated in the data presented in Table 3, the cumulative momentum score for each game exhibits a strong correlation with the player’s win/loss changes in that game. Conversely, the momentum score for each point demonstrates a weak correlation with the player’s win/loss changes at that specific point.
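As a minimal sketch of Equation (5), the coefficient can be computed from the total and within-group sums of squares. The function name `eta_coefficient` and the sample values are hypothetical; the sketch assumes the continuous variable is not constant, since the total sum of squares would otherwise be zero.

```python
from collections import defaultdict
from math import sqrt


def eta_coefficient(values, groups):
    """Eta correlation between a continuous and a categorical variable."""
    overall_mean = sum(values) / len(values)
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    # Total sum of squares around the overall mean.
    ss_total = sum((v - overall_mean) ** 2 for v in values)
    # Within-group sum of squares around each group's own mean.
    ss_within = sum(
        (v - sum(vs) / len(vs)) ** 2 for vs in by_group.values() for v in vs
    )
    return sqrt((ss_total - ss_within) / ss_total)


momentum = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]   # hypothetical momentum scores
won_game = [0, 0, 0, 1, 1, 1]               # hypothetical game results
print(round(eta_coefficient(momentum, won_game), 4))
```

Values near 1 mean the group label explains most of the variation in the continuous variable; values near 0 mean it explains almost none.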

3.2. Consistency Test for the Correlation Between Player Momentum and Match Trends

The Spearman correlation coefficient is a non-parametric statistic that measures the monotonic association between two variables. Because it is computed from the ranks of the observations, it applies to ordinal data and to continuous data that need not be normally distributed. In this study, the Spearman correlation coefficient was used to test the correlation results obtained from the Eta coefficient. The calculation results and significance test results are presented in Table 4.

As shown in Table 4, the significance tests give p < 0.05, so the null hypothesis of no correlation is rejected at the 5% significance level. The magnitude of the Spearman correlation coefficient indicates the strength of the correlation between the two variables. The results show a strong correlation between the cumulative momentum score for each game and the player’s win/loss changes in that game, while the momentum score for each point exhibits a weaker correlation with the player’s win/loss changes at that point. This is consistent with the results of the Eta correlation analysis and supports their robustness.
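The Spearman coefficient can be sketched as the Pearson correlation of rank vectors, with tied values given their average rank (ties are unavoidable when one variable is a win/loss indicator). The helper names and data are hypothetical, and the sketch does not compute the p-values reported in Table 4.

```python
from math import sqrt


def average_ranks(xs):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda k: xs[k])
    ranks = [0.0] * len(xs)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and xs[order[end + 1]] == xs[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for k in order[start:end + 1]:
            ranks[k] = avg
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman coefficient as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


momentum = [0.4, 1.1, 0.2, 1.6, 0.9]   # hypothetical cumulative scores
won_game = [0, 1, 0, 1, 1]             # hypothetical game results
print(round(spearman(momentum, won_game), 4))
```

In practice a statistics library would normally be used, since it also returns the significance level.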

Consequently, based on the aforementioned correlation analyses, it can be concluded that cumulative momentum can reflect the match trends to a certain extent. The success of players in a match is not entirely random and can be optimised through momentum-related strategies.

4. Construction of a Momentum-Based Player Match Strategy Optimisation Model

A momentum-based player match strategy optimisation model was constructed for the purpose of quantitatively evaluating player performance across multiple dimensions. By leveraging effective predictive models to accurately capture key changes during matches, this system provides a deeper understanding of players’ behavioural patterns during matches. The model also offers recommendations for adjusting match strategies, thereby enhancing the scientific rigor and practical utility of tennis match analysis.

4.1. Evaluation of Player Match Performance

In order to evaluate a player’s match performance in a comprehensive and objective manner, the entropy weight method is employed to allocate weights to key indicators, and the TOPSIS method is used to calculate the player’s performance score.

4.1.1. Weight Allocation for Key Indicators

The entropy weight method is an objective weight calculation approach based on the principle of information entropy: indicators whose values vary more across samples carry more information and therefore receive larger weights. Through data observation and analysis, seven key indicators are selected and categorised into positive and negative indicators. These two types of indicator data are then normalised, as shown in Table 5, where

X_{\min} = \min(X_{1j}, X_{2j}, \dots, X_{nj}) - 0.0001, X_{\max} = \max(X_{1j}, X_{2j}, \dots, X_{nj}) + 0.0001.

Based on the normalisation results in Table 5, the entropy value e_j and the information entropy redundancy d_j for the j-th indicator are calculated as shown in Equations (6) and (7), respectively.

e_{j} = - k \sum_{i = 1}^{n} P_{i j} \ln (P_{i j}), j = 1, \dots, m,

(6)

d_{j} = 1 - e_{j}, j = 1, \dots, m,

(7)

Ultimately, the weight proportion of each indicator is determined using the indicator weight formula, thereby establishing the importance ranking of each influencing factor. The indicator weight formula is presented in Equation (8).

w_{j} = \frac{d_{j}}{\sum_{k=1}^{m} d_{k}}, j = 1, \dots, m,

(8)
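Equations (6)–(8) can be sketched as follows. Two details are not stated in the text and are assumed from the usual form of the entropy weight method: P_{ij} is taken as each normalised value divided by its column sum, and k = 1/\ln n. The function name `entropy_weights` and the input matrix are hypothetical.

```python
from math import log


def entropy_weights(matrix):
    """Entropy weights for a normalised n x m matrix (rows: samples).

    Assumes strictly positive entries, which the 0.0001 offsets in the
    normalisation step presumably guarantee.
    """
    n, m = len(matrix), len(matrix[0])
    k = 1.0 / log(n)                      # assumed constant, keeps e_j in [0, 1]
    redundancies = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        p = [x / total for x in column]   # assumed definition of P_ij
        e_j = -k * sum(pi * log(pi) for pi in p)   # Equation (6)
        redundancies.append(1.0 - e_j)             # Equation (7)
    d_sum = sum(redundancies)
    return [d / d_sum for d in redundancies]       # Equation (8)


normalised = [[0.2, 0.5], [0.9, 0.5], [0.4, 0.5]]  # hypothetical data
weights = entropy_weights(normalised)
print([round(w, 4) for w in weights])
```

The second column is constant across samples, so it carries no information and receives a weight of essentially zero, which is the behaviour the method is designed to produce.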

The entropy weight method results for Players P1 and P2 are presented in Table 6.

As shown in Table 6, the weights of Ace and Winner are relatively high, indicating that these two metrics reflect players’ performance in the match most clearly. This is attributable to the high degree of data dispersion observed for Ace and Winner, together with their pronounced capacity to differentiate between players’ skill levels. Momentum and Unf_err have intermediate weights and are important for evaluating player performance from the perspectives of match dynamics and technical stability. By contrast, Double_fault, Rally_count, and Dtop have lower weights, show greater randomness, and therefore serve as supplementary factors in the comprehensive evaluation of player performance.

4.1.2. Calculation of Player Performance Scores

In accordance with the preceding analysis, the weights of each indicator were determined by the entropy weight method, where the weight of the j-th indicator is denoted as w_{j}. Subsequently, the TOPSIS method was employed to calculate the distances between each evaluation object and the optimal and worst vectors, thereby assessing the relative performance of the evaluation object.

Using the normalised data for each indicator, the optimal solution is denoted as (Z_{1}^{+}, Z_{2}^{+}, \dots, Z_{m}^{+}), and the worst solution is denoted as (Z_{1}^{-}, Z_{2}^{-}, \dots, Z_{m}^{-}). The distances between each evaluation object and the optimal and worst vectors are calculated using the expression shown in Equation (9).

D_{i}^{+} = \sqrt{\sum_{j=1}^{m} w_{j} (Z_{j}^{+} - z_{ij})^{2}}, D_{i}^{-} = \sqrt{\sum_{j=1}^{m} w_{j} (Z_{j}^{-} - z_{ij})^{2}},

(9)

The proximity of the evaluation object to the optimal solution is determined using Equation (10), which is based on the calculated distances.

C_{i} = \frac{D_{i}^{-}}{D_{i}^{+} + D_{i}^{-}},

(10)

A higher value of C_{i} indicates that the evaluation object is closer to the optimal solution. The performance scores were obtained by substituting the normalised data and entropy weights, as illustrated in Figure 3, using the data from match No. 1701 as a case study.
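Equations (9) and (10) can be sketched as below. The sketch assumes that the normalisation step has already oriented every indicator so that larger values are better, so the optimal vector is the column-wise maximum and the worst vector the column-wise minimum. The function name `topsis_scores` and the input values are hypothetical.

```python
from math import sqrt


def topsis_scores(z, weights):
    """Closeness C_i of each row of a normalised matrix z to the ideal."""
    m = len(weights)
    best = [max(row[j] for row in z) for j in range(m)]    # Z_j^+
    worst = [min(row[j] for row in z) for j in range(m)]   # Z_j^-
    scores = []
    for row in z:
        # Weighted Euclidean distances of Equation (9).
        d_plus = sqrt(sum(w * (b - x) ** 2
                          for w, b, x in zip(weights, best, row)))
        d_minus = sqrt(sum(w * (c - x) ** 2
                           for w, c, x in zip(weights, worst, row)))
        scores.append(d_minus / (d_plus + d_minus))        # Equation (10)
    return scores


z = [[1.0, 1.0], [0.0, 0.0], [0.5, 0.5]]   # hypothetical normalised data
scores = topsis_scores(z, [0.5, 0.5])
print(scores)
```

If every row is identical, both distances are zero and the ratio is undefined, so real data would need a guard for that case.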

As demonstrated in Figure 3, there is a clear correlation between the fluctuations in performance scores and actual match events. For instance, the maximum scores achieved by Player P1 correspond to the completion of a break point and an ace at the 15th and 96th points, respectively, while the minimum score attained by Player P2 is concomitant with an error at the 147th point. These observations are consistent with the technical actions and outcomes during the match. This demonstrates that the entropy weight-TOPSIS evaluation model can objectively reflect players’ real-time performance levels. The fluctuations in scores not only reveal changes in player states but can also be integrated with match trends prediction models to provide more precise insights into in-match strategy optimisation.

4.2. Prediction of Match Trends

The employment of a multiple linear regression model to identify key indicators, and the subsequent utilisation of a BP neural network based on these indicators, facilitates the accurate prediction of match trends. This, in turn, enables players to optimise their match strategies in a targeted manner.

4.2.1. Key Indicators Influencing Match Trends

In order to capture fluctuations in match trends, the difference in momentum scores between Player P1 and Player P2 is used as the dependent variable. A positive difference indicates that Player P1 is in a dominant position, while a negative difference suggests that Player P2 holds the advantage. The point at which the difference switches from positive to negative, or vice versa, represents the transition of dominance between Player P1 and Player P2. The magnitude of this transition is quantified by the absolute value of the difference between the momentum scores of the two players.

This study proposes a multiple linear regression equation, predicated on the differences between Player P1 and Player P2 in both quantitative indicators (rally_count, distance_run, speed_mph, and delta_point) and qualitative indicators (ace, winner, double_fault, unforced_error, net_pt, net_pt_won, serve_width, serve_depth, return_depth, break_pt, break_pt_won, break_pt_missed, and winner_shot_type), with the difference in momentum scores between Player P1 and Player P2 as the dependent variable, to investigate the impact of each indicator on the match trends. The reference expression for the multiple linear regression equation is shown in Equation (11).

y = β_{0} + β_{1} x_{1} + β_{2} x_{2} + \dots + β_{m} x_{m} + ε,

(11)

where y represents the dependent variable, β_{0} is the constant term, β_{k} is the regression coefficient of the k-th independent variable, and ε is the random error term. By substituting the data from this study into the analysis, the regression equation is derived, and a significance test is conducted on the regression equation [18]. The test formula is shown in Equation (12).

F = \frac{SSR / m}{SSE / (n - m - 1)} \sim F(m, n - m - 1),

(12)

where SSR = \sum (\hat{y}_{i} - \bar{y})^{2} represents the sum of squares due to regression, SSE = \sum (y_{i} - \hat{y}_{i})^{2} represents the sum of squares due to residuals, n is the sample size, and m is the number of independent variables. Using the outlier-free match data selected from each round, the p-value is reported as 0.000, which is below 0.05. The null hypothesis is therefore rejected at the 5% significance level, indicating that the regression equation is statistically significant. To further confirm the feasibility of this regression model, it is also necessary to test for heteroscedasticity and multicollinearity within the model.

4.2.2. BP Heteroskedasticity Test

The Breusch–Pagan (BP) test [19], which is unrelated to the BP neural network used later, detects heteroscedasticity by analysing the relationship between the squared residuals and the explanatory variables. First, the original regression model is estimated to obtain the residual sequence ε_{i}. The squared residuals ε_{i}^{2} are then used as the dependent variable in an auxiliary regression on the explanatory variables x_{1}, x_{2}, \dots, x_{k} of the model. The resulting auxiliary regression equation is shown in Equation (13).

ε_{i}^{2} = δ_{0} + δ_{1} x_{i 1} + δ_{2} x_{i 2} + \dots + δ_{k} x_{i k} + u_{i},

(13)

where u_{i} represents the residual term of the auxiliary regression. The R^{2} value of the auxiliary regression, denoted as R_{new}^{2}, is calculated and used to construct the BP test statistic n R_{new}^{2}, where n is the sample size. Under the null hypothesis H_{0} : δ_{1} = δ_{2} = \dots = δ_{k} = 0 (i.e., no heteroscedasticity exists), this statistic asymptotically follows a chi-square distribution, as shown in Equation (14).

n R_{new}^{2} \overset{d}{\to} χ^{2}(K - 1),

(14)

where K − 1 = k is the number of explanatory variables in the model. The obtained p-value of 0.000, which is less than 0.05, means that the null hypothesis is rejected at the 5% significance level. This suggests heteroscedasticity in the disturbance term, so the conventional standard errors and the test statistics based on them are unreliable. Given the operational simplicity of the robust standard error method and its demonstrated effectiveness in reliable statistical inference [20], this study employs robust standard errors to address heteroscedasticity.

4.2.3. Multicollinearity Test

In order to assess and quantify the degree of multicollinearity, researchers commonly utilise the variance inflation factor (VIF) as a statistical tool [21]. The formula for calculating VIF is shown in Equation (15):

VIF_{m} = \frac{1}{1 - R_{1 \sim k \setminus m}^{2}},

(15)

where R_{1 \sim k \setminus m}^{2} represents the goodness of fit obtained by regressing the m-th independent variable against the remaining k − 1 independent variables. A larger VIF value signifies a stronger correlation between the m-th variable and the other variables, and hence greater variance inflation.
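Equation (15) can be sketched in plain Python by fitting an ordinary least-squares regression of one indicator on the others and converting its R^{2} to a VIF. The helper names (`solve`, `r_squared`, `vif`) and the data are hypothetical; a statistics library would normally be used, and perfectly collinear columns would make the denominator zero.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def r_squared(target, predictors):
    """R^2 of an OLS fit of target on the predictors plus an intercept."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y for r, y in zip(rows, target)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    mean_y = sum(target) / len(target)
    sse = sum((y - f) ** 2 for y, f in zip(target, fitted))
    sst = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - sse / sst


def vif(columns, m):
    """VIF of column m, regressed on all the other columns."""
    others = [c for idx, c in enumerate(columns) if idx != m]
    return 1.0 / (1.0 - r_squared(columns[m], others))


x1 = [1.0, 2.0, 3.0, 4.0]        # hypothetical indicator
x2 = [1.1, 1.9, 3.2, 3.9]        # nearly collinear with x1
print(round(vif([x1, x2], 0), 1))
```

With these two nearly collinear columns the VIF is far above the threshold of 10 used in the text, whereas uncorrelated columns give a VIF of 1.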

In practical applications, the VIF of a regression model is taken as the maximum VIF value among all independent variables. The calculation results reveal that four indicators have VIF values greater than 10, indicating relatively severe multicollinearity in the regression equation. To address this, the study employs backward stepwise regression. Initially, all variables are included in the model. Each independent variable is then removed in turn to observe how much the fit of the model changes, and the variable with the least explanatory power is eliminated. This process is repeated until no remaining variable meets the predetermined elimination criterion. During the regression process, robust standard errors and standardised regression coefficients are used so that the key factors influencing match trends can be identified more precisely. The 13 indicators finally selected from the original 17, in order of correlation strength, are: delta_point, break_pt_won, net_pt_won, ace, break_pt, net_pt, double_fault, serve_width, winner_shot_type, unforced_error, distance, winner, and break_pt_missed. Among these, delta_point has the highest correlation, reaching 0.872.

4.2.4. Sensitivity Analysis

In order to analyse the sensitivity of the multiple linear regression model, the study repeatedly and randomly deleted 20 data points from the sample and re-performed backward stepwise regression in order to examine whether the standardised regression coefficients underwent significant changes. The results of this analysis are presented in Table 7.

As demonstrated in Table 7, following the random deletion of 20 data points from the dataset, the mean error between the test values and the original results in the eight instances calculated is less than 0.001. This finding suggests that minor alterations in sample size do not substantially impact the correlation outcomes of the indicators, thereby demonstrating the model’s stability.

4.2.5. Nonlinear Fitting for Predicting Match Trends

The multiple linear regression equation discussed above has an adjusted coefficient of determination Adj R^{2} of 0.797, which suggests that the model has reasonable predictive capacity. However, to enhance accuracy and generalisability, we established a BP (back-propagation) neural network prediction model using the 13 selected indicators. The hyperbolic tangent sigmoid function (tansig) is employed as the activation function for the hidden layer neurons, while the linear function (purelin) is used as the activation function for the output layer. The mathematical expressions of the neural network are shown in Equations (16) and (17).

f(x) = W^{(o)} \operatorname{tansig}(W^{(h)} x + b^{(h)}) + b^{(o)},

(16)

\operatorname{tansig}: y = \frac{2}{1 + e^{-2x}} - 1,

(17)

where W represents a weight matrix, and x and b are vectors. The superscript o denotes the output layer, and h denotes the hidden layer. Multiple models were trained with the Levenberg–Marquardt (LM) algorithm, a non-linear least-squares optimisation method, which was used to fit the parameters of the neural network and control the errors, and the best-performing model was selected. The expressions for calculating the mean-squared error (MSE) and root-mean-squared error (RMSE) are shown in Equation (18).

MSE = \frac{1}{m} \sum_{i=1}^{m} (y_{i} - \hat{y}_{i})^{2}, RMSE = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (y_{i} - \hat{y}_{i})^{2}},

(18)
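The forward pass of Equation (16), the activation of Equation (17), and the error measures of Equation (18) can be sketched as below for a single scalar output. The superscript o is read here as the output layer. The weights are arbitrary placeholders rather than trained values, Levenberg–Marquardt training is not implemented, and all function names are hypothetical.

```python
from math import exp, sqrt


def tansig(x):
    """Equation (17); algebraically identical to tanh(x)."""
    return 2.0 / (1.0 + exp(-2.0 * x)) - 1.0


def forward(x, w_h, b_h, w_o, b_o):
    """Equation (16) for one scalar output.

    w_h holds one weight row per hidden neuron; w_o holds one output
    weight per hidden neuron.
    """
    hidden = [tansig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_h, b_h)]
    return sum(w * h for w, h in zip(w_o, hidden)) + b_o  # purelin output


def mse_rmse(actual, predicted):
    """Equation (18): mean-squared error and its square root."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return mse, sqrt(mse)


# Placeholder network with two inputs and two hidden neurons.
y_hat = forward([0.5, -1.0], [[0.1, 0.2], [-0.3, 0.4]], [0.0, 0.1], [1.0, -1.0], 0.2)
print(round(y_hat, 4), mse_rmse([1.0, 2.0], [1.0, 4.0]))
```

Because the RMSE is the square root of the MSE, the pairs of values reported for a model can be checked against each other directly.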

To compare the predictive performance of the BP neural network model and the multiple linear regression model, data from five matches (ID: 1304, 1406, 1502, 1601, and 1701) were used. The multiple linear regression model yielded a mean-squared error (MSE) of 0.3268 and a root-mean-squared error (RMSE) of 0.5716. In comparison, the BP neural network prediction model achieved an MSE of 0.2006 and an RMSE of 0.4479. The predicted values from the BP neural network model are therefore closer to the actual values, indicating higher predictive accuracy on these matches.

The quantitative match trend evaluation system and the optimisation strategies are based on the cumulative momentum score difference calculated by the model. Analysis of the match data showed that, when the absolute value of the cumulative momentum score difference between players fell below 1.0, the match was closely contested, with balanced scoring opportunities. In this phase, players are advised to prioritise shot consistency and tactical diversity, using precise placement and rhythm variation to break deadlocks. Coaches are advised to guide players towards steady defence and counterattack, maximising first-serve success rates and using frequent placement changes to wear down the opponent’s stamina.

When the absolute cumulative momentum score difference exceeds 1.0, the player with the advantage is likely to sustain their dominance over the subsequent 3–5 scoring rounds, thereby consolidating their position. At this stage, the leading player should sustain offensive pressure by enhancing serve-and-volley tactics and proactive line-changing strategies to extend gains. Coaches should remain alert to momentum reversals triggered by the opponent’s strategic adjustments. To limit unforced errors during favourable periods, players should maintain technical discipline.

If the cumulative momentum score difference shifts from positive to negative and breaches the 1.0 threshold, a critical turning point in the match is imminent. Players should immediately adjust their tactical priorities, strengthening defensive coverage and disrupting the opponent’s offensive rhythm through proactive pace variation. Coaches should use timeouts promptly to reassess tactics, helping players rebuild their psychological resilience and formulate targeted counterstrategies.
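The three regimes described above can be sketched as a simple decision rule. The function name, the labels, and the handling of edge cases are assumptions: the text does not say how a difference of exactly 1.0 is treated, and "shifts from positive to negative and breaches the threshold" is read here as a positive previous value followed by a current value below -1.0.

```python
THRESHOLD = 1.0  # threshold on the cumulative momentum score difference (from the text)


def classify_phase(prev_diff, curr_diff, threshold=THRESHOLD):
    """Map the cumulative momentum score difference to a match phase.

    prev_diff and curr_diff are the differences (player minus opponent) at the
    previous and current point. Returns one of "turning_point", "advantage",
    or "contested".
    """
    # Assumed reading of the turning-point rule: the sign flips from positive
    # to negative and the magnitude exceeds the threshold.
    if prev_diff > 0 and curr_diff < -threshold:
        return "turning_point"
    # One player holds a clear advantage, in either direction.
    if abs(curr_diff) > threshold:
        return "advantage"
    # Assumption: a magnitude of exactly the threshold counts as contested.
    return "contested"


for prev, curr in [(0.2, 0.5), (0.5, 1.4), (0.8, -1.2)]:
    print(prev, curr, classify_phase(prev, curr))
```

A symmetric rule for the opponent follows by negating both arguments.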

4.2.6. Validation of the Match Trends Prediction Model

The model was applied to several other matches to evaluate its predictive accuracy and the validity of its results, by comparing the predicted fluctuations in players’ scoring advantages with the actual values. As an illustration, Figure 4 compares the predicted and actual values for the match coded 1314.

As Figure 4 shows, the predicted values are close to the actual values. The model yields an MSE of 0.2034 and an RMSE of 0.4510, indicating good accuracy. The cumulative momentum score difference was then calculated from the prediction results. The turning points of the match trend occurred at the 5th, 11th, 13th, 15th, 40th, 47th, 59th, and 64th points. The cumulative momentum score difference for the entire match was 47.121, indicating that Player P1 had an advantage throughout the match. This is consistent with the observed result, which supports the validity of the prediction model.

To further test the general applicability of the model, representative multi-surface and multi-gender match data from the tennis_MatchChartingProject dataset [22] were selected for extended verification. Specifically, data from the 2024 hard court event (United Cup, men’s category), the clay court event (Maia CH, men’s category), and the Wimbledon women’s singles were incorporated, representing three typical scenarios: hard court, clay court, and women’s competition. On the hard court test set, the MSE and RMSE were 0.270261 and 0.519866, respectively; the clay court event gave an MSE of 0.286727 and an RMSE of 0.535469; and the women’s grass court event gave an MSE of 0.201481 and an RMSE of 0.448866. The error metrics remained low in all three tests, indicating stable predictive performance across court types, competition tiers, and athlete demographics, and supporting the general applicability of the momentum analysis framework.
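Because the RMSE is the square root of the MSE by Equation (18), the reported pairs can be cross-checked directly. The short sketch below does this for the values quoted in Sections 4.2.5 and 4.2.6; the labels and the helper name are assumptions, and the numbers are copied from the text.

```python
import math

# (label, reported MSE, reported RMSE), copied from the text.
reported = [
    ("MLR, five matches", 0.3268, 0.5716),
    ("BP, five matches", 0.2006, 0.4479),
    ("Match 1314", 0.2034, 0.4510),
    ("Hard court", 0.270261, 0.519866),
    ("Clay court", 0.286727, 0.535469),
    ("Women's grass court", 0.201481, 0.448866),
]


def rmse_gap(mse_value, rmse_value):
    """Absolute gap between sqrt(MSE) and the reported RMSE."""
    return abs(math.sqrt(mse_value) - rmse_value)


for label, mse_value, rmse_value in reported:
    print(f"{label}: sqrt(MSE) = {math.sqrt(mse_value):.6f}, reported RMSE = {rmse_value}")
```

Each reported RMSE agrees with the square root of its MSE up to the rounding of the published figures.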

5. Conclusions

A system was constructed for the quantification of momentum in tennis players, with the objective of calculating the momentum scores of players in order to reflect critical moments and turning points in matches. The verification results were found to be consistent with actual match scenarios;
The present study sought to explore the influence mechanism of player momentum on match trends, by means of a comprehensive analysis of momentum scores derived from momentum quantification. The findings demonstrate that, within a 95% confidence interval, the cumulative momentum scores for each game exhibit a high correlation with the win–loss changes of players in that game, while the momentum scores for each point show a weaker correlation with the win–loss changes at that point. The correlation analysis results indicate that cumulative momentum can, to a certain extent, reflect match trends. The success of players in matches is not entirely random, suggesting that player match strategies can be optimised using momentum;
A player match strategy optimisation model based on momentum scores was constructed, with the objective of optimising real-time match strategies from two major aspects: player performance scores and match trends prediction. The performance scores calculated using the entropy weight-TOPSIS model are capable of reflecting changes in the player’s state in real time, while the BP neural network model is capable of predicting changes in match trends. The integration of these two approaches enables coaches and players to dynamically adjust their strategies during the match. For instance, when the performance score indicates that the player is in an advantageous position, the player can adopt a more aggressive offensive strategy to consolidate the advantage by combining the trends prediction model to assess the likelihood of a counterattack from the opponent. Conversely, when the performance score indicates that the player is at a disadvantage, the player can adopt a more robust defensive strategy to identify opportunities to reverse the situation by combining the trends prediction model to assess the opponent’s weaknesses. Furthermore, by monitoring the changes in performance scores and trends predictions in real time, players can make more precise tactical adjustments at critical moments, such as break points and match points, thereby increasing their chances of winning. This integrated analysis method not only enhances the real-time and scientific nature of match analysis, but also provides players and coaches with more intuitive and actionable strategy optimisation recommendations.

6. Future Work

The momentum analysis framework for tennis players’ match strategies established in this study provides a foundation for subsequent research, yet there remains room for expansion and optimisation. Subsequent research will concentrate on the following aspects.

With regard to the application of the framework in more extensive scenarios, it is possible to consider a generalisation of the framework to other individual ball sports and team sports. In team sports characterised by complex athlete interactions, future efforts may attempt to represent “momentum” as variable-vector Markov chains to precisely capture individual athlete states. Concurrently, suitable time-lagged intervals could be ascertained by analysing rhythm patterns that are unique to various team sports, thus facilitating a more profound examination of dynamic inter-athlete influences and the development of more sophisticated momentum models.

With regard to the optimisation and comparative analysis of models, the BP neural network model offers potential for enhancement. Through the exploration of diverse network architectures and the comparison of multiple training algorithms, the model’s predictive accuracy and generalisation capabilities may be enhanced.

Author Contributions

Conceptualization, S.W. and M.D.; methodology, S.W.; software, Z.S.; validation, S.W., Z.S. and J.W.; formal analysis, J.W.; investigation, Z.S.; writing—original draft preparation, S.W. and J.W.; writing—review and editing, S.W. and M.D.; visualization, Z.S.; supervision, M.D. and C.Z.; project administration, S.W.; funding acquisition, M.D. and C.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the 2024 National College Students Innovation and Entrepreneurship Training Program of China under Grant S202411415084 and the 2023 Undergraduate Education Quality Improvement Plan Construction Project of China University of Geosciences Beijing under Grant AIKC202302.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors express their gratitude to COMAP and Jeff Sackmann for their provision of open-source datasets, which have provided valuable data support for this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Meier, P.; Flepp, R.; Ruedisser, M.; Franck, E. Separating psychological momentum from strategic momentum: Evidence from men’s professional tennis. J. Econ. Psychol. 2020, 78, 102269.
2. Ritzwoller, D.M.; Romano, J.P. Uncertainty in the hot hand fallacy: Detecting streaky alternatives to random Bernoulli sequences. Rev. Econ. Stud. 2022, 89, 976–1007.
3. Miller, J.B.; Sanjurjo, A. Surprised by the hot hand fallacy? A truth in the law of small numbers. Econometrica 2018, 86, 2019–2047.
4. Depken, C.A.; Gandar, J.M.; Shapiro, D.A. Set-level Strategic and Psychological Momentum in Best-of-three-set Professional Tennis Matches. J. Sports Econ. 2022, 23, 598–623.
5. Noel, J.T.; da Fonseca, V.P.; Soares, A. The Use of Momentum-Inspired Features in Pre-Game Prediction Models for the Sport of Ice Hockey. Int. J. Comput. Sci. Sport 2024, 23, 1–21.
6. Wang, L.; Chen, P. Tennis Game Dynamic Prediction Model Based on Players’ Momentum. Preprint 2024.
7. Li, Y.; Li, Z.; Luo, T. Real-time win prediction and momentum analysis of tennis matches based on XGBoost and time domain feature extraction. Model. Simul. 2024, 13, 5732–5743.
8. Goyal, A.; Simonoff, J.S. Hot Racquet or Not? An Exploration of Momentum in Grand Slam Tennis Matches. arXiv 2020, arXiv:2009.05830.
9. Zhong, M.; Liu, Z.; Liu, P.; Zhai, M. Searching for the Effects of Momentum in Tennis and its Applications. Procedia Comput. Sci. 2024, 242, 192–199.
10. Fang, X. Probabilistic-based Markov chains for behavioral prediction. Appl. Math. Nonlinear Sci. 2024, 9, 1–18.
11. Wherry, R.J.; Taylor, E.K. The Relation of Multiserial Eta to Other Measures of Correlation. Psychometrika 1946, 11, 155–161.
12. Rosa, J.C.; Aleman, J.O.; Mohabir, J.; Liang, Y.; Breslow, J.L.; Holt, P.R. The Application of Spearman Partial Correlation for Screening Predictors of Weight Loss in a Multiomics Dataset. OMICS J. Integr. Biol. 2022, 26, 660–670.
13. Li, J.; Ren, Y.; Ma, X.; Wang, Q.; Ma, Y.; Yu, Z.; Li, J.; Ma, M.; Li, J. Comprehensive evaluation of the working mode of multi-energy complementary heating systems in rural areas based on the entropy-TOPSIS model. Energy Build. 2024, 310, 114077.
14. Xi, W.F.; Jiang, Q.W.; Yang, A.M. Using stepwise regression to address multicollinearity is not appropriate. Int. J. Surg. 2024, 110, 3122–3123.
15. Haykin, S. Neural Networks and Learning Machines, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2009.
16. Rivera, J. Tennis Scoring, Explained: A Guide to Understanding the Rules, Tiebreakers, Terms & Points System. Available online: https://www.sportingnews.com/us/tennis/news/tennis-scoring-explained-rules-system-points-terms/7uzp2evdhbd11obdd59p3p1cx (accessed on 12 November 2024).
17. COMAP. MCM Problem C: Momentum in Tennis. 2024. Available online: https://www.contest.comap.com/undergraduate/contests/mcm/contests/2024/problems/ (accessed on 12 November 2024).
18. Almonroeder, T.G. Multiple Regression. In Advanced Statistics for Physical and Occupational Therapy; Routledge: New York, NY, USA, 2022; pp. 149–167.
19. Breusch, T.S.; Pagan, A.R. A Simple Test for Heteroscedasticity and Random Coefficient Variation. Econometrica 1979, 47, 1287–1294.
20. Stock, J.; Watson, M.W. Heteroskedasticity-Robust Standard Errors for Fixed Effects Regression. Econometrica 2008, 76, 155–174.
21. Snee, R.D. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. J. Qual. Technol. 1983, 15, 149–153.
22. Sackmann, J. GitHub—JeffSackmann/Tennis_MatchChartingProject: Raw, User-Submitted Point-by-Point Data for Pro Tennis Matches. GitHub. Available online: https://github.com/JeffSackmann/tennis_MatchChartingProject (accessed on 1 April 2025).

Figure 1. Technology roadmap.

Figure 2. Line chart comparing quantified momentum scores.

Figure 3. Line chart of comprehensive indicator scores.

Figure 4. Line chart of predicted values and actual values.

Table 1. Data classification table of the dataset.

No.	Classification	Number of Indicators	Indicator Description
1	Basic Match Information	12	match_id, player1/2, set_no, game_no, point_no, etc.
2	Player Performance Statistics	16	p1/p2_ace, p1/p2_winner, p1/p2_double_fault, p1/p2_unf_err, p1/p2_break_pt, etc.
3	Match Event Information	18	p1/p2_sets, p1/p2_games, p1/p2_score, server, p1/p2_distance, etc.

Table 2. Factors that influence momentum.

No.	Symbol	Description
1	$MS_{P1}(t)$	Momentum score of player 1 at point t. $MS_{P1}(0) = 0$.
2	$P_{server\_win}$	Probability of the server winning a point.
3	$P_{break\_point\_win}$	Probability of the returner winning a break point.
4	$\Delta CP_{12}$	Difference in the current point score between player 1 and player 2 in the current game.
5	$P_{01}$	Probability of a player winning a point after losing the previous one.
6	$P_{11}$	Probability of a player winning a point after winning the previous one.
7	$i$	i = 1 means “player 1 is the server and won the point”; i = 0 represents “else”.
8	$j$	j = 1 means “player is returner and won a break point”; j = 0 represents “else”.
9	$q$	q = 1 means “player lost the previous point”; q = 0 represents “else”.
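The transition probabilities $P_{01}$ and $P_{11}$ in Table 2 can be illustrated with a minimal sketch that estimates them as empirical frequencies from one player’s point-by-point outcomes. Estimating them this way is an assumption made for illustration, as are the function name and the toy sequence; this part of the article does not show the estimation procedure used in the study.

```python
def transition_probabilities(points):
    """Estimate (P01, P11) from a sequence of point outcomes for one player.

    points is a list of 1 (point won) and 0 (point lost) in playing order.
    P01 is the share of points won immediately after a lost point, and P11 the
    share of points won immediately after a won point. A value is None when
    the sequence contains no such preceding outcome.
    """
    pairs = list(zip(points, points[1:]))
    after_loss = [curr for prev, curr in pairs if prev == 0]
    after_win = [curr for prev, curr in pairs if prev == 1]
    p01 = sum(after_loss) / len(after_loss) if after_loss else None
    p11 = sum(after_win) / len(after_win) if after_win else None
    return p01, p11


# Hypothetical outcome sequence, not taken from the dataset.
p01, p11 = transition_probabilities([1, 1, 0, 1, 0, 0, 1, 1])
print(p01, p11)
```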

Table 3. Eta correlation results.

No.	Match	P1 Eta (Each Game)	P2 Eta (Each Game)	P1 Eta (Each Point)	P2 Eta (Each Point)
1	1304	0.470	0.505	0.209	0.206
2	1406	0.752	0.800	0.310	0.338
3	1502	0.544	0.731	0.144	0.311
4	1601	0.780	0.752	0.274	0.265
5	1701	0.717	0.491	0.208	0.202
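For reference, the Eta coefficient between a categorical variable (such as a win or loss outcome) and a continuous variable (such as a momentum score) is the square root of the between-group sum of squares divided by the total sum of squares. The sketch below is a generic implementation of that definition; the function name and the toy data are assumptions and do not reproduce the values in Table 3.

```python
import math


def eta(categories, values):
    """Correlation ratio (Eta) between a categorical and a continuous variable."""
    groups = {}
    for category, value in zip(categories, values):
        groups.setdefault(category, []).append(value)
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # assumption: a constant variable is reported as no association
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups.values()
    )
    return math.sqrt(ss_between / ss_total)


# Hypothetical data: outcome (0 = lost, 1 = won) against a momentum-like score.
print(eta([0, 0, 1, 1], [1.0, 2.0, 3.0, 4.0]))
```

Eta ranges from 0 (group means are identical) to 1 (the category fully determines the value).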

Table 4. Spearman correlation results.

No.	Match	P1 Spearman (Each Game)	P1 Sig. (Each Game)	P2 Spearman (Each Game)	P2 Sig. (Each Game)	P1 Spearman (Each Point)	P1 Sig. (Each Point)	P2 Spearman (Each Point)	P2 Sig. (Each Point)
1	1304	0.600	0.000	0.619	0.000	0.208	0.000	0.206	0.000
2	1406	0.782	0.000	0.807	0.000	0.308	0.000	0.347	0.000
3	1502	0.521	0.000	0.764	0.000	0.135	0.023	0.315	0.000
4	1601	0.768	0.000	0.699	0.000	0.293	0.000	0.229	0.004
5	1701	0.725	0.000	0.611	0.000	0.202	0.000	0.214	0.000
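The Spearman coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. The sketch below implements that definition with the standard library only; the function names and the toy data are assumptions, and the significance values (Sig.) reported in Table 4 are not computed here.

```python
import math


def average_ranks(xs):
    """1-based ranks of xs, with ties replaced by their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman rank correlation of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: a monotonic but non-linear relationship.
print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))
```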

Table 5. Indicator meaning table.

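No.	Index Type	Index Name	Normalisation Formula	Description
1	Positive	Momentum	$Z_{ij} = \frac{X_{ij} - X_{min}}{X_{max} - X_{min}}$	Momentum score
2	Positive	Ace	$Z_{ij} = \frac{X_{ij} - X_{min}}{X_{max} - X_{min}}$	Winning serve
3	Positive	Winner	$Z_{ij} = \frac{X_{ij} - X_{min}}{X_{max} - X_{min}}$	Winning shot
4	Negative	Double-fault	$Z_{ij} = \frac{X_{max} - X_{ij}}{X_{max} - X_{min}}$	Double miss
5	Negative	Unf-err	$Z_{ij} = \frac{X_{max} - X_{ij}}{X_{max} - X_{min}}$	Unforced error
6	Negative	Rally-count	$Z_{ij} = \frac{X_{max} - X_{ij}}{X_{max} - X_{min}}$	The number of strokes back and forth
7	Negative	Dtop	$Z_{ij} = \frac{X_{max} - X_{ij}}{X_{max} - X_{min}}$	The difference in the distance run between the two players during a point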
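The two normalisation formulas in Table 5 can be sketched as a single helper that rescales one indicator column to the range [0, 1]. The function name and the handling of a constant column are assumptions; the table does not say how a column with equal minimum and maximum is treated.

```python
def normalise(column, positive=True):
    """Min-max normalise one indicator column as in Table 5.

    Positive indicators (larger is better) use (x - min) / (max - min);
    negative indicators (smaller is better) use (max - x) / (max - min).
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in column]
    if positive:
        return [(x - lo) / (hi - lo) for x in column]
    return [(hi - x) / (hi - lo) for x in column]


# Hypothetical counts for a positive indicator (Ace) and a negative one (Unf-err).
print(normalise([2, 4, 6]))
print(normalise([2, 4, 6], positive=False))
```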

Table 6. Entropy weight calculation results for P1 and P2.

No.	Index Item	P1 Entropy Value e	P1 Utility Value d	P1 Weight (%)	P2 Entropy Value e	P2 Utility Value d	P2 Weight (%)
1	Momentum	0.986	0.014	1.475	0.987	0.013	0.990
2	Ace	0.384	0.616	65.461	0.148	0.852	65.697
3	Winner	0.722	0.278	29.597	0.598	0.402	30.993
4	Double_fault	0.996	0.004	0.387	0.998	0.002	0.120
5	Unf_err	0.975	0.025	2.645	0.978	0.022	1.691
6	Rally_count	0.997	0.003	0.306	0.997	0.003	0.222
7	Dtop	0.999	0.001	0.129	0.996	0.004	0.287
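In the entropy weight method, the utility value is d = 1 - e and each weight is an indicator’s share of the total utility. The sketch below recomputes the P1 weights from the entropy values in Table 6; the function and variable names are assumptions. Because the published entropy values are rounded to three decimals, the recomputed weights match the table only approximately.

```python
# P1 entropy values e, copied from Table 6.
p1_entropy = {
    "Momentum": 0.986,
    "Ace": 0.384,
    "Winner": 0.722,
    "Double_fault": 0.996,
    "Unf_err": 0.975,
    "Rally_count": 0.997,
    "Dtop": 0.999,
}


def entropy_weights(entropy):
    """Return weights in percent from entropy values: d = 1 - e, w = d / sum(d)."""
    utility = {name: 1.0 - e for name, e in entropy.items()}
    total = sum(utility.values())
    return {name: 100.0 * d / total for name, d in utility.items()}


for name, weight in entropy_weights(p1_entropy).items():
    print(f"{name}: {weight:.3f}%")
```

The same function applied to the P2 entropy values gives the corresponding P2 weights, again up to rounding.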

Table 7. Normalised Regression Coefficient Results.

Test NO.	1	2	3	4	5	6	7	8
Absolute ME (×10⁻⁴)	9.27	8.32	8.87	9.12	9.02	8.46	9.41	8.77

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

