1. Introduction
The Portuguese Public Security Police (PSP) is a comprehensive security force that carries out its activities daily, 365 days a year, within a complex and increasingly demanding security framework. Society expects the Police to be well prepared in many areas, especially in the physical, technical, and psychological dimensions.
It is known that police service is highly complex and unpredictable, as it involves a set of demanding physical tasks, many of which take place in volatile and harmful scenarios, which may at any time compromise the physical and psychological integrity of police officers [
1]. In fact, during operational service, police officers perform a multitude of tasks, many of them of a physical nature, which may require them, for example, to perform actions such as climbing/descending stairs, running, pulling/pushing, overcoming obstacles, and chasing suspects, among others [
2].
Against this background, factors such as frequency were used to determine the main physical tasks performed by police officers [3]. Given the enormous diversity of functions, a literature review was used to narrow these down to three tasks: overcoming an obstacle at high speed (police chase), transporting a victim (assistance situation), and arresting a suspect of unknown risk using handcuffs. These are three of the physical tasks considered the most frequent during police service [4].
Thus, regarding running, Wollack and Associates [5] indicate that it is a task that, on average, does not last more than one minute (11% of the agents’ runs last less than two minutes), with jumps being frequent to avoid, overcome, and/or circumvent obstacles. Regarding the act of jumping, the average height of the obstacles overcome is 0.914 m, while when climbing, most fences are 1.524 m or less [5].
Regarding the dragging task, the author concluded that it is typically performed unassisted, over distances of less than 30.48 m, and can consist of dragging or pulling objects and/or people. Regarding the energy cost, in cases where the effort lasts longer than two minutes (e.g., pursuits and use of force), the resistance level is approximately 75 to 90% of maximum capacity. Finally, regarding the use of force, in 75% of arrests, resistance is moderate to strong, and the duration varies between 30 and 120 s, depending on the specific situation [
5].
Given the above, and after reviewing the literature, the tasks described above represent three of the most frequent tasks during police action [
4]. Therefore, it is appropriate to define the three physical police tasks that serve as the basis for this research. In this sense, the tasks are as follows: (i) obstacle crossing; (ii) victim transportation; and (iii) suspect arrest.
Beck et al. [
2] demonstrated that, from a physical perspective, some actions are more physically demanding than others. For example, an action involving a dismounted pursuit of a suspect, overcoming obstacles, and culminating in an arrest (handcuffing) is more demanding than a mere dismounted patrol or even the simple act of driving. However, although most of the time the tasks police officers perform are not physically demanding, given the complexity and unpredictability of police work [
6], law enforcement officers must always maintain the physical and mental conditions required to accomplish their mission.
According to Lockie et al. [
7], there is an increasing concern for the fitness of police officers, because it is considered crucial for their functional performance in carrying out their mission. A good fitness profile seems essential for the functional performance of police officers in exercising their duties [
7,
8,
9], and the conviction that a police officer with good fitness attributes can better react to the adversities inherent to the profession highlights that regular fitness assessments are fundamental to ensure effective functional performance [
2,
8]. However, current fitness tests to evaluate police officers may not accurately reflect operational demands. From the perspective of excellence in police officer activity, it makes sense to confirm that the protocol for assessing the fitness attributes of female and male police officers reproduces the physical demands inherent to the role [
10].
Because of the above, it seems crucial to verify whether there is a significant correlation between performance in fitness tests and police physical tasks. Furthermore, this is particularly relevant since, according to Canetti et al. [
10], from the perspective of excellence in police service, it is appropriate to assess police officers’ fitness attributes to reflect the role’s physical demands.
Given the above, it seems pertinent to identify which fitness attributes best explain performance in carrying out police physical tasks. In addition, we believe that understanding specifically which attributes are most significant in the performance of police physical tasks is a clear scientific and operational opportunity that can help to improve police officers’ recruitment and selection processes, and to ensure that police officers are better prepared to face the challenges of a constantly evolving operational context.
In accordance, the current research aims (i) to evaluate the reliability of three police physical tasks (i.e., fence jump, victim drag, and arrest suspect) and (ii) to identify the fitness attributes that best explain the performance in carrying out police physical tasks. This will allow confirming (or refuting) the hypotheses that (i) the performance in the police physical tasks presents high reliability in terms of reproducibility; (ii) there is a significant correlation between performance in fitness tests and police physical tasks; and (iii) fitness attributes are significant predictors of the performance in carrying out the police physical tasks.
2. Materials and Methods
2.1. Study Design
This study comprises two parts, i.e., (i) a methodological investigation with a longitudinal test–retest design was used to evaluate the protocol’s reliability for assessing police physical tasks (PPTs), and (ii) a cross-sectional study was used to analyze the relationship established between the performance in PPT and fitness profile. The research was conducted with police cadets in the 3rd and 4th years of the 5-year Police Officer Training Course (Higher Institute of Police Sciences and Internal Security, Lisbon, Portugal).
The protocol was carried out in three distinct sessions, separated by a one-week interval (seven days): (i) briefing, informed consent, biosocial and anthropometric assessment, and assessment of PPT (T1); (ii) second moment of assessment of PPT (T2); and (iii) fitness assessment (T3) (
Figure 1).
2.2. Participants
A non-probabilistic convenience sample of 76 (84.4%) police officer cadets (female, n = 14, 63.6%; male, n = 62, 91.2%) from the 3rd and 4th years of the 5-year Police Officer Training Course (Higher Institute of Police Sciences and Internal Security, Lisbon, Portugal) participated voluntarily in this study (see
Table 1). The participants were informed of the objectives of the research and its assumptions, and all agreed to carry out the study by signing the informed consent form. The study was approved by the Higher Institute of Police Sciences and Internal Security (Lisbon, Portugal) and Portuguese Public Security Police (Process No. SECDE202400003ASP of 15 November 2024). It was conducted following the conditions established in the Declaration of Helsinki [
11].
2.3. Procedures and Instruments
The cadets were evaluated during the morning physical education and sports session (90 min) of the class (each class has ~20 participants), and the protocol applied was as follows: (i) T1-biosocial data, anthropometric assessment, and evaluation of PPT; (ii) T2-re-evaluating PPT; and (iii) T3-fitness evaluation.
In the assessment of the participants’ biosocial characteristics (T1), the following variables were considered: (i) sex (female; male) and (ii) age (years).
In the anthropometric assessment (T1), the following were considered: (i) height (m) and (ii) weight (kg). A tape measure was used to measure height, and weight was obtained using a digital scale (TANITA, Dual Frequency Body Composition Monitor, RD-953-BK, Tanita Ltd., Amsterdam, The Netherlands). Complementarily, body mass index was calculated using the following equation: BMI = weight (kg)/height² (m²) [12]. In both measurements, the participants were barefoot and wearing only shorts and a t-shirt.
The three police physical tasks (Figure 2) were evaluated twice (T1, test; T2, retest) with the time interval of application being one week (seven days) [11,12,13], as adopted in previous studies with this specific population [13]. The protocol was applied at the same time (during the same Physical Education and Sports class), at the same location (Sports hall of the Higher Institute of Police Sciences and Internal Security, Lisbon, Portugal), and under similar weather conditions at both moments.
The protocol for evaluating PPT includes the following functions, in the respective order of execution (
Figure 2): (i) fence jump; (ii) victim drag; and (iii) arrest suspect.
The sequence presented was intended to respect the principle of specificity, as the objective is to simulate a scenario that approximates operational reality, where a suspect is chased on foot, overcoming an obstacle, followed by dragging/transporting a victim, ending the assessment with the arrest of a suspect (handcuffing of unknown risk with collaboration and without offering resistance, i.e., passive).
The following criteria are observed when carrying out PPT: (i) the tests are carried out individually, and in the order provided; (ii) before the start of each session and each test, participants have a period of no more than five minutes to prepare for it; (iii) before the start of each test, it will be duly explained and exemplified by the evaluators; (iv) all tests must be carried out strictly following the method of execution presented; (v) in each task, participants have two attempts, with the time of the best execution being counted; (vi) after carrying out each test, participants are informed of their respective results; (vii) each participant must wear a training uniform, approved by the Uniform Regulations for Police Officers of the Portuguese Public Security Police, and a police belt with holster, a weapon (GLOCK pistol, model 19, Glock Perfection, Deutsch-Wagram, Austria), handcuffs (ALCYON, model Steel K-70, Alcyon, Elgoibar, Spain), and a police baton (ABS Baton, model 80 cm, Fox Armor, Deqing City, China).
The objectives, execution order, procedure, scheme, and equipment of the PPT assessment protocol are presented in
Table 2.
In the fitness assessment, the protocol considered the following: (i) 30 m sprint test (2 attempts); (ii) horizontal jump (2 attempts); (iii) handgrip strength (2 attempts); (iv) sit-ups in 60 s (2 attempts); (v) pull-ups (male)/push-ups (female) (2 attempts); (vi) agility test–slalom (2 attempts); and (vii) 20 m shuttle run test (1 attempt). The reported order of the fitness tests assumed a progressive increase in fatigue, with a ten-minute interval between tests.
A summary description of the procedures and instruments used to carry out these fitness tests can be found below, and a detailed description is in the previous work published by Massuça et al. [
14].
Sprint 30 m: To evaluate this test, on a completely flat surface, two photoelectric cells (BROWER, TCi-System FS12656, Power Systems, Draper, UT, USA) were placed at the beginning and two at the end of the course, spaced exactly 30 m apart, to measure the task’s exact time. The participants had to stand behind the starting line and, in a straight line, complete the distance in the shortest possible time. They were allowed to sprint twice, with the best time recorded in seconds.
Horizontal jump: In this test, a measuring tape was placed along the floor. The participants had to put their feet (parallel and approximately shoulder-width apart) behind a predefined line, without ever stepping on it. The test consists of performing a horizontal jump (without a double jump), using only body balance. The distance between the starting position and the mark reached by the support closest to the starting position is recorded. The jump is performed twice, with the best distance recorded in meters.
Handgrip strength: This test was performed using a handgrip strength dynamometer (model TKK 5401 Grip-D, Takei, Japan), which was adjusted to the participant’s hand size. The test consisted of placing the device next to the body in a natural standing position, with the arm and forearm extended, and then squeezing the device (for 3 s) with maximum force. Two attempts were made with each hand, and the best performance (in kg) was recorded. Finally, the performance of both hands was added together [
15].
Sit-ups: The participants were asked to lie on the floor supine with their knees bent at ~90°, feet shoulder-width apart (fixed by an external participant), and hands overlapped behind their head (the back of their neck). After the beep, the participants had to bring their elbows to or beyond the imaginary knee line, returning to the starting position (with their shoulder blades completely touching the floor). The number (n) of correct sit-ups (during the 60 s of the test) was recorded, and participants may repeat the test if they wish [
16].
Pull-ups: This test, exclusively for male cadets, consists of performing the maximum number of pull-ups on a horizontal bar, placed approximately 2.50 m above the ground. Performers must jump and remain suspended from the bar, without ground support, and only begin the exercise on command (after being entirely suspended, without swinging the body). With hands pronated and shoulder-width apart, arms fully extended, perform the pulling movement without moving the legs, until the chin passes the bar, then return to the starting position—this movement corresponds to an execution. Successful executions were counted (n), and participants may repeat the test if they wish [
17].
Push-ups: This test, exclusively for female cadets, requires participants to position themselves in a plank position (prone), with only their feet and hands resting on the ground (hands approximately shoulder-width apart and fingers pointing forward). Each repetition (n) is valid if the participant, after flexing the elbow, touches the chest against the wooden board (supported by the ground) and returns to the starting position. The participants were allowed to rest in the starting position if the plank position was maintained [
18]. Successful executions were counted (n), and participants may repeat the test.
Agility test (slalom): To perform this test, the participants were asked to complete a running course with several changes in direction (slalom). The total course was 88.10 m, with the distance between the two furthest points being 13 m. The participants began the race at the signal’s sound and had to complete the test in the following sequence: a straight out-and-back course; an out-and-back slalom course; and a straight out-and-back course. Two opportunities were granted, and the best time was recorded in seconds.
The 20 m shuttle run: Two marks were placed on the ground exactly 20 m apart. To complete the test, the participants had to run this distance and pass the mark with both feet, only beginning a new route after hearing the corresponding beep. The beeps were pre-recorded and set a pace that started at 8.5 km/h and increased by 0.5 km/h at each level reached (each level corresponding to 7/8 routes). The test ends when the participant gives up or fails to reach the 20 m mark before the beep for the second time (consecutive or not), and the number of routes completed (n) is recorded [16]. This test cannot be repeated. In addition, VO2max (mL/kg/min) was estimated using the method proposed by Duarte and Duarte [19].
2.4. Statistical Analysis
Descriptive statistics are presented through measures of central tendency (mean, M) and dispersion (standard deviation, SD).
To assess reliability, the PPT assessment protocol was applied on two separate occasions (test–retest) one week apart, as highlighted in the literature [20,21,22] and adopted in previous studies with this specific population [13]. In addition, (i) a paired t-test was performed between PPT trials to test whether the mean difference (systematic error) differed significantly from zero; (ii) the bivariate correlations between the difference (T2 − T1) and the mean [(T2 + T1)/2] were used to identify proportional bias; (iii) Cohen’s d effect size was calculated [23]; (iv) the intraclass correlation coefficient, ICC(2,1), and its 95% confidence interval (CI) were used [24] to analyze test–retest reliability according to Liljequist et al. [25]; and (v) Bland and Altman plots [26] were presented.
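The test–retest statistics listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the JASP procedure used in the study. It assumes the Shrout and Fleiss two-way random-effects, absolute-agreement, single-measurement form of the ICC for two sessions; the function names and the example times are hypothetical, and confidence intervals and p-values are omitted.

```python
import math
from statistics import mean, stdev


def icc_2_1(t1, t2):
    """ICC(2,1) for two sessions, from the two-way ANOVA mean squares."""
    n, k = len(t1), 2
    values = list(t1) + list(t2)
    grand = mean(values)
    row_means = [(a + b) / 2 for a, b in zip(t1, t2)]   # one mean per participant
    col_means = [mean(t1), mean(t2)]                    # one mean per session
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def paired_t(t1, t2):
    """t statistic testing whether the mean difference (T2 - T1) is zero."""
    diffs = [b - a for a, b in zip(t1, t2)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def bland_altman_limits(t1, t2):
    """Bias and 95% limits of agreement (bias +/- 1.96 SD of the differences)."""
    diffs = [b - a for a, b in zip(t1, t2)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def proportional_bias_r(t1, t2):
    """Pearson r between the difference (T2 - T1) and the mean (T1 + T2)/2."""
    d = [b - a for a, b in zip(t1, t2)]
    m = [(a + b) / 2 for a, b in zip(t1, t2)]
    md, mm = mean(d), mean(m)
    sxy = sum((x - md) * (y - mm) for x, y in zip(d, m))
    sxx = sum((x - md) ** 2 for x in d)
    syy = sum((y - mm) ** 2 for y in m)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up times in seconds, for illustration only (not study data).
    test = [3.4, 3.6, 3.5, 3.9, 3.2]
    retest = [3.5, 3.6, 3.4, 4.0, 3.3]
    print(icc_2_1(test, retest), paired_t(test, retest))
    print(bland_altman_limits(test, retest), proportional_bias_r(test, retest))
```

A correlation near zero in `proportional_bias_r` would indicate that the size of the test–retest difference does not depend on the participant's average time.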
In the second part of the study, the average value of the two PPT assessments (T1 and T2) was considered, i.e., (T1 + T2)/2. Both graphical (histograms, box plots, and Q-Q plots) and analytical methods (Shapiro–Wilk test) were used to test the normal distribution of data, and no significant deviations from normality were observed. Accordingly, (i) the significance of the differences between female and male cadets was assessed using Student’s t-test for independent samples; (ii) Pearson’s correlation coefficient was used to measure the intensity and direction of the association between performance in PPT and fitness tests (separately for each sex); and (iii) multiple linear regression (stepwise method) was used to obtain a parsimonious model that would allow predicting performance in each PPT under study based on fitness attributes.
A type I error probability (alpha) of 0.05 was considered for all the analyses, and all the statistical analyses were performed using the JASP software (JASP 0.18.3 (Apple Silicon), University of Amsterdam, Amsterdam, The Netherlands) [27].
3. Results
The reliability study of PPT showed that the average execution time in: (i) fence jump was 3.55 s in the first assessment (test) and 3.59 s in the second assessment (retest), translating into an average difference of 0.05 s (ICC = 0.88); (ii) victim drag was 5.72 s in the first assessment and 5.65 s in the retest, with the difference between assessments being −0.06 s (ICC = 0.92); and (iii) arrest suspect was 39.60 s in the first assessment, and 38.40 s in the second assessment, with the average difference being −1.21 s (ICC = 0.81).
Regarding the standard error of the mean (SE), all PPT showed values below 1 s, with the arrest suspect task recording the highest value (0.41 s) and the fence jump task the lowest (0.04 s). The results are presented in Table 3.
The Bland–Altman analysis assesses the agreement between the two assessments (T1 and T2) for the three PPT. Thus, (i) regarding obstacle crossing (fence jump), the 95% limits of agreement are relatively close, indicating low variability between assessments (given that performance on this task is relatively stable between T1 and T2, it is possible to conclude that there is a consistent relationship between the assessments); (ii) regarding victim transportation (victim drag), the graphs show a greater dispersion in the differences, with greater instability visible (there are even some records that exceed the 95% limits of agreement, suggesting greater variability between the T1 and T2 assessment times for this task); and (iii) in the task involving arresting a suspect, given the dispersion of the scores, greater variability is observed between T1 and T2, making it the task with the widest limits of agreement.
The Bland–Altman plots are shown in Figure 3.
Regarding the performance of the male and female cadets in PPT and fitness tests (Table 4), it was observed that male cadets (i) were significantly faster (30 m sprint), more agile, and stronger, and had superior aerobic capacity (all p < 0.001); and (ii) were significantly faster in the fence jump and victim drag tasks (p < 0.001).
The study of the association between performance in PPT and fitness attributes in female cadets showed significant correlations between (i) fence jump and performance in the 30 m sprint (r = 0.856, p < 0.001), agility (slalom, r = 0.752, p < 0.01), horizontal jump (r = −0.716, p < 0.01), and push-ups (r = −0.787, p < 0.001); (ii) victim drag and performance in sit-ups (r = −0.740, p < 0.01) and handgrip (r = −0.565, p < 0.05); and (iii) arrest suspect and performance in agility (slalom, r = 0.593, p < 0.05) and horizontal jump (r = −0.670, p < 0.01). In male cadets, significant correlations were observed between (i) fence jump and performance in the 30 m sprint (r = 0.440, p < 0.001), agility (slalom, r = 0.393, p < 0.01), horizontal jump (r = −0.402, p < 0.01), pull-ups (r = −0.315, p < 0.05), and 20 m shuttle run test (r = −0.294, p < 0.05); and (ii) victim drag and performance in the handgrip (r = −0.327, p < 0.05). In male cadets, no significant correlations were observed between the arrest suspect task and performance in the studied fitness tests. The results are presented in Table 5.
In the last analysis, the multiple linear regression revealed that (i) in female cadets, 30 m sprint (beta = 0.856; t = 5.741; p < 0.001) was a significant predictor of performance in the fence jump task (F(1,12) = 32.958; p < 0.001; adjusted R² = 0.711), sit-ups (beta = −0.740; t = −3.811; p = 0.002) was a significant predictor of performance in the victim drag task (F(1,12) = 14.522; p = 0.002; adjusted R² = 0.510), and horizontal jump (beta = −0.670; t = −3.125; p = 0.009) was a significant predictor of performance in the arrest suspect task (F(1,12) = 9.768; p = 0.009; adjusted R² = 0.403); and (ii) in male cadets, performance in horizontal jump (beta = −0.418; t = −3.662; p < 0.001) and in the 20 m shuttle run (beta = −0.257; t = −2.248; p = 0.028) were significant predictors of performance in the fence jump task (F(1,57) = 5.053; p = 0.028; adjusted R² = 0.238), and handgrip (beta = −0.327; t = −2.677; p = 0.010) was a significant predictor of performance in the victim drag task (F(1,60) = 7.165; p = 0.010; adjusted R² = 0.092). The results are presented in Table 6 and Table 7.
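For the models with a single predictor, the reported statistics are tied together by standard identities: the standardized beta equals Pearson's r, t = r·sqrt(n − 2)/sqrt(1 − r²), F = t², and adjusted R² = 1 − (1 − r²)(n − 1)/(n − 2). The sketch below applies these identities to the rounded coefficients reported above; it is a consistency check under those assumptions, not a re-analysis of the data, and the function name is hypothetical.

```python
import math


def single_predictor_stats(r: float, n: int) -> dict:
    """Regression statistics implied by Pearson's r in a one-predictor model."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    r2 = r ** 2
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - 2)
    return {"beta": r, "t": t, "F": t ** 2, "R2_adj": r2_adj}


if __name__ == "__main__":
    # Rounded r values and sample sizes as reported in the text.
    models = {
        "female, fence jump ~ 30 m sprint": (0.856, 14),
        "female, victim drag ~ sit-ups": (-0.740, 14),
        "female, arrest suspect ~ horizontal jump": (-0.670, 14),
        "male, victim drag ~ handgrip": (-0.327, 62),
    }
    for label, (r, n) in models.items():
        stats = single_predictor_stats(r, n)
        print(label, {key: round(value, 3) for key, value in stats.items()})
```

Because r is only available to three decimals, the recomputed values agree with the reported t, F, and adjusted R² to about two decimals rather than exactly. The male fence jump model has two predictors, so these identities do not apply to it.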
4. Discussion
In this study, three of the main PPTs performed by police officers in exercising their duties were considered. A literature review was conducted to reduce the extensive list of PPTs to just three, considering key factors such as frequency and relevance for the police mission [
3], and PPTs that could be performed inside a Police Academy sports hall by a large group of participants. Thus, the selected PPTs are the following: (i) fence jump; (ii) victim drag; and (iii) arrest suspect [
4].
The study of these three PPTs showed that (considering the two evaluation moments, i.e., T1 and T2): (i) victim drag and arrest suspect tasks presented better performance values in the retest (T2); (ii) fence jump presented better results in the test (T1); (iii) the largest difference (T2 − T1) was recorded in the arrest suspect task, while the smallest was observed in the fence jump task; and (iv) the three PPTs presented standard error values < 1 s, suggesting their viability and reliability [28].
Regarding the reference values for the intraclass correlation coefficient (ICC), which allow us to demonstrate the reliability of a study, we found that the literature diverges on this point [
22]. However, according to Cormack et al. [
29], the reliability of a protocol is considered acceptable when it presents an ICC greater than 0.85.
In the current research, only the victim drag task presents an ICC above 0.90. In the remaining two PPTs, the ICC is 0.88 and 0.81 for fence jump and arrest suspect, respectively. In other words, all ICC values are above 0.75, which ensures, at the very least, good reliability [25]. However, it is essential to highlight that, given the dispersion of scores, the task involving arresting a suspect shows greater variability between T1 and T2 (ICC = 0.81, below the 0.85 threshold cited in Cormack et al. [29]), making it the task with the widest limits of agreement. This observation suggests that performance on this task varies considerably and may be influenced by several factors, including technique, strength level, preparation, and motivation to perform the task. Therefore, further investigation of this task is required to identify the covariates behind this observation.
On average, the ICC of the PPTs analyzed is 0.87, which represents acceptable reliability according to Cormack et al. [29]. According to Liljequist et al. [25], the protocol can be classified as having good reliability, since this value falls between 0.75 and 0.89. Because the value is close to the upper limit of this band, just short of excellent reliability (ICC between 0.90 and 1.00), we can infer that the protocol in question has high reliability.
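The arithmetic behind this classification can be reproduced with a short sketch. The thresholds are the ones quoted in the text (0.85 for acceptability, 0.75 to 0.89 for good, 0.90 to 1.00 for excellent); the label used for values below 0.75 and the function name are assumptions made for illustration only.

```python
ICC_BY_TASK = {"fence jump": 0.88, "victim drag": 0.92, "arrest suspect": 0.81}


def reliability_band(icc: float) -> str:
    """Band labels as quoted in the text; 'below good' is a placeholder label."""
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    return "below good"


def meets_acceptability_threshold(icc: float, threshold: float = 0.85) -> bool:
    """True when the ICC exceeds the 0.85 acceptability threshold cited in the text."""
    return icc > threshold


mean_icc = sum(ICC_BY_TASK.values()) / len(ICC_BY_TASK)

if __name__ == "__main__":
    for task, icc in ICC_BY_TASK.items():
        print(task, reliability_band(icc), meets_acceptability_threshold(icc))
    print("mean ICC:", round(mean_icc, 2), reliability_band(mean_icc))
```

Averaging the three coefficients gives 0.87, which falls in the good band and above the 0.85 threshold, while the arrest suspect task on its own falls below that threshold.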
Nevertheless, it is essential to understand the specificities of police activity, which vary depending on the context and the tasks that naturally underlie them [
8]. From this perspective, Anderson et al. [
30] showed that the predominantly sedentary nature of police activity can lead police officers to not give due importance to fitness training, given that 80 to 90% of a police officer’s work involves physical activity considered limited or not very demanding from a physical point of view. However, Bissett et al. [
31] warn that, although activities that require physical skills may be less frequent, their importance is often considered critical in achieving the success of the police mission. Accordingly, adequate preparation to deal effectively with the different situations of everyday life seems entirely justified, given the unpredictability, complexity, and physical demands of police work [
6]. The interplay between sedentary tasks and high-intensity physical demands presents officers with unique health and performance challenges. This contrast can significantly strain the cardiovascular system (increased risk factors for cardiovascular disease) [
32,
33]. It can lead to fatigue (decreased physical fitness), reduced work capacity, and impact performance.
Regarding the performance in PPT, no comparisons were made, since this is a pioneering study on the subject in the national context, and there are no PPTs carried out in the international literature in the same format, which is why this topic assumes scientific and operational relevance.
Furthermore, this study showed that male cadets performed better than female cadets in all studied PPT (significantly so in the fence jump and victim drag tasks) and were also faster over 30 m, more agile, stronger, and superior in aerobic capacity, i.e., male cadets performed better in all variables of the fitness tests, which corroborates the results of Massuça et al. [14].
Male cadets tend to have greater skeletal muscle mass, a higher proportion of fast-twitch (type II) muscle fibers [34], and higher VO2max (maximal oxygen uptake) [35] compared to females. This physiological advantage may explain the observed differences, but it is essential to emphasize that factors such as training, nutrition, lifestyle, and individual variation can also influence individual performance levels. Considering the observed results, adjusting fitness benchmarks for females might be necessary. This highlights the current paradigm, i.e., should police fitness standards be gender-neutral [36,37,38] or adjusted [39]? The debate is ongoing (see Lockie et al. [40]). While gender-neutral standards promote equality and may improve overall fitness, adjusted benchmarks may be necessary to account for physiological differences and ensure police officers have the physical capabilities required for specific tasks. The key is to establish standards that are validated, relevant to the job, and applied fairly to all individuals.
Comparing the fitness profile of the participants in this study with those of previous studies with Portuguese police officers and cadets, it is highlighted that female cadets showed (i) in horizontal jump, a value higher (+0.03 m) than that found by Freitas et al. [
13], and slightly lower (−0.02 m) than that found by Massuça et al. [
14]; (ii) in handgrip strength (left + right), a value higher (+1.94 kg and +1.59 kg) than those observed by Massuça et al. [
14] and Freitas et al. [
13], respectively; (iii) in 60 s sit-ups, a superior performance (+2.41 repetitions) to that observed by Freitas et al. [
13]; and (iv) in 20 m shuttle run, an inferior performance (−3.35 shuttles) to that observed by Massuça et al. [
14], and a superior performance (+3.17 shuttles) to that observed by Freitas et al. [
13]. Concerning male cadets, the following were observed: (i) in horizontal jump, an inferior performance (−0.05 m) to that observed by Massuça et al. [
14], and a superior performance (+0.03 m) to that observed by Freitas et al. [
13]; (ii) in handgrip strength (left + right), a substantially inferior performance (−6.23 kg) to that observed by Massuça et al. [
14], and a superior performance (+5.15 kg) to that observed by Freitas et al. [
13]; (iii) in 60 s sit-ups, a superior performance (+7.29 repetitions) to that observed by Freitas et al. [
13]; and (iv) in 20 m shuttle run, an inferior performance (−4.65 shuttles) to that observed by Massuça et al. [
14], and a superior performance (+8.19 shuttles) to that observed by Freitas et al. [
13]. In sum, the present study revealed superior fitness performances (in horizontal jump, handgrip strength, sit-ups, and 20 m shuttle run) compared to the recent research carried out by Freitas et al. [
13].
Complementarily, it was observed that female cadets showed a significant correlation between their performance in (i) the fence jump task and 30 m sprint, agility, horizontal jump, and push-ups; (ii) the victim drag task and sit-up and handgrip strength; and (iii) the arrest suspect task and agility and horizontal jump. Also, in male cadets, significant correlations were observed between their performance in (i) the fence jump task and 30 m sprint, agility, horizontal jump, pull-ups, and 20 m shuttle run; and (ii) the victim drag task and handgrip strength. Nevertheless, in the arrest suspect task, no significant correlations are evident with performance in the fitness tests.
Our results showed that some fitness tests (e.g., upper-body strength) did not correlate with tasks like victim drag. Studies have shown that upper-body strength tests, like push-ups, sit-ups, and pull-ups, do not strongly correlate with the time it takes to perform a victim drag [
41,
42]. This means that someone who scores well on these tests might not necessarily be better at dragging a simulated victim, i.e., this implies that other factors, possibly lower-body strength or power, might be more influential in determining success in such tasks [
41].
Research also suggests that while fitness is essential for overall police performance, it does not directly correlate significantly with the outcome of arrest situations [
43] due to the multifaceted nature of the task (technique, situational awareness, and decision-making under pressure). This emphasizes that the context of the arrest scenario is crucial, highlighting the study by Henze et al. [
44] who revealed a positive correlation between fitness attributes and performance in a use-of-force and arrest simulation test, suggesting that (i) when the arrest simulation involves physical confrontation and use of force, physical fitness plays a more significant role, and (ii) when the arrest is more about control and restraint, other factors may become more important.
Since an association between performance in the PPT under analysis and fitness tests has been established, it seems relevant to identify which fitness attributes best explain performance in the studied PPT.
The statistical models estimated for females explain a reasonable percentage of performance in PPT, with an average of 54.1%. In detail, (i) the highest value was found in the fence jump task (71.1%); (ii) in the victim drag task, the value was 51%; and (iii) in the arrest suspect task, the lowest value was observed (40.3%). Complementarily, this study identified the fitness attributes that best explain female cadets’ performance in PPT, i.e., (i) in the fence jump task, the 30 m sprint; (ii) in the victim drag task, the 60 s sit-ups; and (iii) in the arrest suspect task, the horizontal jump. As far as male cadets are concerned, the models explain a very low percentage of performance in PPT, with values ranging from 9.2% to 23.8%. This evidently poor predictive utility suggests missing covariates (e.g., technique or anthropometrics), so future studies should invest in identifying these attributes.
In the male models, the most relevant fitness attribute in the fence jump task was the horizontal jump (followed by the 20 m shuttle run), and in the victim drag task it was handgrip strength. In the arrest suspect task, however, no attribute stood out.
In the comparison between the female and male cadets, (i) the largest difference in explained performance was observed in the fence jump model, at 47.3 percentage points (71.1% for females and 23.8% for males); (ii) the smallest difference, 41.8 percentage points, was observed in the victim drag model (51% for females and 9.2% for males); and (iii) no comparison was made in the arrest suspect task, since a value (40.3%) was recorded only for the female cadets’ model. In sum, upper limb strength (push-ups in females and pull-ups in males) and agility do not explain performance in the studied police physical tasks. However, it seems essential to highlight that (i) in female cadets, speed (30 m sprint) seems to be the most relevant attribute, followed by horizontal jump and sit-ups; and (ii) in male cadets, horizontal jump seems to be the most decisive attribute, followed by handgrip strength and the 20 m shuttle run.
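The arithmetic behind the reported average and the female-male differences can be checked with a minimal sketch such as the one below. It uses only the percentages stated in the text; the variable and function names (explained, mean_explained, gap_in_points) are illustrative assumptions and are not part of the study's analysis.

```python
# Percentage of PPT performance explained by each model, as reported in the text.
# None marks the male arrest suspect model, for which no value is reported.
# All names here are hypothetical and used for illustration only.
explained = {
    "fence jump": {"female": 71.1, "male": 23.8},
    "victim drag": {"female": 51.0, "male": 9.2},
    "arrest suspect": {"female": 40.3, "male": None},
}


def mean_explained(group):
    """Average explained percentage over the tasks with a reported value."""
    values = [task[group] for task in explained.values() if task[group] is not None]
    return round(sum(values) / len(values), 1)


def gap_in_points(task):
    """Female minus male difference in percentage points, or None if not comparable."""
    female, male = explained[task]["female"], explained[task]["male"]
    if female is None or male is None:
        return None
    return round(female - male, 1)


female_mean = mean_explained("female")
gaps = {task: gap_in_points(task) for task in explained}

print(f"Female models, mean explained: {female_mean}%")
for task, gap in gaps.items():
    label = "not comparable" if gap is None else f"{gap} percentage points"
    print(f"{task}: {label}")
```

The differences are expressed in percentage points, since they are subtractions of two percentages and not relative changes.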
Despite the relevance and contributions of this research, it is essential to acknowledge some limitations that may have influenced the results and should be considered in their interpretation. Firstly, the fact that the sample was exclusively composed of cadets attending the Police Officer Training Course (at the Higher Institute of Police Sciences and Internal Security, Lisbon, Portugal) may, to a certain extent, limit the generalization of the results to a broader universe of active police officers, whose age, professional experience, and fitness profile may vary significantly. Secondly, the sample’s composition showed an imbalance between genders, with a lower representation of female participants. Thirdly, the use of a dummy to simulate a “human victim” represents a limitation in terms of ecological validity, in the sense that it does not faithfully reproduce some aspects that should be considered, such as the distribution of body weight, passive resistance, or the physical interaction that a real victim would imply. Furthermore, regarding the dummy used in the study (i.e., 80 kg), it is essential to note that this is the reference weight adopted by other national tactical institutions (e.g., firefighters). However, we also believe that the weights currently used in training and evaluation dummies should be reevaluated, considering the increase in the average body mass of the population [
42,
45,
46]. Lastly, assessing the fitness level after the PPT assessment may have impacted the results (however, all academic semesters must end with the physical fitness assessment).
Accordingly, future research should extend the study to active police officers, outside the training context, with different levels of seniority and professional experience, to verify the applicability and reliability of the studied PPT in authentic contexts and at various stages of the police career. It is also recommended to include more balanced samples in terms of gender distribution, which will allow for a more robust analysis of gender differences and promote a more inclusive and equitable approach in developing physical tests adapted to functional demands. Finally, it would be pertinent to compare the results obtained with those of other police forces, to assess whether the physical needs and task performance are consistent across different operational contexts.
To summarize, this study showed that (i) cadets’ performance in the PPT assessment presents high reliability (ICC = 0.87) regarding reproducibility, indicating that it is a reliable protocol to be applied to police officers; (ii) there is a tendency for the best performances to be observed in the retest (T2); (iii) male cadets presented better results in PPT (in all assessment moments) than female cadets; (iv) male cadets were significantly faster, more agile, stronger, and superior in aerobic capacity than female cadets; (v) male cadets were significantly faster in the fence jump and victim drag tasks than female cadets; (vi) for female cadets, performance in the 30 m sprint, sit-ups, and horizontal jump predicted performance in the fence jump, victim drag, and arrest suspect tasks, respectively; and (vii) for male cadets, performance in the horizontal jump and the 20 m shuttle run predicted performance in the fence jump task, while handgrip strength was a significant predictor in the victim drag task. In short, among the attributes predicting performance in the PPT under study, the explosive strength of the lower limbs constitutes a relevant attribute for the cadets’ performance, regardless of gender. Overall, this study attests to the relevance of implementing the PPT assessment as a complement to the current fitness assessment, allowing a more reliable assessment of operational demands and contributing to greater efficiency, motivation, and readiness in police work.