Operational Cycle Detection for Mobile Mining Equipment: An Integrative Scoping Review with Narrative Synthesis
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper is devoted to a review of methods for detecting operating cycles of mobile mining equipment, including diesel and battery-powered machines. The authors systematize studies from the last twenty-five years using the PRISMA-ScR protocol, identify 20 relevant works, and compare them with similar studies in related fields. It is shown that accurate determination of machine operating modes is key to predictive maintenance, performance optimization, and energy management, especially in the context of the industry's transition to battery-powered dump trucks and loaders. The novelty of the study lies in creating a methodological map of existing approaches and assessing the transferability of solutions developed for electric vehicles to the mining context. The work identifies gaps: the lack of open datasets, disparate evaluation criteria, and limited examples of real-time validation. The significance of the study is that it forms a basis for developing reliable cycle recognition algorithms, which is critical for the implementation of next-generation electric mining machines. The article needs some revision before publication:
1. When describing cycle classification methods, the authors consistently group them into families (threshold methods, machine learning, deep networks, semi- and unsupervised approaches). This structure is convenient, but it is not always clear how they compare the efficiency of different families with each other. The authors list accuracies and F1 metrics, but a real comparison with identifying the strengths and weaknesses of different approaches applicable to the same problem is clearly lacking.
2. The authors highlight the problem of the lack of open datasets as the main obstacle to progress. This is a fair remark, but the analysis is limited to a statement of fact. It would be useful to consider which data formats and exchange protocols could be standardized. A comparison with the automotive industry, where there are large open datasets, is made, but an analysis of which specific elements of these practices can be transferred to mining is not carried out.
3. The review introduces the concept of "Transferability Lens" to assess the transferability of methods. This is an interesting and innovative tool. However, its application turned out to be more illustrative than rigorous. The authors describe three axes (transfer distance, data offset, context), but do not offer quantitative metrics or clear criteria. Therefore, comparing different methods through this lens seems subjective.
4. The review describes the data preprocessing stages (smoothing, filtering, feature selection) quite thoroughly. However, the authors should have made a more detailed comparison of which preprocessing techniques critically affect the accuracy of the algorithms. Now we can see a list of methods, but there is no analysis of where exactly the gain from preprocessing is greater: in classical or in deep models. Such a comparison would have allowed us to better understand what is more important - data or a model.
5. The authors point out the limitations of real-time validation, which is true and important. However, they do not analyze which methods are closest to practical implementation, and why. Some algorithms are declared as potentially suitable, but without a comparison by the criteria of processing delay, energy consumption and reliability, it is difficult to judge which directions can actually be implemented. Here, the review remains theoretical.
6. The section on non-mining EVs is interestingly written, but it feels like the authors used it more as a gap filler. The comparison with mining equipment seems partial: slopes or load are taken into account, but important differences are not analyzed, such as the impact of dust, vibrations, and limited underground space. Therefore, the comparison is rather superficial and does not provide a full understanding of the real transferability of the methods.
7. The article identifies gaps in the data on LHDs (load-haul-dump trucks). The authors honestly admit that most of the studies concern dump trucks. However, they did not analyze which methods are worst transferred to LHDs and why. A comparison at this level would have made the review stronger: to show what exactly hinders the algorithms from working on other equipment - noise, dataset sizes, or different operating profiles.
8. A strong point of the review is the attention to the physical features of BEVs - for example, regenerative braking. The authors rightly point out that this distorts the cycle signatures. However, it is not shown how different methods cope with this. Are there algorithms that are better at distinguishing such effects? The comparison is made more at the level of a statement than at the level of comparing practical efficiency.
9. The authors touch on the topic of explainable AI and note its absence. This is an important observation, but it is presented as a short remark. It would be useful to compare which approaches are generally more interpretable (e.g. threshold rules vs. deep networks). Such a comparison would add value, since transparency of algorithms is critical for industry.
10. The conclusion of the article correctly emphasizes the importance of developing reliable algorithms for BEVs. However, the authors do not compare the findings of the review with the practical needs of mining companies. It would be useful to compare which methods are better suited for predictive maintenance, which for energy optimization, and which for shift planning. Such a comparison by tasks would make the review much more valuable for practitioners.
Author Response
Comments 1: "When describing cycle classification methods, the authors consistently group them into families (threshold methods, machine learning, deep networks, semi- and unsupervised approaches). This structure is convenient, but it is not always clear how they compare the efficiency of different families with each other. The authors list accuracies and F1 metrics, but a real comparison with identifying the strengths and weaknesses of different approaches applicable to the same problem is clearly lacking."
Response 1: We partially agree with the suggested edit. While a comparison of different algorithm families' strengths and weaknesses is useful, a quantitative ranking of such families is only possible if they are applied to the same dataset (same mine site, logged at the same time interval). Specifically, the subsection "Cross-family strengths and weaknesses" synthesizes efficiency trade-offs and typical failure modes across Rules/Thresholds, HMM/HSMM, Change-point, Clustering, Trees/RF, and CNN/LSTM. To make these contrasts explicit, Table 4 lists each family's strengths, weaknesses, best-fit signal/label conditions, and common errors, while Table 5 aligns the same families to downstream operational tasks (predictive maintenance, energy optimization, shift KPIs). This builds directly on the bridge established in §2.5 (sensor bandwidth and label density mapped to model families) and is reinforced by empirical contrasts in §3.2.4 (diesel trends and limits) and §3.2.5 (truck-to-LHD transfer). Together, these additions ensure that the efficiency and applicability of method families are compared head-to-head rather than only through reported metrics.
---
Comments 2: "The authors highlight the problem of the lack of open datasets as the main obstacle to progress. This is a fair remark, but the analysis is limited to a statement of fact. It would be useful to consider which data formats and exchange protocols could be standardized. A comparison with the automotive industry, where there are large open datasets, is made, but an analysis of which specific elements of these practices can be transferred to mining is not carried out."
Response 2: We agree that family-level contrasts are most useful when framed by their operating conditions. We added a consolidated analysis in §4.2.1 and two supporting tables. Table 4 lists each family’s strengths, weaknesses, best-fit signal/label conditions, and typical failure modes; §2.5 (final paragraph) provides the upstream bridge from signal bandwidth and label density to model families; and §3.2.1–3.2.5 supply empirical contrasts (e.g., CNN vs rules on loaders vs trucks). We then map families to downstream operational tasks in §4.2.4 with Table 5. Together, these sections compare methods head-to-head by fit and failure mode rather than only reported metrics.
---
Comments 3: "The review introduces the concept of "Transferability Lens" to assess the transferability of methods. This is an interesting and innovative tool. However, its application turned out to be more illustrative than rigorous. The authors describe three axes (transfer distance, data offset, context), but do not offer quantitative metrics or clear criteria. Therefore, comparing different methods through this lens seems subjective."
Response 3: We agree with this comment; the Transferability Lens is too subjective to be applied systematically to the selected papers, and would be difficult to use either on other similar articles, or if attempts at replication are made. Therefore, the Lens, and its application to certain automotive BEV operational cycle detection articles, were completely removed from the article.
---
Comments 4: "The review describes the data preprocessing stages (smoothing, filtering, feature selection) quite thoroughly. However, the authors should have made a more detailed comparison of which preprocessing techniques critically affect the accuracy of the algorithms. Now we can see a list of methods, but there is no analysis of where exactly the gain from preprocessing is greater: in classical or in deep models. Such a comparison would have allowed us to better understand what is more important - data or a model."
Response 4: We agree that the question matters; however, we have now clarified why a head-to-head quantification is not methodologically defensible and is out of scope for a scoping review. In "Research Gaps and Limitations", we note that the corpus lacks within-study ablations, that pipelines and hyperparameters are heterogeneous, that sensors, label densities, and metrics vary across studies, and that external validity is thin; any cross-paper "data vs. model" effect size would therefore be confounded. We flag this explicitly as a limitation and, instead of offering speculative rankings, propose a standardized ablation protocol as future work in the same section.
---
Comments 5: "The authors point out the limitations of real-time validation, which is true and important. However, they do not analyze which methods are closest to practical implementation, and why. Some algorithms are declared as potentially suitable, but without a comparison by the criteria of processing delay, energy consumption and reliability, it is difficult to judge which directions can actually be implemented. Here, the review remains theoretical."
Response 5: We agree, and the practical implementation analysis has now been provided. See "Real-time readiness: latency, compute, and robustness", which compares method families by processing delay (sub-cycle <100 ms, intra-cycle 100-500 ms, inter-cycle 0.5-5 s), compute/energy footprint (e.g., trees/RF ≈3×10⁻⁵ s per sample), and reliability practices (debouncing/short-segment suppression, explicit uncertain states, and resilient-channel fallbacks). The section identifies which families routinely meet sub-cycle budgets on commodity CPU, where high-rate pipelines shift energy/compute costs to feature extraction, and when compression or tiny models are required.
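As a purely illustrative aside (not part of the manuscript or the reviewed corpus), the kind of per-sample latency check implied above can be sketched as follows; the feature count, forest size, synthetic data, and the 100 ms sub-cycle budget are all assumptions:

```python
# Illustrative sketch: timing per-sample inference of a small random forest
# against an assumed sub-cycle latency budget. Model size, feature dimensions,
# and the budget are placeholders, not values from the reviewed studies.
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 12))       # e.g., 12 engineered CAN/IMU features
y = rng.integers(0, 4, size=5000)     # e.g., 4 cycle phases

clf = RandomForestClassifier(n_estimators=100, max_depth=8, random_state=0)
clf.fit(X, y)

# Time single-sample predictions, as an edge controller would issue them.
n_trials = 1000
start = time.perf_counter()
for i in range(n_trials):
    clf.predict(X[i % len(X)].reshape(1, -1))
elapsed = time.perf_counter() - start

per_sample_s = elapsed / n_trials
print(f"mean latency: {per_sample_s * 1e3:.3f} ms per sample")
print("meets sub-cycle budget (<100 ms):", per_sample_s < 0.100)
```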
---
Comments 6: "The section on non-mining EVs is interestingly written, but it feels like the authors used it more as a gap filler. The comparison with mining equipment seems partial: slopes or load are taken into account, but important differences are not analyzed, such as the impact of dust, vibrations, and limited underground space. Therefore, the comparison is rather superficial and does not provide a full understanding of the real transferability of the methods."
Response 6: We respectfully disagree. The manuscript’s scope is a scoping review of duty-cycle classification; experimentally analyzing the effects of dust, vibration, and confined underground geometry is beyond scope and would require controlled, paired datasets that are not present in the corpus. The non-mining EV section is included to summarize transferable ideas while explicitly limiting claims about underground applicability. Moreover, forklifts, loaders, and buses are used in mining fleets, so these exemplars are not purely out-of-domain.
---
Comments 7: "The article identifies gaps in the data on LHDs (load-haul-dump trucks). The authors honestly admit that most of the studies concern dump trucks. However, they did not analyze which methods are worst transferred to LHDs and why. A comparison at this level would have made the review stronger: to show what exactly hinders the algorithms from working on other equipment - noise, dataset sizes, or different operating profiles."
Response 7: We agree, and this comparison has now been provided. See section 3.2.5 "LHD vs. Truck Transferability" for a focused analysis of what degrades on LHDs and why, with concrete contrasts (e.g., CNN 93% on trucks vs. 58% on LHDs; hydraulic smoothing 96% for full/empty but 67% for loading) and causal factors (shorter, more variable phases; digging operations; operator-style imprint; fragile/noisy hydraulic channels; non-hydraulic cues failing to separate loading vs. unloading). Contributing context is detailed in "Diesel Specific Trends and Key Limitations", and the method-family failure modes are synthesized in "Algorithm families in mining cycle detection".
---
Comments 8: "A strong point of the review is the attention to the physical features of BEVs - for example, regenerative braking. The authors rightly point out that this distorts the cycle signatures. However, it is not shown how different methods cope with this. Are there algorithms that are better at distinguishing such effects? The comparison is made more at the level of a statement than at the level of comparing practical efficiency."
Response 8: We partially agree, and the manuscript now makes this practical contrast explicit. See "Unique BEV challenges", where regenerative braking is identified as a source of distorted cycle signatures, and "Mining-specific strengths and weaknesses", where we indicate which families are better suited in regeneration-heavy data and why (e.g., HMM/HSMM for sequence/dwell under bidirectional power flow; change-point for slope/variance shifts; trees/RF on state-aggregated features; simple pedal/vision rules noted as brittle). A full situation-by-situation ranking is beyond the scope of a scoping review, but these sections provide the requested, method-level guidance.
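For illustration only (synthetic data, not from any reviewed study), a toy change-point style cue on a bidirectional power trace with a regeneration segment could look like the sketch below; the power levels, window length, and jump threshold are assumptions:

```python
# Toy sketch of a change-point style cue on a synthetic BEV power trace with a
# regeneration (negative power) segment; all numbers are assumed placeholders.
import numpy as np

rng = np.random.default_rng(2)
power = np.concatenate([
    rng.normal(150, 10, 300),   # hauling up-ramp: high positive traction power
    rng.normal(-60, 10, 150),   # downhill regeneration: power flows to battery
    rng.normal(20, 10, 200),    # loading/idle: low power
])

win = 30
rolling_mean = np.convolve(power, np.ones(win) / win, mode="valid")
# Flag indices where the windowed mean jumps by more than an assumed threshold.
jumps = np.where(np.abs(np.diff(rolling_mean)) > 2.0)[0]
print("candidate change regions (sample indices):", jumps[:5], "...", jumps[-5:])
```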
---
Comments 9: "The authors touch on the topic of explainable AI and note its absence. This is an important observation, but it is presented as a short remark. It would be useful to compare which approaches are generally more interpretable (e.g. threshold rules vs. deep networks). Such a comparison would add value, since transparency of algorithms is critical for industry."
Response 9: We agree, and the comparison has now been provided. See "Explainability across method families", which contrasts transparency across rules/thresholds (directly auditable), HMM/HSMM (inspectable transition and emission matrices), change-point detectors (cost/penalty curves), tree ensembles (feature importances and path traces), and deep sequence models (reported without saliency/SHAP in the corpus, effectively opaque).
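As an illustrative sketch only (synthetic data and hypothetical feature names, not from the corpus), the tree-ensemble artifacts named above (feature importances and a per-prediction path trace) can be produced as follows:

```python
# Illustrative sketch: extracting explainability artifacts from a tree ensemble,
# i.e., global feature importances and the decision path of a single prediction.
# Feature names and the toy labeling rule are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
feature_names = ["speed", "payload", "hydraulic_pressure", "pitch", "motor_power"]
X = rng.normal(size=(2000, len(feature_names)))
y = (X[:, 1] + 0.5 * X[:, 2] > 0).astype(int)   # toy "loaded vs. empty" label

clf = RandomForestClassifier(n_estimators=50, max_depth=5, random_state=1).fit(X, y)

# Global artifact: impurity-based feature importances.
for name, imp in sorted(zip(feature_names, clf.feature_importances_),
                        key=lambda p: -p[1]):
    print(f"{name:20s} {imp:.3f}")

# Local artifact: which nodes one sample traverses in the first tree.
tree = clf.estimators_[0]
node_indicator = tree.decision_path(X[:1])
print("nodes visited by sample 0 in tree 0:", node_indicator.indices.tolist())
```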
---
Comments 10: "The conclusion of the article correctly emphasizes the importance of developing reliable algorithms for BEVs. However, the authors do not compare the findings of the review with the practical needs of mining companies. It would be useful to compare which methods are better suited for predictive maintenance, which for energy optimization, and which for shift planning. Such a comparison by tasks would make the review much more valuable for practitioners."
Response 10: We agree with this comment: without a practical mapping from model families to downstream tasks, the review would remain purely theoretical. We rectify this by specifying, in the subsection "Cross-family strengths and weaknesses", which model families should be used for which task and why (made particularly explicit in the table "Task-oriented method fit across families").
Reviewer 2 Report
Comments and Suggestions for Authors
In this study, the authors present a PRISMA-ScR scoping review (2000–2025) of operational cycle detection for mobile mining equipment. They highlight gaps in public datasets, standardized metrics, and real-time validation, and assess the transferability of BEV methods from automotive to mining. Below are my observations:
- First, please provide a more detailed justification for excluding 1,593 records at a single stage (≈84%). Such a large one-step cull, without a transparent breakdown of reasons, raises concerns about the completeness and precision of the search strategy. Specify exactly which PRISMA phase this occurred in (title/abstract vs. full text), the deduplication procedure, operational screening criteria with borderline examples, the distribution of exclusion reasons (table), inter-rater agreement (e.g., Cohen’s κ), and the full database-specific search strings.
- With respect to Section 3.2, a direct mapping from sensed variables, sensor setups, and sampling regimes to model families would strengthen the argument. Because label availability and signal characteristics are decisive in selecting supervised vs. unsupervised paradigms, the current treatment feels disconnected from the downstream modeling workflow. Consider adding a bridging paragraph/table aligning sensor types, target labels, and corresponding algorithms.
- Section 4 offers a valuable perspective on BEV operational cycle detection in mining. Given the small evidence base, please broaden sources beyond mining. Include automotive BEV/HEV drive-cycle segmentation, heavy-duty fleets, off-highway machinery, and public datasets/standardized cycles (e.g., WLTP). Add explicit search strings (“drive-cycle segmentation,” “operational state detection,” “change-point detection,” “HMM/HSMM,” “self-supervised time series”) and map transferable techniques to mining.
Author Response
Comments 1: "First, please provide a more detailed justification for excluding 1,593 records at a single stage (≈84%). Such a large one-step cull, without a transparent breakdown of reasons, raises concerns about the completeness and precision of the search strategy. Specify exactly which PRISMA phase this occurred in (title/abstract vs. full text), the deduplication procedure, operational screening criteria with borderline examples, the distribution of exclusion reasons (table), inter-rater agreement (e.g., Cohen’s κ), and the full database-specific search strings."
Response 1: Thank you for this suggestion. We have expanded the Methods and Results and added supplementary tables to make screening transparency explicit. Specifically, we now state that the one-step exclusion occurred at the Title/Abstract phase, accounting for 1,593 of 1,757 screened records (90.7%), and this is cross-referenced in the Results where it is noted that 1,593 records were excluded and 164 proceeded to full-text review. We also clarified that Rayyan’s built-in de-duplication was used followed by manual DOI/title/year checks, resulting in 57 duplicates removed before screening. The eligibility criteria remain, but we added borderline examples to illustrate operational decisions, such as excluding energy-prediction papers without segmentation, stationary machinery, or simulation-only studies. To further increase transparency, we included the distribution of exclusion reasons in two new tables: Table S1 (Title/Abstract) and Table S2 (Full Text) in Appendix G.2, with counts and percentages (e.g., 1,435/1,593, 90.1% off-topic at Title/Abstract). Because screening was conducted by a single reviewer, Cohen’s κ is not applicable, which we clarified explicitly in both the Methods and under limitations. Finally, the verbatim Lens query has been provided in Supplement S1 for full reproducibility, ensuring alignment with PRISMA-ScR reporting expectations.
Comments 2: "With respect to Section 3.2, a direct mapping from sensed variables, sensor setups, and sampling regimes to model families would strengthen the argument. Because label availability and signal characteristics are decisive in selecting supervised vs. unsupervised paradigms, the current treatment feels disconnected from the downstream modelling workflow. Consider adding a bridging paragraph/table aligning sensor types, target labels, and corresponding algorithms."
Response 2: We agree with this comment. A bridging paragraph has already been added to strengthen the argument. Specifically, in §2.5 Data Charting Process (final paragraph), the manuscript states that model choice follows (i) available signals/rates and (ii) label density, and it maps low-rate CAN/GPS with sparse labels to Rules/HMM-HSMM/Clustering; the same streams with dense labels to CNN/LSTM/BiLSTM; high-rate hydraulics/IMU without labels to Change-point/Probabilistic clustering; high-rate channels with expert labels to CNN; and BEV power/SoC (1–10 Hz) to sequence models or hybrids. This mapping is further reinforced in "Sensors, Data Processing and Feature Engineering", which reiterates that high-rate sensors without labels align with change-point/VBGMM approaches, those with expert labels support CNNs on windows, and edge-latency constraints motivate autoencoders or trees/RF as lighter alternatives.
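Purely as an illustration (this is not an API or artifact from the manuscript), the §2.5 bridge can be read as a simple lookup from signal regime and label density to candidate model families; the key names below are paraphrases chosen for this sketch:

```python
# Minimal sketch restating the Section 2.5 bridge as a lookup table.
# The regime/label keys are paraphrases of the manuscript's mapping, not an API.
SIGNAL_LABEL_TO_FAMILIES = {
    ("low_rate_can_gps", "sparse_labels"): ["rules/thresholds", "HMM/HSMM", "clustering"],
    ("low_rate_can_gps", "dense_labels"):  ["CNN", "LSTM", "BiLSTM"],
    ("high_rate_hydraulics_imu", "no_labels"): ["change-point", "probabilistic clustering"],
    ("high_rate_hydraulics_imu", "expert_labels"): ["CNN (windowed)"],
    ("bev_power_soc_1_10hz", "any"): ["sequence models", "hybrids"],
}

def candidate_families(signal_regime: str, label_density: str) -> list[str]:
    """Return the model families suggested for a signal/label combination."""
    return SIGNAL_LABEL_TO_FAMILIES.get((signal_regime, label_density), [])

print(candidate_families("high_rate_hydraulics_imu", "no_labels"))
```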
Comments 3: "The section 4 offers a valuable perspective on BEV operational cycle detection in mining. Given the small evidence base, please broaden sources beyond mining. Include automotive BEV/HEV drive-cycle segmentation, heavy-duty fleets, off-highway machinery, and public datasets/standardized cycles (e.g., WLTP). Add explicit search strings (“drive-cycle segmentation,” “operational state detection,” “change-point detection,” “HMM/HSMM,” “self-supervised time series”) and map transferable techniques to mining."
Response 3: While we appreciate the suggestion to broaden Section 4, we respectfully decline it on the grounds that including more non-mining BEV cycle detection studies would fall outside the scope of this work. The aim of this scoping review is to map the cycle detection techniques, methodologies, and models used specifically for mining vehicles. The three non-mining BEV operational cycle detection papers are included to enhance the reader's understanding, not to support definitive conclusions.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Overall, the authors have substantively and specifically revised the manuscript to address my comments: they added the missing analytical sections (real-time readiness, explainability), expanded the practical utility (Tables 4/5 with a "method → task" mapping), clarified the LHD↔truck portability, and removed the controversial "Transferability Lens." This essentially addresses my key concerns. The remaining gaps are primarily engineering specifics (energy profiles at the edge, a minimum open dataset format, a head-to-head mini-benchmark, an applied XAI checklist) and methodological refinements. These can be addressed without significantly restructuring the article and will enhance the practical value of the work.
Real-time power consumption/power profiles
There are inference time estimates and discussion that windowing/buffering is often a bottleneck, but there are no quantitative power consumption profiles on embedded platforms (CPU/GPU/edge-MCU) or "model vs. feature" comparisons for watts/shift hours. This could be strengthened with a mini-review of measurements from related works or our own experience.
The BEV section relies on "related" platforms
The authors' defense is valid for scoping, but it would be useful to ground the BEV conclusions more firmly: a clear table of differences in mine conditions (GNSS shadows, dust/vibrations, narrow workings) and a note indicating which elements from related BEV cases are definitely transferable or not. Currently, this is more of a narrative than a transferability checklist.
Explainability is a good framework, but a "how-to" is missing.
Section 4.2.3 addresses the transparency gradient and the lack of XAI for DL in the corpus. Add a practical mini-guide: a "set of artifacts for acceptance" (transition matrices, cost curves, feature importance, SHAP/grad-CAM) and a screenshot/case study for one deep learning model, even on an open, non-mining dataset.
Author Response
Comments 1: "Real-time power consumption/power profiles
There are inference time estimates and discussion that windowing/buffering is often a bottleneck, but there are no quantitative power consumption profiles on embedded platforms (CPU/GPU/edge-MCU) or "model vs. feature" comparisons for watts/shift hours. This could be strengthened with a mini-review of measurements from related works or our own experience."
Response 1: We agree with this comment. We appended a concise paragraph to the real-time subsection (Sec. 4.2.2) that quantifies edge power envelopes and clarifies feature-vs-inference energy trade-offs, standardizing reporting as J/inference and kWh/shift. We also added Table 5 (“Representative edge platforms & typical power”), which maps model families to MCU/NPU/TPU/GPU targets with typical power ranges.
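For illustration only, the standardized reporting mentioned above reduces to simple arithmetic; the power draw, inference rate, and shift length below are hypothetical placeholders, not measurements from the reviewed studies:

```python
# Illustrative arithmetic only: converting an assumed edge-device power draw and
# inference rate into J/inference and kWh/shift. All values are placeholders.
device_power_w = 2.0        # assumed average draw of an edge module, in watts
inference_rate_hz = 10.0    # assumed classification rate (one decision per 100 ms)
shift_hours = 12.0

energy_per_inference_j = device_power_w / inference_rate_hz
inferences_per_shift = inference_rate_hz * shift_hours * 3600
energy_per_shift_kwh = device_power_w * shift_hours / 1000

print(f"{energy_per_inference_j:.2f} J/inference")
print(f"{inferences_per_shift:.0f} inferences per shift")
print(f"{energy_per_shift_kwh:.3f} kWh per shift")
```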
Comments 2: "The BEV section relies on "related" platforms
The authors' defense is valid for scoping, but it would be useful to ground the BEV conclusions more firmly: a clear table of differences in mine conditions (GNSS shadows, dust/vibrations, narrow workings) and a note indicating which elements from related BEV cases are definitely transferable or not. Currently, this is more of a narrative than a transferability checklist."
Response 2: We agree that clearer grounding of the BEV section is useful, though our focus remains on comparing methods rather than prescribing site-specific transferability. To address this, we added a brief Applicability considerations paragraph at the end of the BEV section (Sec. 3.3). This paragraph explicitly names underground constraints (GNSS denial, dust/vibration, confined headings, steep ramps) and clarifies which elements from adjacent BEV platforms are generally transferable (IMU/CAN segmentation, micro-trip clustering, sequence models, grade/payload features) and which are treated as non-transferable (GNSS-based features; vision methods sensitive to environmental conditions).
Comments 3: "Explainability is a good framework, but a "how-to" is missing.
Section 4.2.3 addresses the transparency gradient and the lack of XAI for DL in the corpus. Add a practical mini-guide: a "set of artifacts for acceptance" (transition matrices, cost curves, feature importance, SHAP/grad-CAM) and a screenshot/case study for one deep learning model, even on an open, non-mining dataset."
Response 3: We respectfully disagree that a single deep learning case study is appropriate here, as this is a review paper rather than an experimental demonstration. However, we agree that explainability should be defined more concretely. To address this, Sec. 4.2.3 now includes a short paragraph and Table 6 outlining, for each model family, the recommended explainability artifacts (e.g., transition matrices, cost curves, feature-importance/SHAP, Grad-CAM/Integrated Gradients) that enable practitioners to interpret and validate model outputs.
