Article
Peer-Review Record

Automating Spatial Visualisation of Handwritten Vector Equations Using Large Vision Models in Pre-Tertiary Mathematics

Multimodal Technol. Interact. 2026, 10(6), 68; https://doi.org/10.3390/mti10060068 (registering DOI)
by Kenneth Y. T. Lim 1,*, Nguyen Thanh Minh Le 2 and Sopheap Chanoudam 2
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 26 April 2026 / Revised: 8 June 2026 / Accepted: 11 June 2026 / Published: 14 June 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

A very interesting paper, carefully written and containing very relevant information concerning the current (and future) state of mathematics education in the context of AI.

I have to confess I am not an expert in the tools (prompting techniques) the authors have described and developed, but I find the goals of the paper, the description of the experience, the results and the discussion highly relevant and well done.

My only suggestion concerns Figures 9, 10, 11 and 12, which show "...some example visualisations of other vector operations...". In my opinion, these visualizations are the final and most important output for helping students work with and visualize 3D geometry, yet the paper does not make clear whether they offer any manipulation features of the kind Dynamic Geometry programs usually have, such as dragging a vector's origin, changing the perspective, or zooming in and out. I consider this one of the most important pieces of information, given the difficulty of "...visualising and manipulating objects in three-dimensional space based on abstract equations..." that the authors recognize in the Introduction.

 

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

1. Originality and Relevance

The manuscript targets a highly relevant and innovative intersection: utilizing Large Vision Models (LVMs) to resolve core cognitive challenges in advanced secondary mathematics education. Specifically, it addresses a well-documented educational gap—the difficulty pre-tertiary students face when attempting to translate abstract symbolic notations of three-dimensional vectors into accurate spatial visualizations. Using computer vision to automate immediate, interactive 3D graphical feedback from handwritten equations represents a highly original and practical pedagogical intervention.

2. Contribution to the Subject Area

Compared with existing published material that typically investigates general AI chat tools or pre-rendered virtual math simulations, this study provides a significant contribution by evaluating a substantial, custom dataset. Testing the system against 1,000 handwritten vector equations modeled directly after a standardized national curriculum (the Singapore-Cambridge GCE 'A' Level H2 Mathematics syllabus) gives this study high empirical value and contextual authenticity. Establishing GPT-4o as a capable baseline for interpreting handwritten syntax offers solid benchmarks for developers building multimodal educational software.

3. Consistency of Conclusions

The conclusions are consistent with the experimental evidence and arguments presented. The data adequately shows that immediate multimodal visual feedback bridges the abstract-geometric cognitive gap for students. The authors are appropriately cautious in acknowledging that while the LVM serves as an effective parsing engine, specific instructional guardrails are still necessary to completely optimize the tool for self-directed learning environments.

4. Appropriateness of References

The references are appropriate, well-targeted, and up-to-date. The bibliography cleanly connects foundational machine learning concepts (such as Vision Transformers and Chain-of-Thought prompting models) with specialized handwriting recognition competitions (e.g., CROHME datasets) and classic spatial reasoning pedagogy.

5. Additional Comments on Tables and Figures

The data presentation effectively demonstrates the model's accuracy and performance limits across different equation formats. Before publication, the following minor points should be addressed:

  • Ensure that the performance metrics separating correct, partially correct, and failed vision-model conversions are displayed with uniform notation formatting across all analytical segments.

  • Suggestion for Authors: While the textual description of the model pipeline is clear, adding a simple structural diagram showing the execution workflow—from the user's raw handwritten vector input to the LVM segmentation layer, and finally to the generated 3D graphical visualization output—would immensely benefit readers coming from purely pedagogical backgrounds.

Minor Revisions Recommended:

  1. Error Characterization: Provide a brief breakdown or qualitative example of the most common handwriting styles or symbolic syntax configurations that triggered failures in the model's parsing baseline to assist future educational tool developers.

  2. Technical Environment Disclosure: Briefly state the exact parameters (such as temperature or API model versions) used during testing to support future replication of the accuracy metrics.
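
As an illustration of the kind of disclosure and uniform reporting requested above, a minimal sketch follows. The class and function names, the temperature value and the outcome counts are hypothetical placeholders chosen for the example and are not taken from the manuscript; only the model name and the dataset size of 1,000 equations appear in this report.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class RunConfig:
    """Settings a replication would need (hypothetical field names)."""
    model: str          # API model identifier used for all requests
    temperature: float  # sampling temperature used for all requests
    n_samples: int      # number of handwritten equations evaluated


def format_outcomes(counts: dict[str, int]) -> dict[str, str]:
    """Render every outcome as 'count/total (xx.x%)' so notation stays uniform."""
    total = sum(counts.values())
    return {
        label: f"{n}/{total} ({100 * n / total:.1f}%)"
        for label, n in counts.items()
    }


# Placeholder values for illustration only; NOT results from the manuscript.
config = RunConfig(model="gpt-4o", temperature=0.0, n_samples=1000)
counts = {"correct": 900, "partially correct": 60, "failed": 40}

report = {"config": asdict(config), "outcomes": format_outcomes(counts)}
print(json.dumps(report, indent=2))
```

Reporting the configuration and the three outcome categories from a single structure of this kind would keep the notation identical wherever the metrics are quoted.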

Author Response

Please see the attachment

Author Response File: Author Response.docx
