Graph-Structured Persistent Memory for Efficient LLM-Based Computer Use Agents
Abstract
1. Introduction
2. Materials and Methods
2.1. Formal Model of the Memory-Augmented Agent
- A is the finite set of available actions, encompassing both primitive GUI operations (mouse clicks, keyboard inputs, scrolling) and composite tool invocations (parameterized macros such as SearchDrive(query));
- the set of observable GUI states, in which each state is characterized by a visual screenshot, OCR-extracted text, and accessibility-tree metadata;
- the memory graph, a directed graph with node set N, edge set E, and task descriptor set D;
- the deterministic state transition function that governs action replay from memory;
- the decision policy, which returns either a memorized action or the null symbol ⊥ to trigger fresh LLM planning;
- the state recognition function, which maps a current observation to its corresponding memory graph node, or to ⌀ if no matching node exists.
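The components listed above can be sketched as a small data model. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `GuiState`, `MemoryGraph`, `recognize`, and `policy` are hypothetical, nodes are identified by plain strings, and state recognition is reduced to an exact match on OCR text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class GuiState:
    """One observable GUI state (hypothetical container)."""
    screenshot_id: str  # reference to the visual screenshot
    ocr_text: str       # OCR-extracted text
    a11y_tree: str      # serialized accessibility-tree metadata


@dataclass
class MemoryGraph:
    """Directed graph with node set N, edge set E, and descriptor set D."""
    nodes: Set[str] = field(default_factory=set)
    # E stored as {(source node, action): destination node}
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # D stored as {task name: {node: memorized action}} (assumed layout)
    descriptors: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def transition(self, node: str, action: str) -> Optional[str]:
        """Deterministic replay: the stored successor node, or None."""
        return self.edges.get((node, action))


def recognize(graph: MemoryGraph, state: GuiState) -> Optional[str]:
    """State recognition: map an observation to a node, or None (the ⌀ case).

    Assumption: the node id is simply the OCR text of the screen.
    """
    return state.ocr_text if state.ocr_text in graph.nodes else None


def policy(graph: MemoryGraph, node: str, task: str) -> Optional[str]:
    """Decision policy: a memorized action, or None (the ⊥ case)."""
    return graph.descriptors.get(task, {}).get(node)
```

In this sketch, `None` plays the role of both ⊥ and ⌀; a caller that receives `None` from `policy` would hand control to fresh LLM planning.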
2.2. Memory Graph Structure
2.3. Reachability and Coverage Conditions for the Memory Graph
2.4. Perturbation Sensitivity and Graceful Degradation
2.5. Token Cost Model
2.6. Hierarchical Control Architecture
2.7. Decision Policy: Exploitation vs. Exploration
- Task Recognition: The Manager interprets the user instruction and queries the memory graph for matching task descriptors. The search combines textual similarity (instruction vs. stored task names/descriptions) with state matching (current UI context vs. stored node states).
- Memory-Driven Execution (Exploitation): If a matching task descriptor is found, the Manager retrieves the associated path and the Worker executes the stored action sequence with minimal LLM involvement—only for success verification and minor adaptations.
- LLM-Driven Planning (Exploration): If no matching task descriptor is found (the decision policy returns ⊥), the standard planning procedure is invoked: the Manager decomposes the task and the Worker uses LLM reasoning at each step. Upon successful completion, the new trajectory is integrated into the memory graph.
- Memory Update: After task completion (via either path), the graph is updated: new nodes and edges for newly visited screens and actions, updated usage statistics for memory-served paths, and the Tool Generation module abstracts reusable subsequences into new task descriptors.
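The four steps above can be sketched as a single control loop. This is a simplified illustration, not the paper's Manager/Worker code: `find_descriptor` and `run_task` are hypothetical names, task recognition uses only word-overlap (Jaccard) similarity on the instruction and omits state matching, and the `min_overlap` threshold of 0.5 is an arbitrary assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple


def find_descriptor(memory: Dict[str, List[str]], instruction: str,
                    min_overlap: float = 0.5) -> Optional[str]:
    """Task recognition: best stored task name by Jaccard word overlap."""
    words = set(instruction.lower().split())
    best, best_score = None, 0.0
    for name in memory:
        name_words = set(name.lower().split())
        union = words | name_words
        score = len(words & name_words) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = name, score
    return best if best_score >= min_overlap else None


def run_task(instruction: str,
             memory: Dict[str, List[str]],
             plan_with_llm: Callable[[str], List[str]],
             execute: Callable[[str], bool]) -> Tuple[str, bool]:
    """Exploit a stored path if one matches, otherwise explore with the LLM."""
    match = find_descriptor(memory, instruction)
    if match is not None:
        actions, source = memory[match], "memory"             # exploitation
    else:
        actions, source = plan_with_llm(instruction), "llm"   # exploration
    # Stop at the first failing action; all() short-circuits.
    success = all(execute(action) for action in actions)
    # Memory update: store only new, successful trajectories.
    if success and source == "llm":
        memory[instruction] = list(actions)
    return source, success
```

Calling `run_task` twice with the same instruction illustrates the intended effect: the first call goes through `plan_with_llm`, and the second is served from `memory` without another planning call.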
2.8. State Recognition and Hashing
2.9. Memory Graph Evolution and Maintenance
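- Node addition: When the state recognition function returns ⌀ for the current observation (no matching node exists), a new node is created and added to N.
- Edge addition: After a successful transition from one state to the next via action a, the edge labeled a between the two corresponding nodes is added to E, where the endpoint nodes are those that the state recognition function assigns to the source and destination states.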
- Tool generation: The Tool Generation module analyzes completed trajectories to identify reusable subsequences. Frequently traversed paths are abstracted into parameterized task descriptors and attached to the appropriate source node.
- Pruning: Periodically, maintenance operations merge duplicate nodes, generalize parameters, and remove low-utility edges to prevent graph bloat and preserve efficient retrieval.
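The maintenance operations above can be illustrated with three small functions. This is a sketch under assumptions rather than the paper's procedure: the function names are hypothetical, "frequently traversed" is approximated by counting fixed-length contiguous action subsequences, and "low-utility" is approximated by a raw usage-count threshold. Node merging and parameter generalization are not shown.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, action, destination node)


def record_transition(nodes: Set[str], edge_uses: Dict[Edge, int],
                      src: str, action: str, dst: str) -> None:
    """Node and edge addition: register both endpoints and count the use."""
    nodes.update((src, dst))
    edge = (src, action, dst)
    edge_uses[edge] = edge_uses.get(edge, 0) + 1


def frequent_subsequences(trajectories: List[List[str]], length: int,
                          min_count: int) -> List[Tuple[str, ...]]:
    """Tool-generation candidates: contiguous action runs seen often enough."""
    counts: Counter = Counter()
    for actions in trajectories:
        for i in range(len(actions) - length + 1):
            counts[tuple(actions[i:i + length])] += 1
    return sorted(seq for seq, c in counts.items() if c >= min_count)


def prune_edges(edge_uses: Dict[Edge, int], min_uses: int) -> int:
    """Pruning: drop edges used fewer than min_uses times; return how many."""
    stale = [edge for edge, uses in edge_uses.items() if uses < min_uses]
    for edge in stale:
        del edge_uses[edge]
    return len(stale)
```

A usage count is only one possible utility signal; recency or the success rate of replays through an edge would be natural alternatives.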
3. Results
3.1. Experimental Setup
- S2-Mem Cold: Memory is initialized with only a small set of basic tools (e.g., login sequences for common applications), simulating a “cold start” scenario.
- S2-Mem Warm: Memory is pre-populated with tools accumulated from prior task executions, simulating an agent that has been operational for some time.
3.2. Evaluation Metrics
1. Token Consumption: The total number of LLM tokens (input + output) consumed during task execution, as defined in Equation (10).
2. Execution Time: Wall-clock time from instruction receipt to task completion, encompassing both LLM processing delays and GUI interaction latencies.
3. Success Rate: Binary indicator of task completion (1 if the final state matches the target specification, 0 otherwise).
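As an illustration of how the three metrics might be aggregated over a batch of runs, the following sketch averages them per task. The `TaskRun` record and `summarize` function are hypothetical, and the per-run token count is taken simply as input plus output tokens, which may be coarser than the cost model the paper defines.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class TaskRun:
    """Measurements for one task execution (hypothetical record)."""
    input_tokens: int
    output_tokens: int
    seconds: float   # wall-clock time from instruction to completion
    success: bool    # binary success indicator


def summarize(runs: List[TaskRun]) -> Dict[str, float]:
    """Average the three metrics over a non-empty list of runs."""
    return {
        "mean_tokens": mean(r.input_tokens + r.output_tokens for r in runs),
        "mean_seconds": mean(r.seconds for r in runs),
        # Mean of the 0/1 indicator, expressed as a percentage.
        "success_rate_pct": 100.0 * mean(1 if r.success else 0 for r in runs),
    }
```

`statistics.mean` raises `StatisticsError` on an empty input, so a caller would need to guard against an empty batch.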
3.3. Task Success Rates
3.4. Token Cost and Execution Time
3.5. LLM Invocation Analysis
3.6. Component-Level Interpretation
3.7. Qualitative Behavior
4. Discussion
4.1. Interpretation Through the Reachability Framework
- Stabilizing efficiency: The memory utilization ratio (Definition 3) increases over the evaluation period. Initial tasks require extensive LLM planning (low memory utilization), while later tasks with overlapping subtasks benefit from stored trajectories (higher utilization on average).
- Graceful degradation as a design target: When the agent encounters novel states not represented in the memory graph (the state recognition function returns ⌀), it falls back to LLM-based planning. This mechanism is intended to avoid degradation relative to the baseline under the ideal fallback assumption, while practical failure modes are treated explicitly as limitations.
4.2. Control-Theoretic Interpretation
4.3. Decision-Making Perspective
4.4. Limitations and Perturbation Sensitivity
4.5. Quality of Generated Tools
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| CUA | Computer Use Agent |
| GUI | Graphical User Interface |
| LLM | Large Language Model |
| MRAC | Model-Reference Adaptive Control |
| MDP | Markov Decision Process |
| DSS | Decision Support System |
| OCR | Optical Character Recognition |



| Method | 15-Step Tasks (success rate, %) | 50-Step Tasks (success rate, %) |
|---|---|---|
| Agent S2 w/Claude-4.5-Sonnet (S2-Base) | 36.9 | 46.5 |
| S2-Mem w/Claude-4.5-Sonnet Cold | 36.9 | 46.7 |
| S2-Mem w/Claude-4.5-Sonnet Warm | 36.9 | 46.9 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Vorvul, D.; Musienko, A.; Galchenko, I.; Myroniuk, M.; Sobchuk, A. Graph-Structured Persistent Memory for Efficient LLM-Based Computer Use Agents. Axioms 2026, 15, 415. https://doi.org/10.3390/axioms15060415

