AIPR: An Automated Instruction-Level Patching and Rewriting Framework for Sustainable RISC-V Research
Abstract
1. Introduction
- RQ: Can executable-level binary rewriting effectively mitigate reproducibility and sustainability challenges caused by toolchain fragmentation in RISC-V systems research, while maintaining functional robustness and performance efficiency?
- Executable-Level Methodology: This work shifts the experimental intervention point from fragile source-level recompilation to stable executable-level modification to improve research reproducibility under toolchain fragmentation.
- AIPR Framework: The AIPR framework automates instruction-level analysis, immediate reconstruction, and direct binary patching within ELF binaries.
- Instruction Encoding Automation: A specialized encoding engine ensures correct split-immediate recalculation and sign-extension handling during binary rewriting.
- Accelerated Verification: The proposed approach generates artifacts 29.57× faster than GCC-based recompilation (230 s versus roughly 6800 s in total), which shortens verification turnaround time accordingly.
2. Background
2.1. Structural Barriers to Reproducibility in Systems Research
2.2. Binary Rewriting as an Alternative Intervention Point
3. AIPR Framework
3.1. AIPR Framework Implementation
- ELF Parsing and Section Discovery: The engine parses the section header table to identify the file offset and size of the .text section. This stage is a prerequisite for every later step because it ensures that all modifications occur strictly within executable boundaries.
- Direct Binary Patching: The framework overwrites the target byte stream at specific offsets. The current version does not modify the ELF header, which preserves file integrity and simulator compatibility and avoids complex structural recalculations.
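The two stages above can be sketched as follows. This is a minimal illustration, assuming a 32-bit little-endian ELF file; the function names `find_text_section` and `patch_in_text` are hypothetical and are not taken from the AIPR source.

```python
import struct

def find_text_section(elf: bytes):
    """Return (file_offset, size) of .text in a 32-bit little-endian ELF image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF file")
    # ELF32 header: e_shoff at byte 32; e_shentsize, e_shnum, e_shstrndx at bytes 46..51
    (e_shoff,) = struct.unpack_from("<I", elf, 32)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", elf, 46)

    def section(index):
        # Elf32_Shdr: name, type, flags, addr, offset, size, link, info, addralign, entsize
        return struct.unpack_from("<10I", elf, e_shoff + index * e_shentsize)

    strtab_offset = section(e_shstrndx)[4]
    for i in range(e_shnum):
        sh = section(i)
        name_start = strtab_offset + sh[0]
        name_end = elf.index(b"\x00", name_start)
        if elf[name_start:name_end] == b".text":
            return sh[4], sh[5]
    raise ValueError(".text section not found")

def patch_in_text(elf: bytearray, offset: int, new_bytes: bytes) -> None:
    """Overwrite bytes in place, refusing any write outside .text."""
    text_off, text_size = find_text_section(bytes(elf))
    if not (text_off <= offset and offset + len(new_bytes) <= text_off + text_size):
        raise ValueError("patch would fall outside the .text section")
    # Same-length overwrite: no header or section table field has to change.
    elf[offset:offset + len(new_bytes)] = new_bytes
```

Because the overwrite never changes the file length, the header and section table can stay untouched, which is the property the bullet on direct patching relies on.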
3.2. Operational Methodology
4. Experimental Results and Analysis
4.1. Experimental Environments
4.2. Functional Robustness and Reliability Validation
4.3. Turnaround Time Comparison and Scalability Analysis
- Baseline (Recompilation): Modifying source-level parameters followed by a full toolchain execution (riscv32-unknown-elf-gcc) including parsing, optimization, and linking for each variant.
- AIPR (Binary Patching): Utilizing the proposed framework to directly manipulate immediates and opcodes within the pre-compiled artifacts.
| Metric | Recompilation Flow | AIPR Framework |
|---|---|---|
| Total Execution Time | ∼6800 s (∼1.89 h) | 230 s (29.57×) |
| Average Time per Trial | 3.4 s | 0.115 s |
| Peak CPU Load | High (Multi-threaded) | Negligible |
| Disk I/O Intensity | High (Object Files) | Low (In-place) |
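The figures in the table are mutually consistent, which a few lines of arithmetic can confirm. The trial count is not stated in the table; the sketch below derives it from the recompilation total and the per-trial average, so it should be read as an inferred value.

```python
recompile_total_s = 6800.0      # approximate total from the table
aipr_total_s = 230.0
recompile_per_trial_s = 3.4

# Inferred, not stated in the table: number of trials in the campaign
trials = round(recompile_total_s / recompile_per_trial_s)
speedup = recompile_total_s / aipr_total_s
aipr_per_trial_s = aipr_total_s / trials

print(f"trials={trials}, speedup={speedup:.2f}x, AIPR per trial={aipr_per_trial_s:.3f} s")
```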
4.4. Limitations and Comparative Analysis
5. Conclusions and Future Work
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AIPR | Automated Instruction-level Patching and Rewriting |
| RSE | Research Software Engineer |
| ISA | Instruction Set Architecture |
| ELF | Executable and Linkable Format |
| RTL | Register Transfer Level |
| RVC | RISC-V Compressed Instruction Extension |
| LUI | Load Upper Immediate |
| ADDI | Add Immediate |
| SFR | Special Function Register |
| MMIO | Memory-Mapped I/O |
| CSR | Control and Status Register |
| LE | Little-Endian Representation |
Appendix A. Case Study: RISC-V Instruction-Level Transformation
Appendix A.1. C-Based Hardware Register Initialization Example
1. C Code for Testing

    …
    #define BASE_ADDR_SFR_MAP_CORE_0 0x2000f000
    #define BASE_ADDR_SFR_MAP_CORE_1 0x20010000
    #define BASE_ADDR_SFR_MAP_CORE_2 0x20011000
    #define MEMORY_TAG_EXT_CFG_CORE_0 (BASE_ADDR_SFR_MAP_CORE_0 + 0x0c30)
    #define MEMORY_TAG_EXT_CFG_CORE_1 (BASE_ADDR_SFR_MAP_CORE_1 + 0x0c30)
    #define MEMORY_TAG_EXT_CFG_CORE_2 (BASE_ADDR_SFR_MAP_CORE_2 + 0x0c30)
    REG32(MEMORY_TAG_EXT_CFG_CORE_0) = 0xac0c1e09;
    REG32(MEMORY_TAG_EXT_CFG_CORE_1) = 0xac0c1e09;
    REG32(MEMORY_TAG_EXT_CFG_CORE_2) = 0xac0c1e09;
    …

2. Assembly Code for RISC-V

    …
    lui   s3,0x20010
    lui   a5,0xac0c2
    addi  a5,a5,-503      # ac0c1e09 <_stack+0xac0bc091>
    sw    a5,-976(s3)     # 2000fc30 <_stack+0x20009eb8>
    lui   s2,0x20011
    sw    a5,-976(s2)     # 20010c30 <_stack+0x2000aeb8>
    lui   s1,0x20012
    sw    a5,-976(s1)     # 20011c30 <_stack+0x2000beb8>
    …
- Address and Offset Definition: The #define preprocessor directives establish BASE_ADDR_SFR_MAP_CORE_n as the logical starting point for hardware control of each core. The 0x0c30 offset selects the MEMORY_TAG_EXT_CFG registers, which serve as the core-specific configuration unit for MTE.
- Atomic Configuration of Security Parameters: The statement REG32(...) = 0xac0c1e09 is more than a simple assignment. It is the initialization step that commits several configuration parameters to the hardware in a single atomic 32-bit memory write.
Appendix A.2. Bit-Field Encoding of a Representative Control Register Value
- Global MTE Enable (Bit [31]: 0x1): Setting this bit to 1 activates the hardware tag comparison engine within the respective processor core. If this bit is not set, all memory access tag checks are bypassed regardless of other settings.
- Tag Check Guard Region Configuration (Bits [30:24]: 0x2c): The value 0x2c defines the granularity and range for tag validation during memory access. It communicates the physical size of the “Guard Region” to the hardware.
- Hardware Tag Generation Algorithm (Bits [23:16]: 0x0c): The 0x0c identifier selects the hardware-supported algorithm for memory tag generation, such as sequential or random-based methods.
- Tag Mismatch Interrupt and Exception Policy (Bits [15:8]: 0x1e): The value 0x1e instructs the hardware to trigger an immediate interrupt or security exception upon tag mismatch detection to prevent data leakage.
- Per-Core Security Domain ID (Bits [7:0]: 0x09): The value 0x09 serves as a unique Security Domain ID to facilitate multi-core resource isolation and prevent unauthorized tag access across domains.
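The field layout listed above can be checked mechanically. The following is a small sketch; the field names in `FIELDS` are hypothetical labels chosen for this illustration, while the bit ranges come from the list.

```python
# name: (most significant bit, least significant bit); names are illustrative only
FIELDS = {
    "global_enable": (31, 31),
    "guard_region": (30, 24),
    "tag_algorithm": (23, 16),
    "mismatch_policy": (15, 8),
    "domain_id": (7, 0),
}

def decode_fields(value: int) -> dict:
    """Split a 32-bit register value into its named bit-fields."""
    out = {}
    for name, (msb, lsb) in FIELDS.items():
        width = msb - lsb + 1
        out[name] = (value >> lsb) & ((1 << width) - 1)
    return out

def encode_fields(fields: dict) -> int:
    """Pack named bit-fields back into a 32-bit register value."""
    value = 0
    for name, (msb, lsb) in FIELDS.items():
        width = msb - lsb + 1
        value |= (fields[name] & ((1 << width) - 1)) << lsb
    return value

for name, field in decode_fields(0xAC0C1E09).items():
    print(f"{name:16s} = {field:#x}")
```

Packing the decoded fields again with `encode_fields` reproduces 0xac0c1e09, which is a quick way to confirm that the five ranges cover the word without overlap.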
Appendix A.3. RISC-V Assembly Translation of High-Level Configuration Code
Appendix A.3.1. Immediate Value Generation Using LUI/ADDI Sequences
- lui a5, 0xac0c2: The Load Upper Immediate (LUI) instruction loads the upper 20 bits of the a5 register with 0xac0c2 and clears the lower 12 bits to zero, creating an intermediate value of 0xac0c2000.
- addi a5, a5, −503: The Add Immediate (ADDI) instruction adds the 12-bit signed immediate −503 (encoded as 0xe09 and sign-extended to 0xFFFFFE09) to the value in a5. Because of this sign extension, 0xac0c2000 − 503 yields the desired security policy value 0xac0c1e09. The sequence is efficient because it constructs the 32-bit value entirely from instruction immediates, without requiring a separate data memory access.
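The effect of the LUI/ADDI pair can be reproduced with plain integer arithmetic. This is a minimal sketch of the architectural behavior on a 32-bit register; the helper names `sign_extend` and `lui_addi_result` are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value

def lui_addi_result(imm20: int, imm12: int) -> int:
    """Value left in the register after `lui rd, imm20` then `addi rd, rd, imm12`."""
    upper = (imm20 & 0xFFFFF) << 12            # LUI: low 12 bits are zero
    return (upper + sign_extend(imm12, 12)) & 0xFFFFFFFF

print(hex(lui_addi_result(0xAC0C2, -503)))     # same result when passing the raw field 0xE09
```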
Appendix A.3.2. Address Management and Storage Strategy
- lui s3, 0x20010: This loads 0x20010000 into the s3 register as the base pointer. The value lies 0x1000 above the Core 0 SFR map base (0x2000f000) because the 0x0c30 offset is at least 0x800 and therefore has to be expressed as a negative 12-bit displacement.
- sw a5, −976(s3): The Store Word (SW) instruction writes the prepared MTE configuration value (a5) to the physical address calculated as s3 − 976. The resulting address, 0x2000fc30, corresponds exactly to the Core 0 MTE configuration register in the hardware specification.
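The address arithmetic for all three stores can be verified against the C macros. The sketch below assumes a 32-bit address space; `store_address` is a hypothetical helper name.

```python
def store_address(lui_imm20: int, offset12: int) -> int:
    """Effective address of `sw rs2, offset12(rs1)` when rs1 was set by `lui rs1, lui_imm20`."""
    base = (lui_imm20 << 12) & 0xFFFFFFFF
    return (base + offset12) & 0xFFFFFFFF

# (register, LUI immediate, SFR map base from the C macros)
cores = [("s3", 0x20010, 0x2000F000), ("s2", 0x20011, 0x20010000), ("s1", 0x20012, 0x20011000)]
for reg, imm20, sfr_base in cores:
    addr = store_address(imm20, -976)
    # Each store must land on MEMORY_TAG_EXT_CFG = SFR base + 0x0c30
    print(reg, hex(addr), addr == sfr_base + 0x0C30)
```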
Appendix A.4. Impact of Arithmetic Sign-Extension Constraints on Binary Parameter Modification
- Bits [30:24]: 0x3c: Guard Region Size increased for stricter memory boundary checks (changed from 0x2c to 0x3c).
- Bits [15:8]: 0x3e: Exception Policy updated to Synchronous Fault to halt execution immediately upon mismatch (changed from 0x1e to 0x3e).
- Bits [7:0]: 0x01: Security Domain ID updated to a restricted kernel-level domain (changed from 0x09 to 0x01).
- Lower 12 bits: 0xbc0c3e01 → 0xe01. Since 0xe01 ≥ 0x800, ADDI treats the value as a negative number in two’s complement (0xe01 − 0x1000 = −511).
- Upper 20 bits: 0xbc0c3 → 0xbc0c4. The base value 0xbc0c3 must be incremented to compensate for the 0x1000 that the negative ADDI immediate effectively subtracts: 0xbc0c3 + 1 = 0xbc0c4.
- Original: lui a5, 0xac0c2 → New: lui a5, 0xbc0c4
- Original: addi a5, a5, −503 → New: addi a5, a5, −511
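The split described in this list generalizes to any 32-bit constant. The following sketch shows the rule with a hypothetical helper `split_hi_lo`; it mirrors the arithmetic in the bullets rather than any particular toolchain implementation.

```python
def split_hi_lo(value: int):
    """Split a 32-bit constant into (LUI immediate, signed ADDI immediate)."""
    lo = value & 0xFFF
    hi = (value >> 12) & 0xFFFFF
    if lo >= 0x800:
        lo -= 0x1000                 # ADDI will sign-extend this to a negative number
        hi = (hi + 1) & 0xFFFFF      # compensate by carrying one into the upper part
    return hi, lo

for target in (0xAC0C1E09, 0xBC0C3E01):
    hi, lo = split_hi_lo(target)
    # Recombining must give back the target value
    assert ((hi << 12) + lo) & 0xFFFFFFFF == target
    print(f"{target:#010x}: lui {hi:#x}, addi {lo}")
```

A value whose low 12 bits are below 0x800, such as 0x12345678, needs no compensation and splits into 0x12345 and 0x678 directly.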
Appendix A.5. Instruction-to-Machine-Code Field Transformation
| Original Machine Code | Original Assembly | Patched Machine Code | Patched Assembly |
|---|---|---|---|
| 200109b7 | lui s3,0x20010 | 200109b7 | lui s3,0x20010 |
| ac0c27b7 | lui a5,0xac0c2 | bc0c47b7 | lui a5,0xbc0c4 |
| e0978793 | addi a5,a5,−503 | e0178793 | addi a5,a5,−511 |
| c2f9a823 | sw a5,−976(s3) | c2f9a823 | sw a5,−976(s3) |
| 20011937 | lui s2,0x20011 | 20011937 | lui s2,0x20011 |
| c2f92823 | sw a5,−976(s2) | c2f92823 | sw a5,−976(s2) |
| 200124b7 | lui s1,0x20012 | 200124b7 | lui s1,0x20012 |
| c2f4a823 | sw a5,−976(s1) | c2f4a823 | sw a5,−976(s1) |
| 6a99 | c.lui s5,0x6 | 6a99 | c.lui s5,0x6 |
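The store instructions in the table are untouched by the patch, and their split S-type immediate can be decoded to confirm the −976 displacement. This is a sketch based on the standard RV32I S-type layout; `decode_store` is a hypothetical helper name.

```python
def decode_store(word: int) -> dict:
    """Decode an RV32I S-type (store) instruction word."""
    if word & 0x7F != 0x23:
        raise ValueError("not a store instruction")
    # The 12-bit immediate is split: imm[11:5] in bits 31:25, imm[4:0] in bits 11:7
    imm = ((word >> 25) << 5) | ((word >> 7) & 0x1F)
    if imm & 0x800:
        imm -= 0x1000
    return {
        "rs1": (word >> 15) & 0x1F,     # base register index
        "rs2": (word >> 20) & 0x1F,     # source register index
        "funct3": (word >> 12) & 0x7,   # 2 selects a word store (sw)
        "offset": imm,
    }

for word in (0xC2F9A823, 0xC2F92823, 0xC2F4A823):
    print(f"{word:#010x}", decode_store(word))
```

The register indices 19, 18, and 9 correspond to s3, s2, and s1, and index 15 is a5.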
Appendix A.5.1. LUI Instruction Update
- Original Machine Code (0xac0c27b7): This 32-bit word is composed of the opcode (0x37 for LUI), the destination register a5, and the original immediate field 0xac0c2.
- Patched Machine Code (0xbc0c47b7): To implement the elevated security policy (0xbc0c3e01), the AIPR framework targets the immediate bit-field (bits 31:12). It takes the upper 20 bits of the new value (0xbc0c3), applies the sign-extension compensation to reach 0xbc0c4, and overwrites the upper bits of the instruction with the result. The opcode and register fields (the low 12 bits, 0x7b7) remain unchanged, so the instruction still targets a5 but carries the updated policy bits.
Appendix A.5.2. ADDI Instruction Update
- Original Machine Code (0xe0978793): The bits 31:20 contain the original 12-bit immediate value of −503 (0xe09 in hex).
- Patched Machine Code (0xe0178793): The AIPR engine replaces this specific 12-bit immediate field with the new calculated value of −511 (0xe01 in hex). Because this modification is strictly localized to the immediate bit-field, the internal data flow—adding a constant to a5 and storing the result back to a5—is maintained without side effects on the surrounding hardware logic.
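Both updates amount to replacing one bit-field while masking everything else in place. The sketch below illustrates this with two hypothetical helpers, `patch_u_imm` and `patch_i_imm`; it is an illustration of the field arithmetic, not the framework's actual code.

```python
def patch_u_imm(word: int, imm20: int) -> int:
    """Replace bits 31:12 of a U-type word (e.g. LUI), keeping rd and opcode."""
    return ((imm20 & 0xFFFFF) << 12) | (word & 0x00000FFF)

def patch_i_imm(word: int, imm12: int) -> int:
    """Replace bits 31:20 of an I-type word (e.g. ADDI), keeping rs1, funct3, rd, opcode."""
    return ((imm12 & 0xFFF) << 20) | (word & 0x000FFFFF)

new_lui = patch_u_imm(0xAC0C27B7, 0xBC0C4)
new_addi = patch_i_imm(0xE0978793, -511)    # -511 is masked to the field value 0xE01
print(f"{new_lui:08x} {new_addi:08x}")
```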
Appendix A.6. Binary Rewriting via Hexadecimal Stream Manipulation
| Address Offset | Hexadecimal Data (32-Bit Words) | | | |
|---|---|---|---|---|
| @00000c04 | 23a20790 | 23a40790 | b7090120 | b7270cac |
| @00000c08 | 938797e0 | 23a8f9c2 | 37190120 | 2328f9c2 |
| @00000c0c | b7240120 | 23a8f4c2 | 996a9387 | fa4023ae |
| @00000c10 | f9c0232e | f9c023ae | f4c02945 | eff0dfe0 |
| @00000c14 | 93873a40 | 23aef9c0 | 232ef9c0 | 23aef4c0 |
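Because the stream mixes 32-bit and 16-bit (RVC) encodings, a patcher has to locate instruction boundaries before it overwrites anything. In RISC-V, the two lowest bits of an instruction's first byte (in little-endian order) are 0b11 for 32-bit encodings and anything else for 16-bit ones. The sketch below assumes only these two lengths occur; `instruction_lengths` is a hypothetical helper, and the sample bytes are taken from the row at @00000c0c.

```python
def instruction_lengths(stream: bytes) -> list:
    """Walk a little-endian RISC-V byte stream and return each instruction's size in bytes."""
    lengths = []
    pos = 0
    while pos < len(stream):
        # Assumption: only 16-bit and 32-bit encodings are present
        size = 4 if (stream[pos] & 0b11) == 0b11 else 2
        if pos + size > len(stream):
            break                     # trailing partial instruction, stop here
        lengths.append(size)
        pos += size
    return lengths

# lui s1 | sw a5,-976(s1) | c.lui s5 | first 32-bit instruction after the compressed one
sample = bytes.fromhex("b7240120" "23a8f4c2" "996a" "9387fa40")
print(instruction_lengths(sample))
```

After the 2-byte c.lui, the following instruction starts on a half-word boundary, which is why the third and fourth 32-bit columns of that row do not each hold a whole instruction.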
| Address | Instruction | Original Hex (LE) | Patched Hex (LE) | Key Constraint |
|---|---|---|---|---|
| @00000c04 | LUI a5 | b7270cac | b7470cbc | Sign-extension |
| @00000c08 | ADDI a5 | 938797e0 | 938717e0 | 12-bit immediate limit |
| @00000c0c | C.LUI s5 | 996a | 996a | Half-word alignment |
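The hex strings in this table are the machine code words written in little-endian byte order. A short sketch of the conversion, using hypothetical helper names `to_le_hex` and `from_le_hex`:

```python
def to_le_hex(word: int, size: int = 4) -> str:
    """Machine code word -> hex string in little-endian byte order."""
    return word.to_bytes(size, "little").hex()

def from_le_hex(text: str) -> int:
    """Hex string in little-endian byte order -> machine code word."""
    return int.from_bytes(bytes.fromhex(text), "little")

print(to_le_hex(0xBC0C47B7))      # patched LUI
print(to_le_hex(0xE0178793))      # patched ADDI
print(to_le_hex(0x6A99, size=2))  # 16-bit compressed C.LUI
```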
Appendix B. Implementation Direction for MTE in V-FRONT RTL
1. MTE Configuration Register (CSR) Logic

    // Dedicated CSR to store security parameters
    reg  [31:0] mte_config_reg;
    wire        global_mte_en = mte_config_reg[31];     // Bit [31]: Global Enable
    wire [6:0]  guard_sz      = mte_config_reg[30:24];  // Bits [30:24]: Guard Region Size
    wire [7:0]  fault_pol     = mte_config_reg[15:8];   // Bits [15:8]: Exception Policy
    wire [7:0]  domain_id     = mte_config_reg[7:0];    // Bits [7:0]: Security Domain ID

2. Tag Comparison and Exception Generation (MEM Stage)

    // Hardware-level tag validation during memory access
    wire [3:0] stored_tag;                         // Metadata retrieved from Tag RAM
    wire [3:0] provided_tag = mem_addr_in[31:28];  // Top 4 bits used as Tag
    // Mismatch detection logic
    wire tag_mismatch = (provided_tag != stored_tag) && global_mte_en;
    // Exception triggering based on Synchronous Fault policy
    assign mte_fault_signal = (tag_mismatch && (fault_pol == 8'h3E)) ? 1'b1 : 1'b0;

3. Pipeline Flush and Exception Handling

    // Integrated trap logic to halt execution upon mismatch
    always @(*) begin
        // Default values (assumed here) so that this combinational block infers no latches
        pipeline_flush   = 1'b0;
        exception_vector = 32'h0000_0000;
        if (mte_fault_signal) begin
            pipeline_flush   = 1'b1;           // Invalidate following instructions
            exception_vector = 32'h0000_0100;  // Jump to security handler
        end
    end
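Before committing to RTL, the fault condition above can be exercised with a small behavioral model. The following Python sketch mirrors the bit positions and the 0x3E policy check from the listing; the function name `mte_fault` and the idea of modeling the Tag RAM lookup as a plain argument are assumptions made for illustration.

```python
def mte_fault(mte_config: int, mem_addr: int, stored_tag: int) -> bool:
    """Behavioral model of mte_fault_signal for one memory access."""
    global_mte_en = (mte_config >> 31) & 0x1
    fault_pol = (mte_config >> 8) & 0xFF
    provided_tag = (mem_addr >> 28) & 0xF          # top 4 address bits carry the tag
    tag_mismatch = bool(global_mte_en) and provided_tag != (stored_tag & 0xF)
    # Only the Synchronous Fault policy (0x3E) raises the fault signal in this model
    return tag_mismatch and fault_pol == 0x3E

addr = 0x2000FC30                                  # provided tag is 0x2
print(mte_fault(0xBC0C3E01, addr, stored_tag=0x2)) # tags match
print(mte_fault(0xBC0C3E01, addr, stored_tag=0x5)) # mismatch under policy 0x3E
print(mte_fault(0xAC0C1E09, addr, stored_tag=0x5)) # mismatch under policy 0x1E
```

Such a model is only a cross-check of the combinational condition; timing, pipeline flush behavior, and trap entry still have to be validated in RTL simulation.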

    class InstructionEncoder:
        # The class wrapper and constructor are assumed; the listing shows only the methods.
        def __init__(self, rd):
            self.rd = rd  # destination register index, e.g. 15 for a5

        def calculate_riscv_immediates(self, target_value):
            low_12 = target_value & 0xFFF
            up_20 = (target_value >> 12) & 0xFFFFF
            # Compensation for RISC-V ADDI sign-extension
            if low_12 >= 0x800:
                up_20 = (up_20 + 1) & 0xFFFFF
            return up_20, low_12

        def encode_u_type(self, imm_20, opcode=0x37):
            # Format: imm[31:12] | rd[11:7] | opcode[6:0]
            instr = (imm_20 << 12) | (self.rd << 7) | opcode
            return instr

        def encode_i_type(self, imm_12, rs1=None, funct3=0x0, opcode=0x13):
            # Format: imm[11:0] | rs1[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0]
            if rs1 is None:
                rs1 = self.rd  # Default to self-increment
            instr = (imm_12 << 20) | (rs1 << 15) | (funct3 << 12) | (self.rd << 7) | opcode
            return instr

        def generate_binary_pair(self, new_param):
            up, low = self.calculate_riscv_immediates(new_param)
            lui_bin = self.encode_u_type(up)
            addi_bin = self.encode_i_type(low)
            return lui_bin, addi_bin
    class BinaryPatcher:
        # The class wrapper, constructor, and load_binary are assumed; the listing
        # shows only the three methods below them.
        def __init__(self, file_path):
            self.file_path = file_path
            self.binary_data = b""

        def load_binary(self):
            with open(self.file_path, 'rb') as f:
                self.binary_data = f.read()

        def find_instruction_offset(self, original_hex_sequence):
            target_bytes = bytes.fromhex(original_hex_sequence)
            offset = self.binary_data.find(target_bytes)
            if offset == -1:
                raise ValueError("[AIPR_Error]Target_instruction_pattern_not_found.")
            return offset

        def apply_hex_patch(self, offset, patched_hex_sequence):
            patch_bytes = bytes.fromhex(patched_hex_sequence)
            with open(self.file_path, 'r+b') as f:
                f.seek(offset)
                f.write(patch_bytes)
                f.flush()
                # Verification Logic
                f.seek(offset)
                if f.read(len(patch_bytes)) != patch_bytes:
                    raise IOError("[AIPR_Error]Patch_verification_failed.")

        def execute_automated_flow(self, search_pattern, replace_pattern):
            self.load_binary()
            try:
                target_offset = self.find_instruction_offset(search_pattern)
                self.apply_hex_patch(target_offset, replace_pattern)
            except Exception as e:
                print(f"Workflow_Interrupted:{e}")
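The search, overwrite, and read-back steps can also be condensed into one standalone function. The sketch below is an illustrative variant, not the framework's code: the name `patch_file` is hypothetical, and the equal-length and uniqueness checks are additions assumed here to guard against shifting bytes or patching the wrong occurrence of a repeated pattern.

```python
def patch_file(path: str, search_hex: str, replace_hex: str) -> int:
    """Find a unique byte pattern in a file, overwrite it in place, and verify the write."""
    search = bytes.fromhex(search_hex)
    replace = bytes.fromhex(replace_hex)
    if len(search) != len(replace):
        raise ValueError("patch must preserve the length of the original bytes")
    with open(path, "r+b") as f:
        data = f.read()
        offset = data.find(search)
        if offset == -1:
            raise ValueError("target pattern not found")
        if data.find(search, offset + 1) != -1:
            raise ValueError("target pattern is ambiguous (more than one match)")
        f.seek(offset)
        f.write(replace)
        f.flush()
        # Read back the patched region to confirm the write
        f.seek(offset)
        if f.read(len(replace)) != replace:
            raise IOError("patch verification failed")
    return offset
```

For the case study, calling `patch_file(path, "b7270cac938797e0", "b7470cbc938717e0")` would replace the LUI/ADDI pair in one step and return the byte offset of the pair within the file.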
| Category | ID Range | Overwriting-Specific Challenges |
|---|---|---|
| Type 1 | B1–B5 | Standard Immediate Replacement: The engine updates isolated 12-bit or 20-bit immediate fields within a single instruction where no arithmetic dependencies exist. |
| Type 2 | B6–B10 | Arithmetic Correction: The process handles the 0x800 sign-extension boundary, which requires a simultaneous update of LUI and ADDI pairs to prevent value corruption. |
| Type 3 | B11–B15 | Base Address Modification: The framework targets the reconstruction of 32-bit memory-mapped register addresses through multi-instruction sequence analysis and offset alignment. |
| Type 4 | B16–B20 | Sequence Propagation: The system manages the consistent update of multiple dependent instructions across different hardware units to ensure global policy enforcement. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Choi, J. AIPR: An Automated Instruction-Level Patching and Rewriting Framework for Sustainable RISC-V Research. Appl. Sci. 2026, 16, 1461. https://doi.org/10.3390/app16031461
