ErrorExplainer: Automated Extraction of Error Contexts from Smart Contracts
Abstract
1. Introduction
- I identify the practical challenges of non-standardized error messages that make it difficult for users to understand errors in Web3 services.
- I propose ErrorExplainer, an automated approach that analyzes smart contract code to generate rich, structured error descriptions matched to the user’s execution context.
- I demonstrate the effectiveness of ErrorExplainer on real-world smart contracts by evaluating its identification accuracy, information completeness, and matching fitness.
2. Background
2.1. Transaction Execution Environment
- Transactions on EVM. Ethereum-compatible blockchains execute all state-transition logic on the Ethereum Virtual Machine (EVM) [7]. Each transaction is authenticated with the sender’s signature before execution. A transaction either transfers assets or calls a function of a smart contract. When calling a function, the transaction carries a read-only chunk of bytes, referred to as “calldata”, that specifies the function selector and the input arguments. The smart contract uses the calldata directly to execute its function. The called function is selected by matching the first four bytes of the calldata against the first four bytes of the Keccak-256 hash of each function signature. The rest of the calldata contains the application binary interface (ABI)-encoded input arguments.
- Crypto Wallets. A crypto wallet is the primary user client for blockchain systems and Web3 services. It manages accounts comprising private keys and their corresponding addresses derived from the public keys. Users interact with blockchain systems by signing transactions and sending them to the blockchain network through the crypto wallet. Therefore, the crypto wallet works as the closest interface to users in Web3.
2.2. Transaction Errors
2.3. Transaction-Reverting Statements
- require (condition, error message): This statement is commonly used to enforce input validation or preconditions. If the condition is false, the transaction reverts and returns the message.
- assert (condition): assert is designed to check invariants in the contract. If the condition is false, it reverts with Panic(0x01). assert does not support a custom message.
- revert (error message): If the transaction encounters revert, it unconditionally reverts and returns the message.
- throw: This statement has been deprecated since Solidity 0.4.13, but has been retained in legacy smart contracts. throw does not support a custom message and returns no message.
2.4. Open Challenges in Smart Contract Error Reporting
- Heterogeneity of Error Messages: Solidity compiles errors into different ABI payload patterns: (1) the standard Error(string) payload, emitted by both require and revert, (2) the Panic(uint256) payload produced by assert, (3) developer-defined custom error selectors with typed arguments, and (4) an empty payload by revert. An empirical study of 3866 verified contracts found that all the forms appear in the real-world smart contract code [9].
- Gas-Cost Trade-Offs: Developers often shorten or omit error messages to reduce deployment and runtime gas. Replacing a literal message, such as “Unauthorized”, with a four-byte custom error saves 200–400 gas per call at the expense of human-readability.
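The four payload forms listed above can be told apart from the raw revert data alone. The following is a minimal sketch, not ErrorExplainer's implementation: it assumes the standard Solidity selectors for Error(string) (0x08c379a0) and Panic(uint256) (0x4e487b71), and the names `RevertInfo` and `classify_revert` are illustrative.

```python
from typing import NamedTuple, Optional

ERROR_STRING_SELECTOR = bytes.fromhex("08c379a0")  # Error(string)
PANIC_SELECTOR = bytes.fromhex("4e487b71")         # Panic(uint256)


class RevertInfo(NamedTuple):
    kind: str                  # "empty", "string", "panic", or "custom"
    selector: Optional[bytes]  # first four bytes of the payload, if any
    detail: Optional[object]   # decoded string, panic code, or raw argument bytes


def classify_revert(payload: bytes) -> RevertInfo:
    """Classify raw revert data into one of the four payload patterns."""
    if len(payload) < 4:
        return RevertInfo("empty", None, None)
    selector, body = payload[:4], payload[4:]
    if selector == ERROR_STRING_SELECTOR and len(body) >= 64:
        # ABI layout of a single string: offset word, length word, padded data.
        offset = int.from_bytes(body[0:32], "big")
        length = int.from_bytes(body[offset:offset + 32], "big")
        raw = body[offset + 32:offset + 32 + length]
        return RevertInfo("string", selector, raw.decode("utf-8", errors="replace"))
    if selector == PANIC_SELECTOR and len(body) >= 32:
        return RevertInfo("panic", selector, int.from_bytes(body[:32], "big"))
    # Anything else is treated as a developer-defined custom error.
    return RevertInfo("custom", selector, body)
```

A custom error is returned with its arguments still encoded, because decoding them requires the error signature, which is the gap that source-code analysis is meant to fill.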
2.5. Standard Proposals on Error Messages
- Status Codes (ERC-1066): ERC-1066 [11] defines a set of status codes to return the result of the transaction execution, as in HTTP codes. To translate the code into a human-readable description, a client must maintain a lookup table.
- Custom Errors (ERC-838): To provide more informative error messages, ERC-838 [12] introduces a custom error definition to the ABI with named error types. It allows developers to define custom error messages that can be returned when a transaction reverts. However, decoding requires the client to know the error signature; otherwise, the error message remains raw data.
- Error Definitions on Tokens (ERC-6093): In ERC-6093 [13], standardized error definitions are proposed for well-known token standards (i.e., ERC-20 [4], ERC-721 [14], ERC-1155 [15]) in the custom error format. Still, users can receive consistent error messages only if every token contract implements the standard.
- Localized Error Messages (ERC-1444): Even if an error message is returned, it is generally in English and unsuitable for global users. ERC-1444 [16] suggests a lookup table mechanism that converts status codes or error identifiers to human-readable localized error messages in different languages.
3. Design Overview
- Which condition triggered the error (e.g., a balance check in require)?
- What message is given for the error (e.g., “ERC20: insufficient balance”)?
- Where in the code did the failure arise (e.g., inside transfer function)?
- Execution Environment Analysis. ErrorExplainer infers what was actually executed from user-level information. It first analyzes the user’s raw transaction data and then, when it detects a proxy pattern (commonly used for upgradability [17]), resolves the implementation contract address. ErrorExplainer then retrieves the verified source code published on a blockchain explorer, such as Etherscan [6]. Simultaneously, ErrorExplainer parses the calldata to recover the called function and its ABI-encoded arguments.
- Source Code Analysis. ErrorExplainer performs structural, intra-component, and inter-component analyses to extract every potential revert site together with its guard condition and metadata. Because smart-contract bytecode is immutable, the analysis results can be cached and reused until the proxy’s implementation address changes.
- Error Description Generation. ErrorExplainer matches the observed revert data against the pre-computed error contexts, identifies the exact failing statement, normalizes heterogeneous formats (strings or custom errors), and synthesizes a human-readable error description.
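The caching behavior mentioned for the source code analysis can be sketched as a small memoizing wrapper keyed by the implementation address. This is an illustrative assumption about how such a cache might look; `AnalysisCache` and the `analyze` callback are hypothetical names, not part of ErrorExplainer.

```python
from typing import Callable, Dict


class AnalysisCache:
    """Memoize analysis results per implementation address (hypothetical helper)."""

    def __init__(self, analyze: Callable[[str], object]) -> None:
        self._analyze = analyze          # expensive static analysis, supplied by the caller
        self._results: Dict[str, object] = {}

    def get(self, implementation_address: str) -> object:
        # Addresses are compared case-insensitively (checksummed vs. lowercase hex).
        key = implementation_address.lower()
        if key not in self._results:
            self._results[key] = self._analyze(key)
        return self._results[key]
```

Because the key is the implementation address rather than the proxy address, a proxy upgrade naturally leads to a fresh analysis, while repeated errors on the same implementation reuse the stored result.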
4. Execution Environment Analysis
4.1. Transaction Data Analysis
- TO Address: The destination address is read directly from the transaction object. Because the smart contract may use the proxy pattern, this address may refer only to a forwarding proxy, with the implementation logic located at a different address. The proxy analysis in Section 4.2 handles such cases.
- Function Selector: The contract interprets the first four bytes of calldata as the function selector, derived from the Keccak-256 hash of the function signature. For example, “0xa9059cbb” identifies “transfer(address,uint256)”. Since a transaction directly calls a single function, the function indicated by the function selector is regarded as the entry point of the transaction’s execution.
- Function Arguments: The remaining bytes of calldata encode the ABI-formatted arguments.
- Revert Reason: If execution aborts, smart contracts provide a binary-encoded payload for error messages. It can also be obtained during off-chain simulation via eth_call or eth_estimateGas, and on-chain via eth_sendTransaction [18].
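As an illustration of the selector and argument fields above, the following sketch splits calldata into a selector and statically sized arguments. It is a simplified assumption, not the paper's parser: `KNOWN_SELECTORS` is a hypothetical lookup table (a real client would derive it from the contract ABI), and only `address` and `uint256` arguments are handled.

```python
from typing import List, Tuple

# Hypothetical selector table: selector hex -> (signature, argument types).
KNOWN_SELECTORS = {
    "a9059cbb": ("transfer(address,uint256)", ["address", "uint256"]),
}


def parse_calldata(calldata_hex: str) -> Tuple[str, List[object]]:
    """Return the called function signature and its decoded static arguments."""
    hex_body = calldata_hex[2:] if calldata_hex.startswith("0x") else calldata_hex
    raw = bytes.fromhex(hex_body)
    if len(raw) < 4:
        raise ValueError("calldata too short to contain a function selector")
    selector = raw[:4].hex()
    if selector not in KNOWN_SELECTORS:
        raise KeyError(f"unknown selector 0x{selector}")
    signature, arg_types = KNOWN_SELECTORS[selector]

    args: List[object] = []
    for index, arg_type in enumerate(arg_types):
        # Each static argument occupies one 32-byte word after the selector.
        word = raw[4 + 32 * index: 4 + 32 * (index + 1)]
        if len(word) != 32:
            raise ValueError("calldata is truncated")
        if arg_type == "address":
            args.append("0x" + word[-20:].hex())  # address is the low 20 bytes
        elif arg_type == "uint256":
            args.append(int.from_bytes(word, "big"))
        else:
            raise NotImplementedError(f"unsupported type: {arg_type}")
    return signature, args
```

Dynamic types such as `string` or `bytes` use offset-based encoding and would need a full ABI decoder.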
4.2. Proxy Analysis
- EIP-1967 Pattern: EIP-1967 [20] is a standard for upgradeable proxy contracts, in which the implementation address is stored in the storage slot obtained by subtracting one from the Keccak-256 hash of ‘eip1967.proxy.implementation’.
- Transparent Proxy Pattern: This pattern is used by OpenZeppelin [21] and is the default pattern for TransparentUpgradeableProxy.
- Legacy ZeppelinOS Pattern: This pattern was introduced by ZeppelinOS [22] and is still found in legacy contracts.
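Resolving the implementation behind a proxy amounts to reading a known storage slot and taking the low 20 bytes of the stored word. The sketch below assumes a caller-supplied `read_slot` callback (for example, a wrapper around the `eth_getStorageAt` JSON-RPC method); that callback and the function name are hypothetical, and only the two hash-derived slots from the storage-slot table are probed.

```python
from typing import Callable, Optional

# Storage slots taken from the pattern/slot table in this article.
IMPLEMENTATION_SLOTS = {
    "EIP-1967": 0x360894A13BA1A3210667C828492DB98DCA3E2076CC3735A920A3CA505D382BBC,
    "ZeppelinOS": 0x7050C9E0F4CA769C69BD3A8EF740BC37934F8E2C036E5A723FD8EE048ED3F8C3,
}


def resolve_implementation(
    read_slot: Callable[[str, int], bytes], proxy_address: str
) -> Optional[str]:
    """Return the implementation address stored in a known slot, or None."""
    for slot in IMPLEMENTATION_SLOTS.values():
        word = read_slot(proxy_address, slot)  # expected: one 32-byte storage word
        if len(word) != 32:
            continue
        address = word[-20:]                   # an address occupies the low 20 bytes
        if any(address):                       # an all-zero word means the slot is unused
            return "0x" + address.hex()
    return None
```

If no slot holds a non-zero address, the TO address can be treated as the implementation itself.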
4.3. External Information Fetching
4.4. Building User Context
5. Source Code Analysis
5.1. Structure Analysis and Initialization
5.1.1. Component Graph Construction
5.1.2. Trace Analysis
- Call Graph Construction: Creates the call graph based on the function call relationships in the source code. The call graph is represented as $G = (V, E)$, where $V$ is the set of functions and $E \subseteq V \times V$ represents call relationships. An edge $(f, g) \in E$ indicates that $f$ invokes $g$.
- Graph Reversal: Reverses the call graph to show reachability relationships explicitly. In the reversed call graph $G^{R} = (V, E^{R})$, every edge $(f, g) \in E$ is reversed to become $(g, f) \in E^{R}$.
- Entry Point Identification: For each function $f \in V$, ErrorExplainer traverses $G^{R}$ to identify all external functions, denoted as $V_{ext} \subseteq V$, that can reach the given function $f$. The reachability mapping from each function to its set of reachable entry-point functions is given by $R(f) = \{\, e \in V_{ext} \mid f \text{ is reachable from } e \text{ in } G \,\}$.
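The three steps above can be sketched with plain dictionaries and a breadth-first search. This is an illustrative sketch rather than the paper's implementation; the function names and the adjacency-list representation are assumptions.

```python
from collections import deque
from typing import Dict, Iterable, Set


def reverse_graph(calls: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Reverse caller -> callees edges into callee -> callers edges."""
    reversed_calls: Dict[str, Set[str]] = {function: set() for function in calls}
    for caller, callees in calls.items():
        for callee in callees:
            reversed_calls.setdefault(callee, set()).add(caller)
    return reversed_calls


def entry_points(
    calls: Dict[str, Iterable[str]], external: Set[str]
) -> Dict[str, Set[str]]:
    """Map each function to the external functions that can reach it."""
    reversed_calls = reverse_graph(calls)
    result: Dict[str, Set[str]] = {}
    for start in reversed_calls:
        seen = {start}
        queue = deque([start])
        while queue:  # breadth-first search over the reversed edges
            current = queue.popleft()
            for caller in reversed_calls.get(current, ()):
                if caller not in seen:
                    seen.add(caller)
                    queue.append(caller)
        # An external function counts as its own entry point.
        result[start] = seen & external
    return result
```

For example, with `transfer` and `transferFrom` both calling an internal `_transfer`, the entry-point set of `_transfer` contains both external functions, so an error raised inside `_transfer` can be attributed to either call path.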
5.1.3. Error Context Initialization
5.2. Intra-Component Analysis
5.2.1. Normalized Error
- Type (type of the error message) indicates whether it involves no message, a plain string, or a custom error type. For a custom error type, the full signature is provided by the custom error definitions.
- Binary-encoded message contains the encoded version of the error message. We define separate encoding functions for plain strings and for custom errors. Otherwise (i.e., when there is no message), this field is also ⊥ because the statement has no error message. This field is later matched against the error data returned to the user.
- Error message text is a readable form of a string or custom error extracted from the source code.
- Error condition is the source code’s expression that triggers the error, which is the immediate precondition of the transaction-reverting statement. It depends on whether the transaction-reverting statement is conditional (e.g., require) or unconditional (e.g., revert). If we cannot find the error condition, it is ⊥.
- r (source-code reference) is for adding code snippets to the error description.
5.2.2. Statement Traversal
5.2.3. Lifting Error Contexts
- Custom Error Definitions. A custom error is declared with the error keyword. From the source code, ErrorExplainer extracts these declarations and parses each into a pair of a name (n) and an argument type list. The resulting map is stored and later queried during lifting to encode custom errors (Section 5.2.1).
- Error Conditions. For every transaction-reverting statement, ErrorExplainer identifies the predicate that triggers the error. Conditional constructs, such as require and assert, include their own predicates. However, unconditional constructs, such as revert and throw, do not have predicates. In such cases, the effective guard condition is determined from the preceding context. Specifically, the immediately preceding control-flow condition becomes the effective condition for these unconditional error statements.
- Lifting Rules. Traversing the statements in the component, ErrorExplainer applies the rules in Table 4 to lift normalized errors from the transaction-reverting statements, require, assert, revert, and throw.
- require (Rules 1a, 1b, 1c): Rule 1a. The require statement includes the error condition. If the error output is a literal string (e.g., “outString”), ErrorExplainer encodes it via the encoding function for strings. Rule 1b. If the second argument is a custom error, then ErrorExplainer sets the type of the error message to the custom error type and encodes it with the encoding function for custom errors, which retrieves the full signature of the custom error from the custom error definitions and returns only the first four bytes of its Keccak-256 hash, i.e., the custom error selector. Rule 1c. If the error output is empty or omitted, then the three fields (type, binary-encoded message, and error message text) are represented as ⊥.
- assert (Rule 2): An assert statement acts as a predefined custom error of Panic(0x01) without an explicit error message. Thus, it is lifted as a custom error with the Panic signature and the given error condition.
- revert (Rules 3a, 3b, 3c): revert follows the same sub-rules as require for the type, binary-encoded message, and error message text, but it lacks an explicit predicate. Thus, the predicate of the enclosing branch is used as the error condition. Additionally, if revert is used without any arguments, it is lifted in the same way as throw (Rule 3c).
- throw (Rule 4): Legacy throw statements carry neither an error message nor an error condition. Thus, we lift throw into an empty-output normalized error that inherits the error condition from the preceding control-flow condition.
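The lifting rules can be illustrated with a small data class and a single dispatch function. This is a sketch under stated assumptions, not the paper's implementation: the field names, the `lift` signature, and the `custom_errors` map (name to signature and precomputed four-byte selector) are hypothetical. Selectors are passed in precomputed because the Python standard library does not ship Keccak-256 (`hashlib.sha3_256` is the NIST SHA-3 variant and yields different digests).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

ERROR_STRING_SELECTOR = bytes.fromhex("08c379a0")  # Error(string)
PANIC_SELECTOR = bytes.fromhex("4e487b71")         # Panic(uint256)


@dataclass(frozen=True)
class NormalizedError:
    type: Optional[str]       # "string", a custom error signature, or None for ⊥
    encoded: Optional[bytes]  # binary-encoded message, or None for ⊥
    text: Optional[str]       # readable message, or None for ⊥
    condition: Optional[str]  # guard expression, or None for ⊥
    ref: str                  # source-code reference


def encode_string(message: str) -> bytes:
    """ABI-encode Error(string): selector, offset, length, zero-padded data."""
    data = message.encode("utf-8")
    padded = data + b"\x00" * (-len(data) % 32)
    return (ERROR_STRING_SELECTOR + (32).to_bytes(32, "big")
            + len(data).to_bytes(32, "big") + padded)


def lift(kind: str, ref: str, condition: Optional[str] = None,
         branch: Optional[str] = None, message: Optional[str] = None,
         custom: Optional[str] = None,
         custom_errors: Optional[Dict[str, Tuple[str, bytes]]] = None) -> NormalizedError:
    """Lift one transaction-reverting statement into a normalized error.

    condition: predicate written inside require/assert.
    branch: predicate of the enclosing control-flow branch (revert/throw).
    """
    if kind == "assert":                                   # Rule 2
        return NormalizedError("Panic(uint256)", PANIC_SELECTOR, None, condition, ref)
    if kind == "throw":                                    # Rule 4
        return NormalizedError(None, None, None, branch, ref)
    if kind not in ("require", "revert"):
        raise ValueError(f"not a transaction-reverting statement: {kind}")
    guard = condition if kind == "require" else branch
    if message:                                            # Rules 1a and 3a
        return NormalizedError("string", encode_string(message), message, guard, ref)
    if custom is not None:                                 # Rules 1b and 3b
        signature, selector = (custom_errors or {})[custom]
        return NormalizedError(signature, selector, custom, guard, ref)
    return NormalizedError(None, None, None, guard, ref)   # Rules 1c and 3c
```

Under this sketch, a bare `revert` and a `throw` inside the same branch lift to identical normalized errors, which mirrors Rule 3c.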
5.2.4. Updating Error Context
5.3. Inter-Component Analysis
6. Error Description Generation
6.1. Matching Process
- Choose the Richest Error Context: Because information flows forward along the component graph, the deployed contract (i.e., the last vertex in topological order) contains the most complete context. This context is used for matching.
- Build Candidate Sets (Selector Filter): Two candidate pools are constructed from this context using the function selector observed at runtime.
- Function-level set contains only the normalized errors that belong to the function whose selector exactly matches the call site.
- Trace-level set is a superset that includes errors located in any function transitively reachable from the entry point. This set is used if the function-level match fails.
- Compare Encoded Error Outputs (Output Filter): Each candidate carries a normalized error e with its binary-encoded message. The binary error payload emitted on-chain is compared against this encoded message. For custom errors, only the first four bytes (the error selector) are stable across deployments and are therefore used in the comparison.
- Fallback to Trace-Level Candidates: If the function-level comparison yields no match, the procedure is repeated with the trace-level set in place of the function-level set. This handles the case where a proxy causes the initial selector match to fail.
- Emit the Final Match: The final set is the function-level match if it is non-empty, and the trace-level match otherwise. A singleton presents an unambiguous error description. Multiple entries imply that the developer reuses identical messages or empty strings. In such cases, ErrorExplainer concatenates each field of the entries (denoted by the meet operator ⊓).
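The matching steps above can be sketched as two filters with a fallback. This is a hedged illustration, not the paper's code: the `Candidate` fields and the `match_error` name are assumptions, and full-payload equality for string errors is an assumed comparison rule (the text only specifies the four-byte comparison for custom errors).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Candidate:
    function_selector: str          # selector of the enclosing function ("" if internal)
    entry_selectors: FrozenSet[str] # selectors of entry points that can reach it
    encoded: Optional[bytes]        # binary-encoded message, None when there is no message
    is_custom: bool                 # True for custom errors (compare selector only)
    text: Optional[str]             # readable message


def output_matches(candidate: Candidate, payload: bytes) -> bool:
    """Output filter: compare the on-chain payload with the encoded message."""
    if candidate.encoded is None:
        return len(payload) == 0
    if candidate.is_custom:
        return payload[:4] == candidate.encoded[:4]
    return payload == candidate.encoded


def match_error(candidates: List[Candidate], called_selector: str,
                payload: bytes) -> List[Candidate]:
    """Return function-level matches, falling back to trace-level matches."""
    function_level = [c for c in candidates if c.function_selector == called_selector]
    matched = [c for c in function_level if output_matches(c, payload)]
    if matched:
        return matched
    trace_level = [c for c in candidates if called_selector in c.entry_selectors]
    return [c for c in trace_level if output_matches(c, payload)]
```

A result with one element corresponds to an unambiguous description; a longer list would be merged field by field before presentation.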
6.2. Error Description Construction
7. Evaluation
7.1. Research Questions
- RQ1 (Identification accuracy): How thoroughly does ErrorExplainer locate and extract all transaction-reverting statements from the source code of Solidity smart contracts?
- RQ2 (Information completeness): How rich is each extracted error context, and does it supply all data fields required for a useful description?
- RQ3 (Matching fitness): How unambiguously can ErrorExplainer map user-side error information to the correct error context?
7.2. Implementation
7.3. Dataset
- Keep only contract pairs whose mutation log explicitly includes the removal of at least one transaction-reverting statement.
- Discard any pair whose mutated contract no longer compiles.
- Runtime Cost. We measured the computational overhead of ErrorExplainer’s static analysis pipeline. On an Apple silicon M1 (10-core CPU, 32 GB RAM), the end-to-end analysis of a single contract takes, on average, 0.64 s with a standard deviation of 0.55 s across the entire evaluation set. Because blockchain networks typically operate with block times on the order of seconds, this latency is negligible for practical front-end integration and real-time error reporting. Furthermore, this analysis is required only once per contract, and the results can be cached and reused, as the smart contracts remain immutable for their lifetime.
7.4. RQ1: Identification Accuracy
7.5. RQ2: Information Completeness
7.6. RQ3: Matching Fitness
- Function-level matching. The function-level subset is the one in which every pair of contexts differs both in the enclosing function and in the encoded error output.
- Trace-level matching. The trace-level subset comprises contexts whose traces have disjoint entry points but still have distinct error outputs.
8. Discussion and Limitations
8.1. Practical Use Cases
8.2. Abuse of Semantic Error Analysis
8.3. Limitations
9. Related Work
9.1. Error Handling in Smart Contracts
9.2. Static and Dynamic Vulnerability Analysis
9.3. Datasets and Large-Scale Instrumentations
10. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Jang, H.; Han, S.H. User experience framework for understanding user experience in blockchain services. Int. J. Hum.-Comput. Stud. 2022, 158, 102733.
- Panicker, Y.; Soremekun, E.; Sun, S.; Chattopadhyay, S. End-user Comprehension of Transfer Risks in Smart Contracts. arXiv 2024, arXiv:2407.11440.
- Waas, M. Downsizing Contracts to Fight the Contract Size Limit. 2020. Available online: https://ethereum.org/en/developers/tutorials/downsizing-contracts-to-fight-the-contract-size-limit/ (accessed on 25 May 2025).
- Vogelsteller, F.; Buterin, V. ERC-20: Token Standard. 2015. Available online: https://eips.ethereum.org/EIPS/eip-20 (accessed on 25 May 2025).
- Xia, S.; He, M.; Song, L.; Zhang, Y. SC-Bench: A Large-Scale Dataset for Smart Contract Auditing. 2024. Available online: http://arxiv.org/abs/2410.06176 (accessed on 25 May 2025).
- Etherscan. Available online: https://etherscan.io (accessed on 25 May 2025).
- Wood, G. Ethereum: A Secure Decentralised Generalised Transaction Ledger. 2014. Available online: https://ethereum.github.io/yellowpaper/paper.pdf (accessed on 25 May 2025).
- Team, S.C. Solidity 0.8.25 Documentation: Error Handling. 2024. Available online: https://docs.soliditylang.org/en/v0.8.25/control-structures.html (accessed on 25 May 2025).
- Liu, L.; Wei, L.; Zhang, W.; Wen, M.; Liu, Y.; Cheung, S.C. Characterizing transaction-reverting statements in ethereum smart contracts. In Proceedings of the 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, 15–19 November 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 630–641.
- Yan, K.; Zhang, X.; Diao, W. Stealing trust: Unraveling blind message attacks in web3 authentication. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, Salt Lake City, UT, USA, 14–18 October 2024; pp. 555–569.
- Zelenka, B.; Carchrae, T.; Naumenko, G. ERC-1066: Status Codes. 2018. Available online: https://eips.ethereum.org/EIPS/eip-1066 (accessed on 25 May 2025).
- Bond, F.; de Souza, R.R. ERC-838: ABI Specification for REVERT Reason String. 2020. Available online: https://eips.ethereum.org/EIPS/eip-838 (accessed on 25 May 2025).
- García, E.; Giordano, F.; Croubois, H. ERC-6093: Custom Errors for Commonly-Used Tokens. 2022. Available online: https://eips.ethereum.org/EIPS/eip-6093 (accessed on 25 May 2025).
- Entriken, W.; Shirley, D.; Evans, J.; Sachs, N. ERC-721: Non-Fungible Token Standard. 2018. Available online: https://eips.ethereum.org/EIPS/eip-721 (accessed on 25 May 2025).
- Radomski, W.; Cooke, A.; Castonguay, P.; Therien, J.; Binet, E.; Sandford, R. ERC-1155: Multi Token Standard. 2018. Available online: https://eips.ethereum.org/EIPS/eip-1155 (accessed on 25 May 2025).
- Zelenka, B.; Cooper, J. ERC-1444: Localized Messaging with Signal-to-Text. 2018. Available online: https://eips.ethereum.org/EIPS/eip-1444 (accessed on 25 May 2025).
- Meisami, S.; Bodell, W.E., III. A Comprehensive Survey of Upgradeable Smart Contract Patterns. 2023. Available online: https://arxiv.org/abs/2304.03405 (accessed on 25 May 2025).
- Ethereum Foundation. Ethereum JSON-RPC API Specification. 2024. Available online: https://ethereum.org/en/developers/docs/apis/json-rpc/ (accessed on 25 May 2025).
- Li, X.; Yang, J.; Chen, J.; Tang, Y.; Gao, X. Characterizing ethereum upgradable smart contracts and their security implications. In Proceedings of the ACM Web Conference 2024, Singapore, 13–17 May 2024; pp. 1847–1858.
- Palladino, S.; Giordano, F.; Croubois, H. EIP-1967: Upgradeable Proxy Storage Slots. 2019. Available online: https://eips.ethereum.org/EIPS/eip-1967 (accessed on 25 May 2025).
- OpenZeppelin. OpenZeppelin Contract Library. Available online: https://docs.openzeppelin.com/contracts/5.x/ (accessed on 25 May 2025).
- Nadolinski, E.; Spagnuolo, F. Proxy Patterns. 2018. Available online: https://blog.openzeppelin.com/proxy-patterns/ (accessed on 25 May 2025).
- Crytic. Slither, the Smart Contract Static Analyzer. 2024. Available online: https://github.com/crytic/slither (accessed on 25 May 2025).
- Ethereum Foundation. web3.py: A Python Library for Interacting with Ethereum. Available online: https://github.com/ethereum/web3.py (accessed on 25 May 2025).
- Liao, Z.; Hao, S.; Nan, Y.; Zheng, Z. SmartState: Detecting state-reverting vulnerabilities in smart contracts via fine-grained state-dependency analysis. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, Seattle, WA, USA, 17–21 July 2023; pp. 980–991.
- Jiang, B.; Liu, Y.; Chan, W.K. Contractfuzzer: Fuzzing smart contracts for vulnerability detection. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, Montpellier, France, 3–7 September 2018; pp. 259–269.
- Mitropoulos, C.; Kechagia, M.; Maschas, C.; Ioannidis, S.; Sarro, F.; Mitropoulos, D. Charting The Evolution of Solidity Error Handling. arXiv 2024, arXiv:2402.03186.
- Mitropoulos, C.; Kechagia, M.; Maschas, C.; Ioannidis, S.; Sarro, F.; Mitropoulos, D. Broken Agreement: The Evolution of Solidity Error Handling. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, Barcelona, Spain, 24–25 October 2024; pp. 257–268.
- Luu, L.; Chu, D.H.; Olickel, H.; Saxena, P.; Hobor, A. Making smart contracts smarter. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 254–269.
- Mueller, B. Smashing ethereum smart contracts for fun and real profit. HITB SECCONF Amst. 2018, 9, 4–17.
Function | Error Condition | Description |
---|---|---|
transfer | Insufficient balance of sender | The sender account does not have enough balance to complete the transfer. |
transfer | recipient is address(0) | Transfers to the zero address are not allowed, preventing token loss. |
approve | spender is address(0) | Approval to the zero address is disallowed to avoid confusion regarding allowance ownership. |
⋮ | ⋮ | ⋮ |
Returned Data to User | ErrorExplainer-Extracted Error Context |
---|---|
(binary) | Message: “ERC20: transfer amount exceeds balance” |
 | Origin: “_transfer” |
 | Error condition: “senderBalance >= amount” |
 | Code snippet: “require(senderBalance >= amount,⋯);” |
Pattern | Storage Slot |
---|---|
EIP-1967 | 0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc |
Transparent Proxy | 0x5c60da1b00000000000000000000000000000000000000000000000000000000 |
ZeppelinOS | 0x7050c9e0f4ca769c69bd3a8ef740bc37934f8e2c036e5a723fd8ee048ed3f8c3 |
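Rule No. | Statement Pattern | Normalized Error |
---|---|---|
1a | require(condition, “message”) | String type; message encoded with the string encoding function; error condition taken from the statement |
1b | require(condition, CustomError(args)) | Custom error type; encoded as the four-byte custom error selector; error condition taken from the statement |
1c | require(condition) | Type, binary-encoded message, and message text are ⊥; error condition taken from the statement |
2 | assert(condition) | Predefined custom error Panic(0x01); error condition taken from the statement |
3a | revert(“message”) | As Rule 1a, with the enclosing branch predicate as the error condition |
3b | revert CustomError(args) | As Rule 1b, with the enclosing branch predicate as the error condition |
3c | revert() | Lifted in the same way as throw (Rule 4) |
4 | throw | Empty output; error condition inherited from the preceding control-flow condition |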
Field | Description | Matched Parameter |
---|---|---|
Error Message | Human-readable string that explains what went wrong and why. | Error message text of the matched normalized error |
Origin | Enclosing component where the normalized error is defined. Origin names carry semantic cues (e.g., onlyOwner) to understand the error. | Enclosing component of the matched normalized error |
Error Condition | Boolean guard condition that triggers the revert. | Error condition of the matched normalized error |
Source Code Snippet | Code excerpt around the transaction-reverting statement, useful for users or tools (e.g., LLMs). | Source-code reference r of the matched normalized error |
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, J. ErrorExplainer: Automated Extraction of Error Contexts from Smart Contracts. Appl. Sci. 2025, 15, 6006. https://doi.org/10.3390/app15116006