Heuristics Analyses of Smart Contracts Bytecodes and Their Classifications
Abstract
1. Introduction
2. Background and Literature Review
2.1. Blockchain Concepts
2.2. Literature Review
3. Methodology
3.1. Method for Standard Token Contracts Identification
- ERC20: totalSupply(), balanceOf(address), transfer(address, uint256), allowance(address, address), approve(address, uint256), transferFrom(address, address, uint256)
- ERC721: balanceOf(address), ownerOf(uint256), getApproved(uint256), setApprovalForAll(address, bool), isApprovedForAll(address, address), transferFrom(address, address, uint256), safeTransferFrom(address, address, uint256), safeTransferFrom(address, address, uint256, bytes)
- ERC1155: balanceOf(address, uint256), balanceOfBatch(address[], uint256[]), setApprovalForAll(address, bool), isApprovedForAll(address, address), safeTransferFrom(address, address, uint256, uint256, bytes), safeBatchTransferFrom(address, address, uint256[], uint256[], bytes)
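The function lists above can be turned into a selector-matching check. The following is a minimal sketch, not the classifier described in this paper: it assumes a contract counts as a candidate for a standard when every listed function's 4-byte selector appears as a PUSH immediate in its runtime bytecode. The helper names `push_immediates` and `matched_standards` are hypothetical, and the zero-padding of shorter pushes assumes that a compiler may drop leading zero bytes of a selector.

```python
# Selectors are the first 4 bytes of keccak256 of each canonical signature.
ERC_SELECTORS = {
    "ERC20": {
        "18160ddd",  # totalSupply()
        "70a08231",  # balanceOf(address)
        "a9059cbb",  # transfer(address,uint256)
        "dd62ed3e",  # allowance(address,address)
        "095ea7b3",  # approve(address,uint256)
        "23b872dd",  # transferFrom(address,address,uint256)
    },
    "ERC721": {
        "70a08231",  # balanceOf(address)
        "6352211e",  # ownerOf(uint256)
        "081812fc",  # getApproved(uint256)
        "a22cb465",  # setApprovalForAll(address,bool)
        "e985e9c5",  # isApprovedForAll(address,address)
        "23b872dd",  # transferFrom(address,address,uint256)
        "42842e0e",  # safeTransferFrom(address,address,uint256)
        "b88d4fde",  # safeTransferFrom(address,address,uint256,bytes)
    },
    "ERC1155": {
        "00fdd58e",  # balanceOf(address,uint256)
        "4e1273f4",  # balanceOfBatch(address[],uint256[])
        "a22cb465",  # setApprovalForAll(address,bool)
        "e985e9c5",  # isApprovedForAll(address,address)
        "f242432a",  # safeTransferFrom(address,address,uint256,uint256,bytes)
        "2eb2c2d6",  # safeBatchTransferFrom(address,address,uint256[],uint256[],bytes)
    },
}


def push_immediates(bytecode):
    """Collect PUSH1..PUSH4 immediates, zero-padded to 4 bytes, as hex strings."""
    found = set()
    i = 0
    while i < len(bytecode):
        op = bytecode[i]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry (op - 0x5F) data bytes
            size = op - 0x5F
            data = bytecode[i + 1 : i + 1 + size]
            if size <= 4 and len(data) == size:
                found.add(data.rjust(4, b"\x00").hex())
            i += 1 + size  # skip the immediate so data is never read as opcodes
        else:
            i += 1
    return found


def matched_standards(bytecode):
    """Return the standards whose full selector set occurs in the bytecode."""
    seen = push_immediates(bytecode)
    return [name for name, required in ERC_SELECTORS.items() if required <= seen]
```

A match of this kind only shows that the selectors are present; it says nothing about whether the functions behave as the standard requires.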
3.2. Method for Non-Standard Contracts Identification
3.3. Method for Evaluation
4. Design and Implementation of Classifier
4.1. Architecture of Smart Contract Classifier
4.2. PoC Implementation
4.3. Evaluation Results of the Smart Contract Classifier
5. Risk Analyses and Mitigation Strategies
5.1. Risk Analyses
5.2. Risk Mitigation with Simulation of Smart Contract Function Behaviour Using Symbolic Execution Environment
5.2.1. Symbolic Execution Verification Process of Standard Contract Functions
Algorithm 1. Verify ERC20 Transfer Balance Delta
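As an illustration of what a balance-delta check for an ERC20 transfer could look like, the sketch below compares a pre-state and a post-state on concrete values. It is a simplified stand-in under stated assumptions, not a symbolic execution: balances are modelled as plain dictionaries, and the function name `verify_transfer_delta` and its parameters are hypothetical.

```python
def verify_transfer_delta(pre, post, sender, recipient, amount):
    """Check that `post` equals `pre` with `amount` moved from sender to recipient.

    `pre` and `post` map addresses to balances (missing addresses count as 0).
    Returns False if the sender could not have covered the transfer or if any
    balance differs from the expected post-state.
    """
    if amount < 0 or pre.get(sender, 0) < amount:
        return False  # a successful transfer must not overdraw the sender

    # Build the expected post-state; a self-transfer nets out to no change.
    expected = dict(pre)
    expected[sender] = expected.get(sender, 0) - amount
    expected[recipient] = expected.get(recipient, 0) + amount

    # Every account, including bystanders, must match the expected balance.
    accounts = set(expected) | set(post)
    return all(expected.get(a, 0) == post.get(a, 0) for a in accounts)
```

For example, a post-state that credits the recipient without debiting the sender fails the check, which is the kind of deviation a contract with a correct `transfer` fingerprint but non-standard logic would produce.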
5.2.2. Simulation Parameters and Algorithm for ERC20 Transfer Function
- Notations used for the simulation description:
- C: contract bytecode under test
- : pre-state (storage)
- : post-state (storage)
- : path constraints from symbolic execution
- : symbolic execution of a call, producing terminal states
- : abstract accessor for the balance mapping at address a in state
- : 4-byte function selector
- : ABI encoding of parameters (address, uint256)
- Simulation input Parameters and Preconditions:
- Initial State of the relevant variables:
- Call Data and Context:
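The call data for an ERC20 transfer is the 4-byte function selector followed by the ABI encoding of the (address, uint256) parameters. The sketch below shows one way to build it; the function name `encode_transfer_calldata` is hypothetical, and the encoding follows the standard static-type ABI layout (each argument in a 32-byte word, address left-padded with zeros, integer big-endian).

```python
# Selector of transfer(address,uint256): first 4 bytes of its keccak256 hash.
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")


def encode_transfer_calldata(to, amount):
    """Return selector + ABI-encoded (address, uint256) as 68 bytes."""
    hex_addr = to[2:] if to.startswith("0x") else to
    addr = bytes.fromhex(hex_addr)
    if len(addr) != 20:
        raise ValueError("address must be 20 bytes")
    if not 0 <= amount < 2**256:
        raise ValueError("amount must fit in uint256")
    # Each static argument occupies one 32-byte word.
    return TRANSFER_SELECTOR + addr.rjust(32, b"\x00") + amount.to_bytes(32, "big")
```

In a symbolic setting the recipient and amount words would be left as symbolic values rather than fixed bytes; the layout of the call data stays the same.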
6. Classifier Application and Analyses of Historical Contracts
6.1. High-Level Analyses of Deployed Smart Contracts
6.2. In-Depth Pattern Analyses of Smart Contract Categories
7. Discussion
7.1. Research Implication and Patterns from the Deployed Smart Contracts
7.2. Comparison of ML Approach and Heuristics in Token Contracts Identification
7.3. Beyond Main Classification Categories
7.4. Risk Implications
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Suiche, M. Porosity: A decompiler for blockchain-based smart contracts bytecode. DEF CON 2017, 25, 1–29.
2. Grech, N.; Brent, L.; Scholz, B.; Smaragdakis, Y. Gigahorse: Thorough, declarative decompilation of smart contracts. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), Montreal, QC, Canada, 25–31 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1176–1186.
3. Etherscan. Verified Contracts. Available online: https://etherscan.io/contractsverified (accessed on 13 June 2025).
4. Udokwu, C.; Kormiltsyn, A.; Thangalimodzi, K.; Norta, A. The state of the art for blockchain-enabled smart-contract applications in the organization. In Proceedings of the 2018 Ivannikov Ispras Open Conference (ISPRAS), Moscow, Russia, 22–23 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 137–144.
5. Li, X.; Chen, T.; Luo, X.; Zhang, T.; Yu, L.; Xu, Z. Stan: Towards describing bytecodes of smart contract. In Proceedings of the 2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS), Macau, China, 11–14 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 273–284.
6. Di Angelo, M.; Salzer, G. Assessing the similarity of smart contracts by clustering their interfaces. In Proceedings of the 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Guangzhou, China, 29 December 2020–1 January 2021; IEEE: Piscataway, NJ, USA, 2020; pp. 1910–1919.
7. Sezer, S.; Eyhoff, C.; Prinz, W.; Rose, T. Exploiting Smart Contract Bytecode for Classification on Ethereum. In Proceedings of the PoEM Workshops, Riga, Latvia, 26 November 2020; pp. 11–22.
8. Ortu, M.; Ibba, G.; Destefanis, G.; Conversano, C.; Tonelli, R. Taxonomic insights into Ethereum smart contracts by linking application categories to security vulnerabilities. Sci. Rep. 2024, 14, 23433.
9. Di Angelo, M.; Salzer, G. Identification of token contracts on Ethereum: Standard compliance and beyond. Int. J. Data Sci. Anal. 2023, 16, 333–352.
10. Fekih, R.B.; Lahami, M.; Jmaiel, M.; Bradai, S. Formal modeling and verification of ERC smart contracts: Application to NFT. In Proceedings of the 2023 IEEE Symposium on Computers and Communications (ISCC), Gammarth, Tunisia, 9–12 July 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 556–561.
11. Chirtoaca, D.; Ellul, J.; Azzopardi, G. A framework for creating deployable smart contracts for non-fungible tokens on the Ethereum blockchain. In Proceedings of the 2020 IEEE International Conference on Decentralized Applications and Infrastructures (DAPPS), Oxford, UK, 3–6 August 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 100–105.
12. Long, H.W.; Si, Y.W. Token Fungibility Duality: Technical and Graphical Analysis on 404 Standards. In Proceedings of the 2024 IEEE International Conference on Blockchain (Blockchain), Copenhagen, Denmark, 19–22 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 252–259.
13. Jensen, J.R.; von Wachter, V.; Ross, O. An introduction to decentralized finance (DeFi). Complex Syst. Inform. Model. Q. 2021, 26, 46–54.
14. Ding, W.W.; Liang, X.; Hou, J.; Wang, G.; Yuan, Y.; Li, J.; Wang, F.Y. Parallel governance for decentralized autonomous organizations enabled by blockchain and smart contracts. In Proceedings of the 2021 IEEE 1st International Conference on Digital Twins and Parallel Intelligence (DTPI), Beijing, China, 15 July–15 August 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–4.
15. Ou, W.; Huang, S.; Zheng, J.; Zhang, Q.; Zeng, G.; Han, W. An overview on cross-chain: Mechanism, platforms, challenges and advances. Comput. Netw. 2022, 218, 109378.
16. Bodell, W.E., III; Meisami, S.; Duan, Y. Proxy hunting: Understanding and characterizing proxy-based upgradeable smart contracts in blockchains. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023; pp. 1829–1846.
17. Zou, W.; Lo, D.; Kochhar, P.S.; Le, X.B.D.; Xia, X.; Feng, Y.; Chen, Z.; Xu, B. Smart contract development: Challenges and opportunities. IEEE Trans. Softw. Eng. 2019, 47, 2084–2106.
18. Shi, C.; Xiang, Y.; Yu, J.; Gao, L.; Sood, K.; Doss, R.R.M. A bytecode-based approach for smart contract classification. In Proceedings of the 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Honolulu, HI, USA, 15–18 March 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1046–1054.
19. Tian, G.; Wang, Q.; Zhao, Y.; Guo, L.; Sun, Z.; Lv, L. Smart contract classification with a Bi-LSTM based approach. IEEE Access 2020, 8, 43806–43816.
20. El Haddouti, S.; Khaldoune, M.; Ayache, M.; Ech-Cherif El Kettani, M.D. Smart contracts auditing and multi-classification using machine learning algorithms: An efficient vulnerability detection in Ethereum blockchain. Computing 2024, 106, 2971–3003.
21. AlSobeh, A.; Shatnawi, A.; Magableh, A. AspectFL: Aspect-Oriented Programming for Trustworthy and Compliant Federated Learning Systems. Information 2025, 16, 1048.





| Approach | Method | Data Type | Contract Type | Categories | Evaluation |
|---|---|---|---|---|---|
| ML [5] | NLP | Static (Bytecode) | Non-standard | Function descriptions | Informal (interviews) |
| ML [7] | NB, SVM, LR, RF, XGBoost | Static (Bytecode) | Non-standard | Voting, Auction, Entity management, Renting, Trading | Formal (MCC Score) |
| Heuristic [9] | Rule-based | Static (Bytecode) | Standard | ERC20, ERC721 | Formal (F1 Score) |
| ML [18] | BPSO, AdaBoost | Static (Bytecode) | Non-standard | Governance, Finance, Gambling, Game, Wallet, Social | Formal (F1 Score) |
| ML [8] | Unsupervised LDA | Static (Code) | Non-standard | Notary, Token, Game, Financial, Blockchain interaction | Formal (Cohen’s kappa) |
| ML [19] | LSTM, LDA | Static (Code) | Non-standard | Entertainment, Tools, Management, Finance, Lottery, IoT | Formal (F1 Score) |
| ML [6] | Unsupervised weighted graphs | Static (Bytecode) | Standard, Non-standard | All ERC tokens, Multi-sig wallet types | Formal (Jaccard similarity) |
| ML [20] | RF, KNN | Static (Bytecode) | Non-standard | Reentrancy, Integer overflow, Tx-Ordering, Transaction origin use, Unchecked suicide | Formal (F1 Score) |

| Type | Classifier | Precision (%) | Recall (%) | F1-Score (%) | Avg. F1 (%) |
|---|---|---|---|---|---|
| Standard | ERC20 | 100 | 96 | 98 | 99 |
| Standard | ERC721 | 100 | 100 | 100 | 99 |
| Standard | ERC1155 | 100 | 100 | 100 | 99 |
| Non-standard | DeFi | 88 | 85 | 88 | 93 |
| Non-standard | Governance | 88 | 96 | 92 | 93 |
| Non-standard | Cross-chain | 96 | 96 | 96 | 93 |
| Non-standard | Proxy | 100 | 90 | 95 | 93 |
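For reference, the F1-score is the harmonic mean of precision and recall. The sketch below computes it and a per-group average, assuming that the average F1 column is the unweighted mean of the per-class F1-scores in each group; the function names are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit in, same unit out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean(values):
    """Unweighted arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# ERC20 row: precision 100 and recall 96 give an F1-score of about 98.
erc20_f1 = f1_score(100, 96)

# Group averages computed from the F1-scores listed in the table.
standard_avg = mean([98, 100, 100])        # about 99.3
non_standard_avg = mean([88, 92, 96, 95])  # 92.75
```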

| Risk ID | Vulnerability Description | Exploit Summary | Impact Summary | Likelihood | Severity | Risk Score |
|---|---|---|---|---|---|---|
| R1 | Four-Byte Hash Collisions | Collision-based function mimicry | Misclassification; financial & reputational loss | High | Critical | Critical |
| R2 | Incomplete Feature Set/Rule Coverage | Novel patterns bypassing rules | Misclassification; blind spots; reduced utility | Medium | High | High |
| R3 | Static Analysis Constraints | Runtime-dependent logic bypassing static analysis | Undetected vulnerabilities; false security | Medium | High | High |
| R4 | Fake Function Headers/Semantic Misdirection | Correct fingerprints with malicious logic | Deception; financial fraud risk | High | Critical | Critical |
| R5 | Function Fingerprint Obfuscation | Assembly or proxies for obfuscation | Evasion; misclassification | Medium | High | High |
| R6 | Data Freshness and Completeness | Crawler delays causing stale data | Delayed classification; reduced utility; inefficiencies | Medium | Medium | Medium |
| R7 | Manual Verification Bottlenecks/Ground Truth Reliability | Dependence on small biased dataset | Overestimation; misinformed trust | High | Medium | High |

| Risk ID | Vulnerability Description | Mitigation Summary | Implemented |
|---|---|---|---|
| R1 | Four-Byte Hash Collisions | Extended fingerprinting; semantic analysis; hybrid ML | Extended fingerprinting |
| R2 | Incomplete Feature Set/Rule Coverage | Hybrid ML; continuous monitoring; rule refinement | Continuous monitoring, Manual rule refinement |
| R3 | Static Analysis Constraints | Selective dynamic analysis; Semantic analysis (symbolic execution), vulnerability assessment tool | Symbolic execution (ERC functions) |
| R4 | Fake Function Headers/Semantic Misdirection | Semantic analysis; extended fingerprinting; hybrid ML | Symbolic execution (ERC functions) |
| R5 | Function Fingerprint Obfuscation | Extended fingerprinting; semantic analysis; hybrid ML | Symbolic execution (ERC functions) |
| R6 | Data Freshness and Completeness | Enhanced crawler scalability; performance monitoring | Enhanced crawler scalability |
| R7 | Manual Verification Bottlenecks/Ground Truth Reliability | Enhanced data sourcing & ground truth generation (multi-source, semi-automated), Continuous performance monitoring | Continuous performance monitoring |

| Token Contract | Ref [8] Prediction | Ref [8] Outcome | Our Method Prediction | Our Method Outcome | Explanation |
|---|---|---|---|---|---|
| 0x0077d27cb82ff12322987b225bfce0bb6e8931b4 | 1 | TP | 1 | TP | |
| 0x0079453f683380c7493d4bc4fa9baac97c5e693c | 1 | FP | 0 | TN | ERC20 function not implemented |
| 0x007bead59a807eb50aef56e80e3aecbab9a3026e | 1 | TP | 1 | TP | |
| 0x007da60ea2a53c09f5cdb1b5339d8cebe4409744 | 1 | TP | 1 | TP | |
| 0x007dfb0c30f55ccac0191387fe5ffc9cfde519c0 | 1 | TP | 1 | TP | |
| 0x0080cfc1b3177a45a4459b2e85cd202c26b37eb9 | 1 | TP | 1 | TP | |
| 0x008a548284F2E66A1150f4306492b0f5d82b3283 | 1 | TP | 1 | TP | |
| 0x008da6dfe18c61844d614294932d52c50323d722 | 1 | TP | 1 | TP | |
| 0x008eeef21c0dab336deba4c89d449c5e2593463d | 1 | TP | 1 | TP | |
| 0x008f1d94ad209a5cc9439BA515f619F1d015412e | 1 | TP | 1 | TP | |
| 0x0095a819919f3409e58128304b8b2b06b29e77be | 1 | TP | 1 | TP | |
| 0x0099686345e611F4c7646aaba8BCC535e150C20E | 1 | TP | 1 | TP | |
| 0x009c43B42AEFAC590C719E971020575974122803 | 1 | TP | 1 | TP | |
| 0x009fa1ebc188022c4391c69ef63f1323d358e987 | 1 | TP | 1 | TP | |
| 0x00A55375002f3cDa400383F479e7Cd57Bad029A9 | 1 | TP | 1 | TP | |
| 0x00E3c1F30dC416dBF841435cB1b2188c1A268F7E | 1 | TP | 1 | TP | |
| 0x00E9303e0fA754751C417E33FdBC031F0cc01360 | 1 | FP | 0 | TN | ERC20 function not implemented |
| 0x00EAeA176307159B928CCD4A8b9b33c2955092Db | 1 | FP | 0 | TN | ERC20 function not implemented |
| 0x00a73102c76647055e8b93f3d662cab686e5638e | 1 | TP | 1 | TP | |
| 0x00a9a70b94fc1f97141f99d90a3471cf49edadd9 | 1 | TP | 1 | TP | |
| 0x00a8b738e453ffd858a7edf03bccfe20412f0eb0 | 1 | TP | 1 | TP | |
| 0x00c3a4ea499cf8a68f26ec78fad0bd2be28c2769 | 1 | TP | 1 | TP | |
| 0x00d14753f126286502a3aa6df97a949a951398c9 | 1 | TP | 1 | TP | |
| 0x0107d006806d07d32efe5fad1c68b7b63b90e08c | 1 | FP | 0 | TN | Only the transfer function is implemented |
| 0x0114622386c1a00686e594c70682d7aa0f8afa29 | 1 | TP | 1 | TP |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Udokwu, C.; Mirhosseini, S.A.M.; Craß, S. Heuristics Analyses of Smart Contracts Bytecodes and Their Classifications. Electronics 2026, 15, 41. https://doi.org/10.3390/electronics15010041

