Property-Based Testing for Cybersecurity: Towards Automated Validation of Security Protocols
Abstract
1. Introduction
- We provide a formal overview of property-based testing and its theoretical underpinnings, emphasizing its suitability for complex security-critical systems.
- We analyze the specific strengths of PBT in the cybersecurity context, particularly in input space exploration, fuzzing, and protocol validation.
- We present illustrative application scenarios and case studies, including authentication flows and cryptographic APIs, highlighting how PBT can uncover protocol design flaws.
- We introduce a real-world case study applying PBT to OAuth 2.0 and compare PBT to fuzzing and static analysis in terms of effectiveness and practicality.
- We discuss key challenges such as oracle design, scalability, and integration with CI/CD pipelines, offering concrete suggestions and implementation strategies.
- We propose future directions to extend the reach of PBT, including hybridization with formal verification and intelligent test generation through ML.
2. Background
2.1. Overview of Testing Methodologies in Cybersecurity
- Unit Testing—Testing individual components in isolation to verify their correctness.
- Integration Testing—Verifying the interaction between components.
- System Testing—Ensuring the system, as a whole, meets functional requirements.
- Penetration Testing—Simulating real-world attacks to evaluate security defenses.
- Formal Verification—Using mathematical proofs to verify the correctness of system models or implementations.
2.2. Introduction to Property-Based Testing
- Automated Input Generation: Test inputs are randomly or systematically generated using data generators.
- Property Checking: A property P(x) is evaluated for each generated input x. If a property fails, the failing input is shrunk to a minimal counterexample for debugging.
- Invariants: Conditions that must hold for every generated input and after every operation, for example, that a parser never crashes or that a sanitizer's output contains no markup characters.
- Preconditions and Postconditions: Constraints that must hold before an operation is applied and guarantees about the resulting state afterwards, following the classic pre/postcondition style of reasoning. A minimal sketch combining these elements is given after this list.
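The following is a minimal sketch of this workflow, assuming the Hypothesis library; the sanitize function and the properties checked are illustrative stand-ins rather than examples taken from the study.

```python
from hypothesis import given, strategies as st

def sanitize(s: str) -> str:
    # Illustrative system under test: strip angle brackets from the input.
    return s.replace("<", "").replace(">", "")

@given(st.text())  # automated input generation over arbitrary Unicode strings
def test_sanitize_properties(s):
    out = sanitize(s)
    # Postcondition / invariant: no markup characters survive sanitization.
    assert "<" not in out and ">" not in out
    # Idempotence: sanitizing an already-sanitized string changes nothing.
    assert sanitize(out) == out
```

If either assertion fails, Hypothesis shrinks the failing string to a minimal counterexample (typically a single offending character) before reporting it.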
3. Why Property-Based Testing for Security?
3.1. Strengths of Property-Based Testing
- Generative Testing with Constraints: Modern PBT frameworks support constrained and structured input generation. This is especially important for security, where the inputs must conform to specific formats (e.g., valid certificates, encrypted payloads, or session tokens) [12]; a generator sketch follows this list.
- Incremental Formalization: PBT enables developers to encode behaviors as mathematical properties, such as idempotence or state transitions, without requiring full formal models or theorem provers, effectively bridging the gap between informal testing and formal verification [3].
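As a minimal sketch of such constrained generation (the three-segment, JWT-like token format and all identifiers below are illustrative assumptions, not artifacts from the study), a composite Hypothesis strategy can emit inputs that are always well formed yet otherwise randomized:

```python
import base64
import json
from hypothesis import given, strategies as st

def b64url(raw: bytes) -> str:
    # URL-safe base64 without padding, as used in JWT-style tokens.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

# Structured strategy: header.payload.signature, each segment base64url-encoded.
token_strategy = st.builds(
    lambda header, payload, sig: ".".join([
        b64url(json.dumps(header).encode()),
        b64url(json.dumps(payload).encode()),
        b64url(sig),
    ]),
    header=st.fixed_dictionaries({"alg": st.sampled_from(["HS256", "RS256"])}),
    payload=st.fixed_dictionaries({"sub": st.text(min_size=1, max_size=8)}),
    sig=st.binary(min_size=32, max_size=32),
)

@given(token_strategy)
def test_wellformed_tokens(token):
    # In practice the token would be fed to the parser or validator under test;
    # here we only check the structural guarantee of the generator itself.
    assert token.count(".") == 2
```

Because every generated token is syntactically valid, the test exercises deeper validation logic instead of being rejected at the parsing stage.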
3.2. Security-Specific Motivations for Property-Based Testing
- Protocol Fuzzing with Structured Input: Traditional fuzzers use purely random input, which often leads to invalid states. PBT, in contrast, supports structured fuzzing—i.e., inputs that are syntactically or semantically valid but still randomized. This increases code coverage while preserving meaningfulness [7,14]. For example, property-based fuzzing has been used to validate cryptographic code even in the absence of oracles by using metamorphic properties [5].
- State Machines and Session Validation: Many security protocols (e.g., TLS, OAuth, and SSH) can be modeled as finite state machines. PBT can generate sequences of actions to verify that state transitions comply with protocol specifications. This enables the detection of logic flaws such as unauthorized transitions, inconsistent session handling, or privilege escalations [3].
- Cryptographic Assumptions as Properties: Cryptographic correctness assumptions can be encoded directly as PBT properties, such as the round-trip identity decrypt(encrypt(m, k), k) = m for all messages m and keys k.
- Flaw Discovery Without Oracles: One of the most compelling use cases of PBT in security is testing when expected outputs are hard to specify (e.g., for hash functions or cryptographic noise). PBT allows the specification of metamorphic or relational properties rather than exact outputs, which is a known technique for oracle-less validation [3,5]; a short sketch of this style follows this list.
- Real-World Application Example (Preview): In Section 4.8, we demonstrate these principles in practice by applying PBT to an OAuth 2.0 authorization flow. Using Hypothesis and structured test generators, we uncovered protocol design flaws and validated property enforcement without relying on fixed oracle outputs.
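A minimal sketch of such oracle-less, metamorphic checking, assuming Python's hashlib; the function and test names are illustrative and not taken from the study:

```python
import hashlib
from hypothesis import given, strategies as st

def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

@given(st.binary())
def test_hash_is_deterministic(data):
    # Relational property: hashing the same input twice yields the same digest,
    # without ever stating what that digest should be.
    assert digest(data) == digest(data)

@given(st.binary(), st.binary(min_size=1))
def test_appending_bytes_changes_digest(data, suffix):
    # Metamorphic property: extending the input must change the digest;
    # a failure here would be a practical SHA-256 collision.
    assert digest(data) != digest(data + suffix)
```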
4. Case Studies/Application Domains
4.1. Functional Primitives and Property Fidelity
4.2. Secure Flows and Stateful Protocols (OAuth Preview)
- Testing state transitions (e.g., token issuance only after authorization).
- Validating token constraints (e.g., expiry, reuse, and integrity).
- Detecting invalid access paths (e.g., skipping login steps); a stateful-testing sketch follows this list.
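A minimal sketch of this style using Hypothesis's stateful testing API; the OAuthFlowMachine model and its rules are illustrative assumptions rather than the harness used in Section 4.8:

```python
from hypothesis.stateful import RuleBasedStateMachine, invariant, rule

class OAuthFlowMachine(RuleBasedStateMachine):
    """Model of the expected flow: tokens may only exist after authorization."""

    def __init__(self):
        super().__init__()
        self.authorized = False
        self.token = None

    @rule()
    def authorize(self):
        # In a real harness this step would drive the authorization endpoint.
        self.authorized = True

    @rule()
    def issue_token(self):
        # Token issuance is only legitimate after authorization.
        if self.authorized:
            self.token = "token"

    @invariant()
    def token_requires_authorization(self):
        assert self.token is None or self.authorized

# Hypothesis explores random interleavings of the rules above.
TestOAuthFlow = OAuthFlowMachine.TestCase
```

Hypothesis generates random sequences of authorize and issue_token calls, checks the invariant after every step, and shrinks any violating sequence to a minimal trace.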
4.3. Access Control Verification via Role-Based Policies
4.4. TLS Handshake State Validation
4.5. Case Study: Buggy Encryption Interface
4.6. Case Study: TLS Transition Validation
- S: a set of protocol states (e.g., Start, ClientHello, ServerHello, Finished)
- A: a set of actions (e.g., sending/receiving messages)
- δ: S × A → S: a transition function (instantiated in the sketch below)
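A minimal sketch of how such a machine can be encoded and checked; the state names, action names, and transition table are simplified assumptions consistent with the handshake model above:

```python
from hypothesis import given, strategies as st

ACTIONS = ["send_client_hello", "send_server_hello", "send_finished"]

# δ as an explicit (partial) transition table; missing pairs are invalid moves.
TRANSITIONS = {
    ("Start", "send_client_hello"): "ClientHello",
    ("ClientHello", "send_server_hello"): "ServerHello",
    ("ServerHello", "send_finished"): "Finished",
}

def delta(state, action):
    return TRANSITIONS.get((state, action))

@given(st.lists(st.sampled_from(ACTIONS), min_size=1, max_size=6))
def test_finished_requires_server_hello(actions):
    # Replay a random action sequence from Start, ignoring invalid actions.
    state, visited = "Start", set()
    for action in actions:
        nxt = delta(state, action)
        if nxt is not None:
            state = nxt
            visited.add(nxt)
    # Property: the Finished state is only reachable after ServerHello.
    assert state != "Finished" or "ServerHello" in visited
```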
4.7. Case Study: OAuth Flow Validation
4.8. Real-World Implementation: OAuth 2.0 via Requests-OAuthlib
- No access to protected endpoints without a token.
- Proper expiration and refresh behavior (the expiry property is sketched below).
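A minimal sketch of the expiry property; the is_token_accepted helper is an assumed stand-in for the real resource-server check, which in the study is exercised through requests-oauthlib:

```python
import time
from hypothesis import given, strategies as st

def is_token_accepted(token: dict, now: float) -> bool:
    # Stand-in for the resource server's decision; the real harness would
    # issue an HTTP request with the token attached.
    return token.get("access_token") is not None and token["expires_at"] > now

@given(st.floats(min_value=1.0, max_value=3600.0))
def test_expired_token_is_rejected(age_seconds):
    now = time.time()
    expired = {"access_token": "abc", "expires_at": now - age_seconds}
    # Property: a token whose expiry lies in the past must never grant access.
    assert not is_token_accepted(expired, now)
```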
4.9. Comparative Evaluation of PBT vs. Other Techniques
5. Tooling and Integration
5.1. Overview of Property-Based Testing Tools
- QuickCheck (Haskell): The original PBT framework, QuickCheck, pioneered input generation and shrinking for verifying functional correctness. It is particularly valued for its algebraic abstractions, making it ideal for protocol logic modeling and data structure invariants [20].
- Hypothesis (Python): Designed for ease of use, Hypothesis integrates seamlessly with Pytest and supports property-based testing in both unit and stateful scenarios. Its support for adaptive shrinking, custom strategies, and integration into CI/CD workflows makes it a strong choice for the rapid validation of web protocols, REST APIs, and cryptographic interfaces.
- PropEr (Erlang): Built for concurrency, PropEr excels in testing distributed systems and telecommunication protocols. It supports symbolic and state-based models, and its ability to simulate message passing and interleaved execution paths is especially relevant for secure networked applications.
5.2. Integration into Cybersecurity Pipelines
- Automated Test Generation: Utilize PBT tools to generate diverse test cases that challenge these properties [2].
5.3. Complementing Formal Verification and Static Analysis
- When to use PBT: Properties involving randomized input spaces, interaction sequences, or protocol behaviors.
- When to use Formal Methods: Cryptographic proofs, type systems, logic properties.
- Hybrid Use: Properties proven formally can also be tested empirically via PBT under real-world conditions.
6. Challenges and Limitations
6.1. Difficulty in Expressing Meaningful Security Properties
- Developing reusable libraries of security property templates to guide test authors and reduce duplication across projects.
- Integrating property recommendation features into IDEs and testing frameworks, helping developers define meaningful tests as they code.
- Exploring the use of natural language processing (NLP) to translate documentation or specifications into candidate properties, easing the property definition burden for non-experts.
6.2. The Oracle Problem in Security Testing
6.3. Performance Considerations in Large-Scale Systems
- Prioritizing property importance and test case likelihood.
- Parallelizing test execution (where supported).
- Applying resource caps during CI runs (see the settings sketch below).
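With Hypothesis, such caps can be expressed through settings profiles; the profile names and limits below are illustrative assumptions rather than prescriptions from the study:

```python
from hypothesis import HealthCheck, settings

# A thorough profile for nightly runs and a bounded profile for every commit.
settings.register_profile("nightly", max_examples=5000)
settings.register_profile(
    "ci",
    max_examples=200,                         # cap the number of generated cases
    deadline=500,                             # milliseconds allowed per example
    suppress_health_check=[HealthCheck.too_slow],
)

# Typically selected via an environment variable in the CI configuration.
settings.load_profile("ci")
```

Combined with property prioritization and parallel execution, this keeps property-based suites within typical CI time budgets.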
6.4. Addressing Non-Determinism, Side Channels, and Concurrency
- Concurrency: Testing all possible interleavings of thread or message interactions is intractable. PBT may expose some race conditions but cannot exhaustively traverse the state tree.
- Side Channels: Outputs like timing or cache access patterns cannot be captured through standard property definitions.
- Randomness: Cryptographic operations often introduce probabilistic behaviors that require carefully designed metamorphic relations.
Addressing these limitations will require PBT frameworks that can:
- Interoperate with fuzzers and formal models.
- Capture non-functional properties like timing or entropy.
- Scale across distributed and asynchronous architectures.
7. Future Directions
7.1. Integration with Formal Methods and Property Definition Guidance
- Decompose high-level security goals into testable behavioral rules. For example, a goal like “access should only be granted after login” can be encoded as a state-based sequence property.
- Use preconditions and postconditions for stateful operations. In authentication flows, one might define that a session token is only valid if issued after a valid credential exchange.
- Adopt metamorphic property patterns where expected outputs are unknown. For cryptographic APIs, one might define that decrypting an encrypted message always returns the original plaintext, or that hashing the same input twice yields identical digests, without ever specifying the ciphertext or digest values themselves.
- Leverage reusable patterns from the security domain. Role-based access control (RBAC), session validation, and token lifecycle checks can often be captured using templates such as the following (sketched in code after this list):
  - “User without role R should not access resource X”.
  - “Token T must not be accepted after expiration timestamp”.
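A minimal sketch of how these two templates could be instantiated; the permission table, role names, and helper functions are illustrative assumptions:

```python
from hypothesis import given, strategies as st

PERMISSIONS = {"admin": {"config", "logs"}, "user": {"logs"}}

def can_access(role: str, resource: str) -> bool:
    return resource in PERMISSIONS.get(role, set())

def token_is_valid(issued_at: int, ttl: int, now: int) -> bool:
    return now < issued_at + ttl

# Template: "User without role R should not access resource X".
@given(st.sampled_from(["user", "guest"]))
def test_non_admin_cannot_read_config(role):
    assert not can_access(role, "config")

# Template: "Token T must not be accepted after expiration timestamp".
@given(st.integers(min_value=1, max_value=10**6))
def test_expired_token_is_rejected(elapsed_after_expiry):
    issued_at, ttl = 0, 3600
    assert not token_is_valid(issued_at, ttl, issued_at + ttl + elapsed_after_expiry)
```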
7.2. Intelligent Test Case Generation
- Broader input coverage.
- Domain-specific generation strategies.
- Reduced manual burden on test developers.
7.3. Standardization of Properties and Frameworks
- Domain-specific property repositories (e.g., for cryptographic APIs, auth flows).
- Unified property definition languages.
- Cross-tool compatibility standards.
7.4. Application in Security Testing
- Define expressive security properties.
- Simulate adversarial inputs.
- Detect logic flaws without an oracle.
7.5. Education and Tool Support
- IDE plugins for property suggestions.
- Step-by-step tutorials for writing PBTs in security contexts.
- Visualization tools for test-trace inspection and shrinking paths.
8. Conclusions
- Hybrid verification strategies that combine PBT with formal methods.
- Intelligent test generation via ML/NLP.
- Property libraries and standardization for security protocols.
- Improved education and developer support.
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Implementation Example
Appendix A.1. Property-Based Test Python Code
from hypothesis import given
from hypothesis.strategies import binary

# Buggy encryption and decryption functions
def encrypt(data, key):
    return data[:-1] + b'x' if len(data) > 1 else b'x'

def decrypt(ciphertext, key):
    return ciphertext[:-1] + b'y' if len(ciphertext) > 1 else b'y'

# Property-based test
@given(data=binary(min_size=1, max_size=256),
       key=binary(min_size=16, max_size=32))
def test_encrypt_decrypt_identity(data, key):
    assert decrypt(encrypt(data, key), key) == data
Appendix A.2. Property-Based Test Output Log
Falsifying example: test_encrypt_decrypt_identity(
    data=b'\x10', key=b'AAAAAAAAAAAAAAAA'
)
Traceback (most recent call last):
  File "test_script.py", line 10, in test_encrypt_decrypt_identity
    assert decrypt(encrypt(data, key), key) == data
AssertionError

Shrunk example to smallest failing input.
Minimal counterexample: data=b'\x10'
Appendix B. TLS Property-Based Test
Appendix B.1. TLS Property-Based Test Python Script
from hypothesis import given, strategies as st

# Simplified TLS handshake states
states = ["Start", "ClientHello", "ServerHello", "Certificate", "Finished"]

# Rule: 'Finished' must not occur before 'ServerHello'
def is_valid_transition(trace):
    seen = set()
    for state in trace:
        if state == "Finished" and "ServerHello" not in seen:
            return False
        seen.add(state)
    return True

@given(st.lists(st.sampled_from(states), min_size=2, max_size=6))
def test_tls_handshake_trace(trace):
    assert is_valid_transition(trace)
Appendix B.2. TLS Property-Based Test Output Log
Falsifying example: test_tls_handshake_trace(trace=['Finished', 'ServerHello'])
Traceback (most recent call last):
  File "tls_test.py", line 13, in test_tls_handshake_trace
    assert is_valid_transition(trace)
AssertionError

Shrunk example: ['Finished', 'ServerHello']
Appendix C. OAuth Property-Based Test
Appendix C.1. OAuth Property-Based Test Python Script
from hypothesis import given, strategies as st

# OAuth 2.0 simplified flow states
states = ["Start", "RequestAuth", "AuthCode", "AccessToken", "Resource"]

# Rule: no AccessToken without a prior AuthCode, no Resource access without an AccessToken
def valid_oauth_flow(trace):
    seen = set()
    for step in trace:
        if step == "AccessToken" and "AuthCode" not in seen:
            return False
        if step == "Resource" and "AccessToken" not in seen:
            return False
        seen.add(step)
    return True

@given(st.lists(st.sampled_from(states), min_size=2, max_size=6))
def test_oauth_flow(trace):
    assert valid_oauth_flow(trace)
Appendix C.2. OAuth Property-Based Test Output Log
Falsifying example: test_oauth_flow(trace=['Resource', 'AccessToken'])
Traceback (most recent call last):
  File "oauth_test.py", line 14, in test_oauth_flow
    assert valid_oauth_flow(trace)
AssertionError

Shrunk example: ['Resource', 'AccessToken']
References
- Fu, Y.L.; Xin, X.L. A Model Based Security Testing Method for Protocol Implementation. Sci. World J. 2014, 2014, 632154. [Google Scholar] [CrossRef] [PubMed]
- Goldstein, H.; Cutler, J.W.; Dickstein, D.; Pierce, B.C.; Head, A. Property-Based Testing in Practice. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, Lisbon, Portugal, 14–20 April 2024; ACM: Lisbon, Portugal, 2024; pp. 1–13. [Google Scholar]
- Chen, Z.; Rizkallah, C.; O’Connor, L.; Susarla, P.; Klein, G.; Heiser, G.; Keller, G. Property-Based Testing: Climbing the Stairway to Verification. In Proceedings of the 15th ACM SIGPLAN International Conference on Software Language Engineering, Auckland, New Zealand, 6–7 December 2022; ACM: Auckland, New Zealand, 2022; pp. 84–97. [Google Scholar]
- Bates, M.; Near, J.P. DT-SIM: Property-Based Testing for MPC Security. arXiv 2024. [Google Scholar] [CrossRef]
- Fink, G.; Bishop, M. Property-based testing: A new approach to testing for assurance. SIGSOFT Softw. Eng. Notes 1997, 22, 74–80. [Google Scholar] [CrossRef]
- Beurdouche, B.; Bhargavan, K.; Delignat-Lavaud, A.; Fournet, C.; Kohlweiss, M.; Pironti, A.; Strub, P.-Y.; Zinzindohoue, J.K. A Messy State of the Union: Taming the Composite State Machines of TLS. In Proceedings of the 2015 IEEE Symposium on Security and Privacy, San Jose, CA, USA, 17–21 May 2015; pp. 535–552. [Google Scholar]
- Bennouk, K.; Ait Aali, N.; El Bouzekri El Idrissi, Y.; Sebai, B.; Faroukhi, A.Z.; Mahouachi, D. A Comprehensive Review and Assessment of Cybersecurity Vulnerability Detection Methodologies. J. Cybersecurity Priv. 2024, 4, 853–908. [Google Scholar] [CrossRef]
- Godefroid, P.; Levin, M.Y.; Molnar, D. SAGE: Whitebox Fuzzing for Security Testing: SAGE has had a remarkable impact at Microsoft. Queue 2012, 10, 20–27. [Google Scholar] [CrossRef]
- Rajapakse, R.N.; Zahedi, M.; Babar, M.A.; Shen, H. Challenges and solutions when adopting DevSecOps: A systematic review. Inf. Softw. Technol. 2022, 141, 106700. [Google Scholar] [CrossRef]
- Korir, F.C. Software security models and frameworks: An overview and current trends. World J. Adv. Eng. Technol. Sci. 2023, 8, 86–109. [Google Scholar] [CrossRef]
- Hoare, C.A.R. An axiomatic basis for computer programming. Commun. ACM 1969, 12, 576–580. [Google Scholar] [CrossRef]
- MacIver, D.R.; Hatfield-Dodds, Z.; Contributors, M.O. Hypothesis: A new approach to property-based testing. J. Open Source Softw. 2019, 4, 1891. [Google Scholar] [CrossRef]
- Vazou, N.; Bakst, A.; Jhala, R. Bounded refinement types. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming, Vancouver, BC, Canada, 1–3 September 2015; Association for Computing Machinery: New York, NY, USA, 2015; pp. 48–61. [Google Scholar]
- Zalewski, M. The Tangled Web: A Guide to Securing Modern Web Applications; No Starch Press: San Francisco, CA, USA, 2011; ISBN 978-1-59327-417-7. [Google Scholar]
- De Ruiter, J.; Poll, E. Protocol State Fuzzing of TLS Implementations. In Proceedings of the 24th USENIX Security Symposium (USENIX Security 15), Washington, DC, USA, 12–14 August 2015; pp. 193–206. Available online: https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/de-ruiter (accessed on 23 April 2025).
- Segura, S.; Fraser, G.; Sanchez, A.B.; Ruiz-Cortés, A. A Survey on Metamorphic Testing. IEEE Trans. Softw. Eng. 2016, 42, 805–824. [Google Scholar] [CrossRef]
- Zeller, A.; Gopinath, R.; Böhme, M.; Fraser, G.; Holler, C. The Fuzzing Book. Available online: https://www.fuzzingbook.org/ (accessed on 23 April 2025).
- Hu, V.C.; Ferraiolo, D.F.; Kuhn, D.R. Assessment of Access Control Systems; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2006. [Google Scholar] [CrossRef]
- Youn, D.; Lee, S.; Ryu, S. Declarative static analysis for multilingual programs using CodeQL. Softw. Pract. Exp. 2023, 53, 1472–1495. [Google Scholar] [CrossRef]
- MacIver, D.R.; Donaldson, A.F. Test-Case Reduction via Test-Case Generation: Insights from the Hypothesis Reducer (Tool Insights Paper). In Proceedings of the 34th European Conference on Object-Oriented Programming (ECOOP 2020), Berlin, Germany, 15–17 November 2020; Volume 166, pp. 13:1–13:27. [Google Scholar] [CrossRef]
- Cankar, M.; Petrovic, N.; Pita Costa, J.; Cernivec, A.; Antic, J.; Martincic, T.; Stepec, D. Security in DevSecOps: Applying Tools and Machine Learning to Verification and Monitoring Steps. In Proceedings of the Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, Coimbra, Portugal, 15–19 April 2023; Association for Computing Machinery: New York, NY, USA, 2023; pp. 201–205. [Google Scholar]
- Prates, L.; Pereira, R. DevSecOps practices and tools. Int. J. Inf. Secur. 2024, 24, 11. [Google Scholar] [CrossRef]
- Claessen, K.; Hughes, J. QuickCheck: A lightweight tool for random testing of Haskell programs. In Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming, Montreal, QC, Canada, 18–21 September 2000; Association for Computing Machinery: New York, NY, USA, 2000; pp. 268–279. [Google Scholar]
- Udeshi, S.; Chattopadhyay, S. Grammar Based Directed Testing of Machine Learning Systems. IEEE Trans. Softw. Eng. 2021, 47, 2487–2503. [Google Scholar] [CrossRef]
- Huang, L.; Zhao, P.; Chen, H.; Ma, L. Large Language Models Based Fuzzing Techniques: A Survey. arXiv 2024. [Google Scholar] [CrossRef]
- Cachin, C.; Guerraoui, R.; Rodrigues, L. Introduction to Reliable and Secure Distributed Programming; Springer: Berlin/Heidelberg, Germany, 2011; ISBN 978-3-642-15259-7. [Google Scholar]
- Ayenew, H.; Wagaw, M. Software Test Case Generation Using Natural Language Processing (NLP): A Systematic Literature Review. Artif. Intell. Evol. 2024, 5, 1–10. [Google Scholar] [CrossRef]
- Von Gugelberg, H.M.; Schweizer, K.; Troche, S.J. Experimental evidence for rule learning as the underlying source of the item-position effect in reasoning ability measures. Learn. Individ. Differ. 2025, 118, 102622. [Google Scholar] [CrossRef]
| Metric | Value |
|---|---|
| Total tests run | 1000 |
| Max input size | 256 bytes |
| Failure detected | Yes |
| Shrinking steps | 9 |
| Minimal counterexample | data = b'\x10' |
| Time to detect failure | 1.3 s |
| Metric | Value |
|---|---|
| Total tests run | 1000 |
| State trace length | 2–6 states |
| Failure detected | Yes |
| Shrinking steps | 3 |
| Minimal counterexample | ['Finished', 'ServerHello'] |
| Time to detect failure | ~1 s |
| Metric | Value |
|---|---|
| Total tests run | 1000 |
| Flow length | 2–6 steps |
| Failure detected | Yes |
| Shrinking steps | 3 |
| Minimal counterexample | ['Resource', 'AccessToken'] |
| Time to detect failure | ~1.2 s |
| Scenario | Method | Detection Rate | False Positives | Time to Detect |
|---|---|---|---|---|
| TLS State Logic | PBT | High | Low | ~1 s |
| OAuth Flow Violation | PBT | High | Low | ~1.2 s |
| OAuth Flow | Fuzzing (Radamsa) | Low | High | <1 s |
| Crypto API | Static Analysis | Medium | Low | Slow |
| Tool | Language | Key Features | Strengths | State Machine Support | Symbolic Execution | CI/CD Integration |
|---|---|---|---|---|---|---|
| QuickCheck | Haskell | Generators, shrinking | Pioneer in PBT, strong abstractions | Partial | No | Limited |
| Hypothesis | Python | Stateful testing, smart shrinking | Easy integration with Pytest | Yes | No | Excellent (GitHub Actions (v3+), GitLab CI (v14+)) |
| PropEr | Erlang | Stateful models, concurrency support | Good for telecom, distributed apps | Yes | Yes | Moderate |
| Direction | Focus Area | Benefit | Key References |
|---|---|---|---|
| Integration with Formal Methods | Bridging proofs and generative testing | Higher assurance and broader applicability | [3] |
| ML/NLP-Based Generators | Smart input generation | Broader coverage, fewer manual generators | [25] |
| Standardization and Reuse | Common property libraries, unified tooling | Faster adoption, consistent practice | [2] |
| Security Testing Applications | Applying PBT to vulnerability discovery | Detecting edge-case bugs and misconfigurations | [26] |
| Education and Tooling | Training, IDE integration, usability | Lower barrier to entry, broader community engagement | [2,27,28] |