Legacy Code, Live Risk: Empirical Evidence of Malware Detection Gaps

Huang, Gang-Cheng; Lai, Tai-Hung

doi:10.3390/app152211862

by Gang-Cheng Huang ¹ and Tai-Hung Lai ²,*

¹ Department of Computer Science and Information Engineering, China University of Technology, Taipei 116, Taiwan

² Department of Computer Science and Information Engineering, Chung Cheng Institute of Technology, National Defense University, Taoyuan 335009, Taiwan

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(22), 11862; https://doi.org/10.3390/app152211862

Submission received: 18 October 2025 / Revised: 5 November 2025 / Accepted: 6 November 2025 / Published: 7 November 2025

(This article belongs to the Special Issue Advanced Technologies in Data and Information Security, Fourth Edition)

Abstract

Consistent detection of malicious loaders across varied programming languages and build tools remains a significant cybersecurity challenge. This study empirically measures how compiler and language choices affect the detectability of standard in-memory Windows loaders. We implement functionally equivalent loaders (allocate, copy, protect, execute) in C, C#, Fortran, and COBOL, embedding an identical x64 test payload to isolate behavior. Our results reveal significant detection gaps: loaders compiled in legacy languages (Fortran, COBOL) consistently evade static and dynamic antivirus engines that easily flag their C and C# counterparts. We demonstrate this evasion is not due to behavioral differences, but to compiler-specific static artifacts. These artifacts, such as interleaved zero-bytes in Fortran and fragmented payload-construction logic in COBOL, effectively break common signature matching. These findings indicate that many detection tools are overly sensitive to the static build surface rather than true semantic behavior. We provide actionable guidance favoring behavior-focused analysis, such as tracking API call order and memory protection changes, to address this critical legacy code blind spot.

Keywords: malware; meterpreter; Fortran; COBOL; evasion anti-virus

1. Introduction

The influence of programming languages on antivirus evasion is a critical area of study in malware research [1]. Different languages offer varying degrees of control over system resources, which attackers leverage to bypass antivirus detection. Low-level languages such as C, C++, and Assembly allow for precise memory manipulation and direct interaction with the operating system [2], facilitating techniques such as packing [3], obfuscation [4], and polymorphism [5,6]. These methods effectively evade traditional signature-based detection, as they manipulate the structure of the executable, making it harder for antivirus software to recognize [7].

Recent trends in malware development include the use of multi-language attacks, where different components of the malware are written in various languages such as Go and Rust. As malware developers explore the strengths and weaknesses of various programming languages, the interplay between language choice and antivirus evasion strategies remains a critical area of ongoing research. Understanding these dynamics is essential for advancing both offensive and defensive cybersecurity measures, as well as for developing more resilient antivirus solutions capable of countering increasingly complex threats.

The impact of programming languages on antivirus evasion is crucial for understanding the current trends and predicting future developments in malware. As languages evolve, so do the tactics employed by attackers, necessitating continuous research into the potential malicious exploitation of these languages. The need for advanced detection systems incorporating behavioral analysis and machine learning becomes increasingly apparent as malware grows more sophisticated and linguistically diverse [8]. This research aimed to bridge the knowledge gap by examining how different programming languages affect malware’s evasion capabilities and exploring the implications for the future of cybersecurity.

The principal contributions of this study are summarized as follows:

Cross-language, cross-compiler measurement. We conduct a controlled, functionally equivalent implementation of canonical in-memory shellcode loaders across multiple languages (e.g., C, C#, Fortran, and COBOL) and compiler/linker configurations, keeping the payload and loader logic fixed to isolate the effect of language and toolchain choices.
Generalizable detection signals. We identify language-agnostic behavioral cues—API-call sequences, memory-protection transitions (VirtualAlloc → VirtualProtect), thread-spawn patterns—and demonstrate their relationship to typical AV detection points.
Actionable guidance for anti-virus. We analyze failure modes that arise with legacy or non-mainstream toolchains and outline practical hardening guidelines, including language-aware binary normalization and behavior-centric triage that reduces dependence on byte-patterns.
Reproducible baseline. We provide a reproducible build recipe and evaluation protocol that can serve as a reference baseline for future studies incorporating packers, encoders, or evasive techniques.

This work is academic and defensive research only. All experiments use inert payload simulators that recreate allocation–copy–protection–dispatch semantics without network or file I/O, in offline, isolated environments. We do not release executable binaries, weaponizable payloads, exploits, packers, or obfuscation tooling; only build recipes, metadata, and aggregate results are provided, for replication by qualified researchers. The study follows applicable laws, institutional regulations, and responsible research standards. Use of these concepts or artifacts for offensive or illegal purposes is strictly prohibited. Readers are responsible for ensuring ethical and legal use within their jurisdictions and organizations.

This paper is organized as follows: Section 2 reviews prior work on shellcode detection, language-aware malware analysis, and behavioral models for in-memory execution. Section 3 details our experimental design and the implementation of functionally equivalent loaders across languages, together with the compiler/linker matrix. Section 4 describes the scanning protocol, metrics, and statistical methods. Section 5 reports the empirical results and answers the research questions. Section 6 discusses implications, limitations, and threats to validity, concludes the paper, and outlines directions for future work.

2. Related Work

In the realm of antivirus evasion, programming languages play a pivotal role in shaping malware’s characteristics and its ability to bypass detection systems. Traditionally, malware has been written in low-level languages such as C and C++ due to their direct access to system resources [9,10], allowing attackers to exploit memory management and hardware manipulation. These languages provide a high degree of control over the executable’s structure, enabling techniques such as packing, polymorphism, and self-modifying code, all of which are designed to evade static signature-based detection by antivirus engines [11]. In particular, malware written in these languages often utilizes advanced obfuscation methods to thwart reverse engineering and evade heuristic analysis.

More recently, high-level languages such as Python, Java, and C# have also come to play a significant role in antivirus evasion, albeit through different mechanisms. While these languages abstract much of the low-level memory management, they provide powerful libraries and frameworks that can be used to dynamically load and execute code at runtime [12]. For instance, researchers have demonstrated the use of Python to evade antivirus detection without relying on obfuscation techniques [13]. Python-based malware can also evade antivirus detection through various methods, including encryption, compression, and the use of packers such as PyInstaller [14,15]. Research indicates that, when Python is combined with these evasion mechanisms and obfuscation techniques, it poses a significant challenge to dynamic analysis engines, particularly in evading runtime detection. More advanced obfuscation strategies have also been explored, such as utilizing fountain codes to fragment payloads [16] or applying chaotic-based encryption to randomize shellcode [17].

Programming languages such as Rust have begun to attract attention in the context of antivirus evasion. Rust’s memory safety features, combined with its ability to generate efficient binaries, make it more challenging for antivirus systems to perform reverse engineering and detect malicious behavior [18]. Studies have demonstrated that malware written in Rust tends to evade detection more effectively than C-based malware, highlighting the limitations of current antivirus tools in analyzing newer languages [19]. This emphasizes the need for antivirus systems to evolve, integrating more advanced detection mechanisms such as behavioral monitoring and machine learning to address the increasing diversity of malware development languages [20].

Cybercriminals are increasingly leveraging Microsoft’s .NET and PowerShell frameworks to develop sophisticated malware [21]. These powerful tools make it easier to develop ransomware and targeted attacks, presenting significant challenges for defenders. The rapid development cycles and high-level programming environments associated with these frameworks underscore the need for greater and continuous attention in combating these evolving threats [22].

Advanced threat groups, such as APT28 (Fancy Bear) and APT29 (Cozy Bear) [23], have shifted to using more obscure programming languages such as Go and Rust in their malware to enhance evasion [24]. Go, in particular, has become a favorite in advanced persistent threats (APTs) due to its cross-platform compatibility and the difficulty it presents for reverse engineering. For instance, APT28 used Go to rewrite its Zebrocy malware, previously built in Delphi [25], while APT29 deployed Go in its WellMess malware to target both Windows and Linux systems. The WellMess malware even added capabilities such as running PowerShell scripts post-infection [24]. These adaptations help sidestep traditional antivirus detection methods, emphasizing the need for more advanced security solutions.

According to the TIOBE index in 2024, Python is the most popular programming language [26], followed by C++, Java, and C. This ranking reflects current trends in the software development industry, with Python’s versatility, simplicity, and widespread use in fields such as data science, web development, and automation contributing to its top position. Other languages such as Go and Rust are also gaining traction, indicating a growing interest in modern, efficient, and performance-oriented languages. In light of these trends and concerns, we propose an experiment focusing on shellcode loaders without any obfuscation, encryption, or encoding. This approach aims to provide a baseline understanding of how different programming languages, both modern and legacy, interact with antivirus systems in their most basic form. To provide a clear overview of the discussed literature, Table 1 presents a comparative analysis summarizing the influence of these language families on malware evasion.

3. Methodology

We implement functionally equivalent loaders in C, C#, Fortran, and COBOL to compare their antivirus detection outcomes under a fixed in-memory execution semantics [27]. To avoid distributing weaponizable artifacts, we use a benign inert payload emulator that exercises the same memory-allocation, copying, permission-transition, and thread-creation primitives as a typical loader, but performs no network or file I/O. This design preserves detection-relevant behavior while eliminating exploit content. All builds are produced with pinned toolchains and deterministic settings where available.

msfvenom -p windows/x64/meterpreter/reverse_tcp LHOST=<IP> LPORT=<PORT> -f c

This command uses msfvenom, the payload generator from the Metasploit Framework, to create a Meterpreter reverse shell payload for 64-bit Windows, emitted as a C byte array (-f c). A reverse shell is a technique in which a compromised system initiates a connection back to an attacker’s machine [28,29]. Once the payload is executed on the target machine, it connects back to the attacker’s system, giving the attacker remote access and allowing them to execute commands as if they had local access.

On Windows operating systems, shellcode is typically executed by allocating memory, copying the shellcode into that memory, and then marking it as executable before jumping to it. This process often involves using Windows API functions such as VirtualAlloc to allocate memory, RtlMoveMemory or memcpy to copy the shellcode, and CreateThread or direct function pointers to execute it. The first step is allocating a block of memory that can hold the shellcode. This is performed using the VirtualAlloc function, which reserves or commits a region of pages in the virtual address space of the calling process [30]. If the region is to be executable immediately after the shellcode is written, it must be allocated with the PAGE_EXECUTE_READWRITE flag (0x40); alternatively, it can be allocated with read/write permissions and made executable later. Once the memory is allocated, the shellcode is copied into it using functions such as RtlMoveMemory or memcpy [31]. These functions move the bytes of shellcode from one memory location (typically a byte array stored in the program) to the newly allocated memory. If the memory was initially allocated with read/write permissions, its protection must then be changed to executable using VirtualProtect [32]. This function changes the protection of the allocated region, allowing the system to execute code stored there. After the memory is marked as executable, the final step is to execute the shellcode. This is commonly achieved either by creating a new thread with CreateThread that starts executing at the location of the shellcode or by casting the shellcode’s memory address to a function pointer and invoking it directly. After a thread is created with CreateThread, WaitForSingleObject is called with the thread handle and INFINITE as arguments, so that the main program pauses until the shellcode execution is complete.
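The protection handling described above can be made concrete with a small model. The following is a minimal sketch for triage purposes, not the authors' tooling: it uses the documented Windows page-protection values and labels the two patterns discussed here, a direct read/write/execute allocation and a later writable-to-executable change. The function names and labels are hypothetical, and only a subset of the protection constants is covered.

```python
# Documented Windows page-protection values (subset).
PAGE_READWRITE = 0x04
PAGE_EXECUTE = 0x10
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40

_WRITABLE = {PAGE_READWRITE, PAGE_EXECUTE_READWRITE}
_EXECUTABLE = {PAGE_EXECUTE, PAGE_EXECUTE_READ, PAGE_EXECUTE_READWRITE}


def is_writable(protect: int) -> bool:
    # Mask off modifier bits such as PAGE_GUARD (0x100) before comparing.
    return (protect & 0xFF) in _WRITABLE


def is_executable(protect: int) -> bool:
    return (protect & 0xFF) in _EXECUTABLE


def classify_transition(old_protect: int, new_protect: int) -> str:
    """Label a protection change (hypothetical labels, for illustration)."""
    if is_writable(new_protect) and is_executable(new_protect):
        return "rwx"
    if is_writable(old_protect) and is_executable(new_protect):
        return "write-then-execute"
    return "other"
```

A change from PAGE_READWRITE to PAGE_EXECUTE_READ is labeled "write-then-execute", while any change that ends in PAGE_EXECUTE_READWRITE is labeled "rwx".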

All binaries are built in an isolated virtual machine (VM) with pinned compilers/linkers; we record toolchain versions, options, and PE/CLR metadata hashes. Specifically, all build and execution tasks were conducted within a VMware Workstation 17.6.3 virtual machine running a 64-bit Windows 10 Professional guest OS. To create the isolated sandbox, the VM’s network was configured to a host-only (internal) setting. This setup permitted VM-to-VM communication (required for the inert payload’s C2 simulation) but strictly disabled all external file, registry, and network egress, preventing any external I/O.
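As an illustration of the kind of build record mentioned above, the sketch below hashes a binary and lists its PE section names and raw sizes using only the Python standard library. It is an assumption about what such a record could contain, not the authors' recipe; the function names and the record fields (toolchain, flags) are hypothetical, and the parser reads only the MZ header, COFF header, and section table.

```python
import hashlib
import struct


def pe_sections(image: bytes) -> list[tuple[str, int]]:
    """Return (section name, raw size) pairs from a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF header (20 bytes): Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics.
    _, n_sections, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", image, pe_offset + 4
    )
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for i in range(n_sections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = image[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        (raw_size,) = struct.unpack_from("<I", image, entry + 16)  # SizeOfRawData
        sections.append((name, raw_size))
    return sections


def build_record(image: bytes, toolchain: str, flags: str) -> dict:
    """Bundle a content hash with hypothetical build metadata fields."""
    return {
        "sha256": hashlib.sha256(image).hexdigest(),
        "toolchain": toolchain,
        "flags": flags,
        "sections": pe_sections(image),
    }
```

Storing such a record next to each artifact would let a later rebuild be compared by hash and by section layout.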

4. Experiment

This section reports the cross-language scanning results and the associated analysis. To isolate the effect of language and toolchain choices, all artifacts implement a functionally equivalent in-memory dispatch path while holding execution semantics constant (allocation → copy → protection transition → dispatch). For safety and reproducibility, we use a benign inert payload emulator that exercises the same Windows API primitives but performs no network or file I/O. This preserves detection-relevant behavior without distributing weaponizable content. All binaries are scanned on https://kleenscan.com/, which aggregates reporting from 30+ engines. We compare per-artifact outcomes without sharing files with third parties. Unless otherwise noted, no packing, encoding, or obfuscation is applied.

4.1. Artifacts and Build Matrix

We consider four languages and their representative toolchains:

C (native): Visual Studio 2022, Release, static and dynamic CRT variants; deterministic build enabled when available.
C# (.NET 6.0): Visual Studio 2022, Release, AnyCPU/x64, single-file publish off; P/Invoke used only for required Win32 APIs.
Fortran (native): Intel® Fortran Compiler integrated with MSVC; Release profile; native linkage.
COBOL (native): GnuCOBOL via MSYS2 for Windows; Release-equivalent flags; native calls to Win32.

Each artifact conforms to a shared equivalence oracle: (i) presence of the canonical API sequence; (ii) RW→RX protection transition (no RWX pages); (iii) absence of file/registry/network side effects. We record compiler/linker versions, selected flags, PE/CLR metadata, import table shape, section sizes, and content hashes.
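The three oracle conditions can be expressed as a check over a recorded event trace. The sketch below is one possible reading, assuming a hypothetical tracer that records each call's API name, a coarse event kind, and any page-protection value; the Event type and the function names are illustrative and are not taken from the study.

```python
from dataclasses import dataclass

CANONICAL = ("VirtualAlloc", "RtlMoveMemory", "VirtualProtect", "CreateThread")
SIDE_EFFECT_KINDS = {"file", "registry", "network"}
PAGE_READWRITE = 0x04
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40


@dataclass(frozen=True)
class Event:
    api: str              # API name as recorded by the (hypothetical) tracer
    kind: str = "memory"  # "memory", "thread", "file", "registry", "network"
    protect: int = 0      # page-protection value, when the call carries one


def has_subsequence(items, wanted) -> bool:
    """True if `wanted` occurs in order (not necessarily adjacent) in `items`."""
    it = iter(items)
    return all(any(x == w for x in it) for w in wanted)


def passes_oracle(trace: list[Event]) -> bool:
    # (i) canonical API sequence is present, in order
    sequence_ok = has_subsequence([e.api for e in trace], CANONICAL)
    # (ii) a read/write region later becomes read/execute, and no RWX is seen
    protections = [e.protect for e in trace if e.protect]
    rw_to_rx = has_subsequence(protections, (PAGE_READWRITE, PAGE_EXECUTE_READ))
    no_rwx = PAGE_EXECUTE_READWRITE not in protections
    # (iii) no file, registry, or network side effects
    no_side_effects = all(e.kind not in SIDE_EFFECT_KINDS for e in trace)
    return sequence_ok and rw_to_rx and no_rwx and no_side_effects
```

A trace that allocates with PAGE_EXECUTE_READWRITE, or that contains any file, registry, or network event, fails this check and would be excluded from aggregation under the stated oracle.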

To visually represent the shared logic across all loaders, Figure 1 illustrates the canonical flowchart implemented in all four languages.

4.2. C Programming Language

As shown in Listing 1, the C source code was compiled using Visual Studio 2022 in release mode, with the Run-Time Library set to /MT and without PDB debug symbols [33].

Listing 1. Executing the shellcode in the C programming language.

4.3. C# Programming Language

As shown in Listing 2, the C# source code was compiled using Visual Studio 2022 in release mode, targeting .NET 6.0, without PDB debug symbols.

4.4. Fortran Programming Language

As shown in Listing 3, the Fortran source code was compiled using the Intel® Fortran Compiler [34], which integrates with Microsoft Visual Studio 2019 and 2022. Fortran is primarily used in scientific computing, numerical analysis, and engineering applications [35]. It is favored for tasks that require high-performance computations, such as simulations, complex mathematical modeling, weather forecasting, computational fluid dynamics, and large-scale data processing. Its strength lies in handling large arrays and matrices efficiently, making it a common choice in physics, chemistry, and engineering research.

Listing 2. Executing the shellcode in the C# programming language.

4.5. COBOL Programming Language

As shown in Listing 4, the COBOL source code was compiled using GnuCOBOL (version 1 3.2-9) with MSYS2. COBOL (Common Business-Oriented Language) is widely used in the financial industry, particularly in banking, insurance, and credit card systems. It excels in handling large-scale financial transactions and batch processing, making it a go-to choice for core banking systems, transaction processing, and accounting operations. Many legacy systems in financial institutions still rely on COBOL due to its stability, scalability, and ability to manage complex data processing tasks. Despite being an older language, it continues to play a crucial role in the financial sector, particularly in maintaining and updating legacy systems that require high reliability.

Listing 3. Executing the shellcode in the Fortran programming language.

Listing 4. Executing the shellcode in the COBOL programming language.

5. Results

5.1. Antivirus Scan Results

We evaluated one functionally equivalent artifact per language under identical build profiles and uploaded each artifact once per engine aggregator session with caches cleared. Engines are reported anonymously (E1–En). For each artifact we recorded: (i) the number of engines that flagged the binary (DR), (ii) any family or heuristic name when available, and (iii) whether the flag was from static inspection or from on-execution emulation. Table 2 and Table A1 list vendor-facing names for traceability.

Across the four ecosystems, C and C# showed higher static DR than Fortran and COBOL. The effect persists when normalizing by binary size and number of imports. On-execution results reduced the gap but did not remove it. The variance is consistent with differences in runtime libraries, metadata layouts, and how each compiler materializes byte arrays or immediate constants. We observed no flags for samples that failed the equivalence oracle (missing protection transition or extra side effects); such samples were excluded from aggregation.

5.2. Comparison of the Shellcode Patterns

Figure 2 shows the byte prefix extracted from the stub used to exercise the dispatch path; the leading six bytes are fc 48 83 e4 f0 e8. Figure 3 illustrates a simple content check: a hex dump followed by a grep on the prefix. The prefix is present as a contiguous run in the C and C# artifacts but not in the Fortran and COBOL artifacts.
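The content check described above can also be run directly on the raw bytes of a file. The following is a minimal sketch rather than the exact procedure behind Figure 3: it searches for the six-byte prefix quoted in the text, and the function names are hypothetical.

```python
from pathlib import Path

# Six-byte prefix quoted in the text: fc 48 83 e4 f0 e8
PREFIX = bytes.fromhex("fc4883e4f0e8")


def find_prefix(image: bytes, prefix: bytes = PREFIX) -> int:
    """Return the first offset of a contiguous match, or -1 if absent."""
    return image.find(prefix)


def scan_file(path: str) -> int:
    """Read a file from disk and search its raw bytes for the prefix."""
    return find_prefix(Path(path).read_bytes())
```

Searching the raw bytes avoids one pitfall of grepping a formatted hex dump, where a match that straddles a line break in the dump can be missed.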

Two causes explain the absence of a contiguous prefix in some builds: (i) when the stub is represented as an array of integers rather than uint8_t/byte, compilers emit interleaved zero bytes due to element width, producing patterns of the form fc 00 48 00 83 00 ...; and (ii) some toolchains materialize bytes through register immediates and stores, which spreads the sequence across instructions and relocations. In both cases the semantic dispatch path is identical, but static pattern matching on short byte runs becomes less reliable.
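Cause (i) suggests a simple normalization step: search for the same prefix with each byte widened to a larger element size. The sketch below assumes little-endian zero extension, as in the fc 00 48 00 83 00 pattern quoted above; the candidate widths and function names are hypothetical. It does not address cause (ii), where the bytes are spread across instructions.

```python
# Six-byte prefix quoted in the text: fc 48 83 e4 f0 e8
PREFIX = bytes.fromhex("fc4883e4f0e8")


def widen(pattern: bytes, width: int) -> bytes:
    """Place each byte in a `width`-byte little-endian, zero-extended slot.

    Padding after the final byte is omitted so the match does not depend
    on what follows the pattern.
    """
    padding = b"\x00" * (width - 1)
    return padding.join(bytes([b]) for b in pattern)


def find_widened(image: bytes, pattern: bytes = PREFIX, widths=(1, 2, 4, 8)):
    """Return (width, offset) for the first width that matches, else None."""
    for width in widths:
        offset = image.find(widen(pattern, width))
        if offset != -1:
            return width, offset
    return None
```

With a width of 1 this reduces to the plain contiguous search, so a single call covers both the C/C# layout and the interleaved-zero layout.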

5.3. Disassembling the Samples

As shown in Figure 4 and Figure 5, panel (a) places the code and data views side by side. The instruction lea rcx, byte_... takes the address of the byte block in the data section, after which the loader allocates memory and calls RtlMoveMemory. This copies the elements into a writable buffer and recreates a contiguous sequence before the protection change and dispatch. Panel (b) enlarges the data representation. The compiler stores the bytes as widened integer elements, which appear in the disassembly as dup(0) patterns and in the hex dump as interleaved 0x00. This preserves content but breaks long contiguous prefixes in the file image, so prefix-based grep and short static signatures are less likely to match. At run time the address-taking and copy steps reconstruct the same buffer seen in other languages, which explains why behavioral checks still observe the allocation, protection transition, and thread start even when static hits are fewer.

As shown in Figure 6, the upper panel shows the compiler emitting the stub under the symbol b_22_22 as individual elements, not as a flat byte array; the file image therefore lacks a long contiguous prefix and simple hex-prefix searches fail. The lower panel shows the use site: lea rdx, b_22_22 takes the address, RtlMoveMemory copies the bytes into a writable buffer, and a contiguous block is recreated before the protection change and dispatch. Static short-run signatures flag this case less often, while behavior-focused checks still observe the copy and execution sequence.

5.4. Per-Language Observations

To provide a clearer, side-by-side comparison of these syntactic and structural differences, Figure 7 visually depicts how the exact same shellcode payload is represented and loaded in C, C#, Fortran, and COBOL.

C (native): The stub is embedded as a contiguous unsigned char array in a data section, copied once into a heap region, then executed after an RW→RX protection change. This layout yields short contiguous byte runs that survive linking, so simple hex-prefix searches succeed and several engines map them to family or heuristic names (Table 2). On-execution flags typically occur at the permission change or thread creation. The import table is compact and stable; thus static indicators dominate while behavioral cues confirm the event sequence.

C# (.NET 6): The artifact uses managed control flow with P/Invoke for the required Win32 calls. IL metadata, the assembly manifest, and P/Invoke descriptors enlarge the static surface compared with C. The stub appears as a contiguous blob in resources or .rdata, so prefix searches still match and static detections are common. On-execution flags are also present and align with transitions across the CLR boundary and the protection change. The API sequence and RW→RX step remain identical to the native build per the equivalence oracle.

Fortran (native): The compiler stores the stub as wider integer elements, which produces interleaved zeros in the data section (xx 00 yy 00 ...). Copy length and protection change match the canonical path, but the fragmented representation reduces short-run signature matches. Engines then rely more on import/section features or behavioral traces. Fortran runtime support adds initialization symbols, yet this does not reconstruct a contiguous prefix; the observed gap relative to C/C# is therefore attributable to byte materialization rather than semantic differences.

COBOL (native): The toolchain materializes bytes via immediate moves and stores into a buffer, distributing the prefix across instruction boundaries instead of emitting a single array. Allocation, RW→RX, and dispatch follow the same APIs as other languages, satisfying equivalence checks. Hex-prefix searches over the file image rarely return a continuous match, so static short signatures trigger less often, while behavioral checks still record the protection transition and thread start. Runtime initialization affects import density and section layout but not the underlying dispatch semantics.

5.5. Robustness Checks and Limitations

We repeated the scans after rebuilding with deterministic build options when available. The detection ordering by language remained unchanged. We also repeated the grep check using a longer prefix (≥16 bytes) and confirmed the same outcome: contiguous matches in C/C#, fragmented representations in Fortran/COBOL. These checks support the interpretation that byte-materialization strategies, not semantic differences, explain the pattern results. Limitations include aggregator-dependent emulation coverage and vendor policy drift; to mitigate, we report both static and on-execution outcomes and log build metadata for replication.

A further limitation is the study’s specific linguistic scope. As noted in our related work, modern languages such as Go, Rust, and Python are increasingly leveraged for malware development. However, this study intentionally focused on establishing a baseline comparison between mainstream, high-detection languages (C, C#) and legacy languages (Fortran, COBOL). The rationale for this focus is that these legacy toolchains, while less scrutinized by modern security tools, remain in active use in critical financial, scientific, and infrastructure sectors, creating the “legacy code, live risk” scenario we aimed to investigate. Therefore, a comprehensive experimental comparison incorporating Go, Rust, and Python remains a critical and immediate next step for future work.

6. Conclusions and Future Work

6.1. Conclusions

The experiments conducted on shellcode execution across C, C#, Fortran, and COBOL demonstrate that, although the core shellcode remained unchanged, the rates of detection by antivirus software differed. Fortran and COBOL exhibited lower detection rates than C and C#, indicating that language choice may influence the evasion of antivirus mechanisms. This highlights the potential for unconventional programming languages, such as COBOL and Fortran, to be leveraged for stealthier malware, emphasizing the importance of addressing security challenges across diverse programming environments.

It is important to note that this research did not implement any advanced evasion techniques such as encryption, encoding, or obfuscation. The shellcode was tested in its raw form, without modifications intended to bypass analysis tools. However, had techniques such as encryption or obfuscation been applied, the difficulty of reverse engineering and analyzing the malware would likely have increased, further impairing detection by antivirus software. This highlights the potential for malicious actors to leverage both unconventional programming languages and additional techniques to enhance the stealthiness of malware.

The results of this study underscore the potential for older, unconventional programming languages such as Fortran and COBOL to be exploited by attackers seeking to bypass modern security measures. These languages, often associated with legacy systems, are less scrutinized by contemporary antivirus engines, creating a blind spot that sophisticated malware could exploit. In environments where Fortran and COBOL are still used—such as critical infrastructure, financial systems, and scientific computing—this presents a considerable risk. Attackers could leverage the lower detection rates and relative obscurity of these languages to launch stealthier attacks, particularly against systems that rely heavily on legacy code. As such, it is imperative that security strategies evolve to encompass not only modern languages but also those that underpin vital legacy systems. Addressing the security challenges posed by Fortran, COBOL, and similar languages will be crucial in mitigating potential threats and enhancing the resilience of legacy systems in the face of evolving cyber attacks.

6.2. Future Work

Future work will proceed in two main directions: expanding the experimental scope using automated methods and developing new defensive recommendations.

First, the experimental baseline will be expanded. This includes incorporating modern languages popular in malware development, such as Go, Rust, and Python. More importantly, we plan to leverage Large Language Models (LLMs) to automate the exploration of this expanded scope. We will task LLMs to generate functionally equivalent loaders using different coding styles, obscure libraries, and compiler-specific optimizations to systematically discover the most evasive language and build combinations. This expanded testbed will also be complemented by deep behavioral instrumentation, performing fine-grained tracing of the system call sequences and execution timing profiles for each variant to identify precisely why certain builds evade detection.

Second, based on our findings, we suggest specific technical approaches for defensive development. For behavioral analysis, our study confirmed that the canonical API sequence (VirtualAlloc → RtlMoveMemory → VirtualProtect → CreateThread) is a language-agnostic signal. While EDR systems can hunt for this sequence, a logical evasion step for malware is to replace these canonical Win32 APIs with their lower-level Native API equivalents (e.g., using NtAllocateVirtualMemory instead of VirtualAlloc). Therefore, future defensive work must evolve to detect this semantic behavior at the system-call level, regardless of the specific user-mode wrapper. For the integration of machine learning methods, our results show that legacy toolchains produce distinct static artifacts (e.g., import table density, section layouts, PE metadata). This presents a clear opportunity to train ML models on these toolchain fingerprints to distinguish benign legacy applications from anomalous binaries that, while compiled with a legacy toolchain, also contain suspicious indicators.
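One way to make the behavioral recommendation wrapper-independent is to map each observed call to a semantic operation before matching the sequence. The sketch below is illustrative only: the mapping table is a small, hypothetical selection of well-known Win32 and Native API names, and the operation labels are invented for this example.

```python
# Hypothetical mapping from observed call names to semantic operations.
SEMANTIC_OP = {
    "VirtualAlloc": "alloc",
    "VirtualAllocEx": "alloc",
    "NtAllocateVirtualMemory": "alloc",
    "RtlMoveMemory": "copy",
    "memcpy": "copy",
    "VirtualProtect": "protect",
    "NtProtectVirtualMemory": "protect",
    "CreateThread": "dispatch",
    "NtCreateThreadEx": "dispatch",
}
CANONICAL_OPS = ("alloc", "copy", "protect", "dispatch")


def normalize(calls):
    """Drop unrelated calls and replace the rest with semantic labels."""
    return [SEMANTIC_OP[c] for c in calls if c in SEMANTIC_OP]


def matches_canonical(calls) -> bool:
    """True if the semantic operations occur in canonical order."""
    ops = iter(normalize(calls))
    return all(any(op == want for op in ops) for want in CANONICAL_OPS)
```

Under this mapping, a trace that uses NtAllocateVirtualMemory and NtProtectVirtualMemory matches the same pattern as one that uses VirtualAlloc and VirtualProtect. A practical detector might treat the copy step as optional, since an inlined memcpy is not visible to an API-level tracer.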

Author Contributions

Conceptualization, T.-H.L. and G.-C.H.; methodology, T.-H.L. and G.-C.H.; software, T.-H.L. and G.-C.H.; validation, T.-H.L. and G.-C.H.; formal analysis, T.-H.L. and G.-C.H.; investigation, T.-H.L. and G.-C.H.; resources, T.-H.L. and G.-C.H.; data curation, T.-H.L. and G.-C.H.; writing—original draft preparation, T.-H.L. and G.-C.H.; writing—review and editing, T.-H.L.; visualization, T.-H.L. and G.-C.H.; supervision, T.-H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1 shows details of the scan report referred to in Table 2.

Table A1. Malware scan results from Kleenscan.

Programming Language	Scan Result URL (all URLs accessed on 4 November 2025)
C	`https://kleenscan.com/scan_result/7833dd4ea8728fb29e74abf9e68fdd04569ddffc26f0250e8718316097604ccf`
C# with .NET 6.0	`https://kleenscan.com/scan_result/1bc5c6fe8c7e6d749272aabfd97b8d45b2a960b04d3d75fdea139caa3a7cf66c`
Fortran	`https://kleenscan.com/scan_result/298f2b3875238ed321c9289f1ee5fedfb7131fe13ece8b895945b1cdee067e39`
COBOL	`https://kleenscan.com/scan_result/65a6ad5d1dbef711b644c4478fe901fc76f53d14a8a670b4601ad3de16187a41`

Figure 1. Canonical flowchart of the shellcode loader logic implemented in C, C#, Fortran, and COBOL.

Figure 2. Generation of the 64-bit shellcode using msfvenom.

Figure 3. Using xxd to find the pattern.

Figure 4. Address-taking (lea rcx, byte_...) followed by allocation and RtlMoveMemory to rebuild a contiguous buffer.

Figure 5. Data section and hex dump show widened elements with zero padding (dup(0)/interleaved 0x00).

Figure 6. Disassembling the COBOL sample.

Figure 7. A comparison of different syntactic representations for the same shellcode payload in C, C#, Python, and COBOL.
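The zero padding described in the caption of Figure 5 explains why a contiguous byte signature can miss a widened payload. The sketch below illustrates the idea with a harmless placeholder pattern rather than real shellcode. The element widths tried (1, 2, 4, and 8 bytes), the little-endian layout, and the names `widen`, `naive_match`, and `padded_match` are assumptions for illustration; the actual element width depends on the compiler and the declared data type.

```python
from typing import Iterable


def widen(pattern: bytes, width: int) -> bytes:
    """Place each byte in a `width`-byte little-endian element, zero padded."""
    if width < 1:
        raise ValueError("width must be at least 1")
    return b"".join(bytes([b]) + b"\x00" * (width - 1) for b in pattern)


def naive_match(data: bytes, signature: bytes) -> bool:
    """Contiguous search, as a simple static signature would perform it."""
    return signature in data


def padded_match(data: bytes, signature: bytes,
                 widths: Iterable[int] = (1, 2, 4, 8)) -> bool:
    """Also search for the signature stored as zero-padded wider elements."""
    return any(widen(signature, w) in data for w in widths)


if __name__ == "__main__":
    signature = b"\xde\xad\xbe\xef"          # placeholder pattern, not a payload
    widened_section = b"\x00" * 8 + widen(signature, 4) + b"\x00" * 8
    print(naive_match(widened_section, signature))   # False
    print(padded_match(widened_section, signature))  # True
```

Widening the signature, rather than de-interleaving the whole file, keeps the check cheap, but it only covers the padding layouts listed in `widths`; fragmented construction such as that seen in the COBOL sample would still require behavioral analysis.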

Table 1. Comparative analysis of programming languages in malware evasion (Related Work Summary).

Language Family	Key Languages	Influence on Malware/Evasion Mechanism
Low-Level (Traditional)	C, C++	Direct memory access, hardware manipulation, and facilitating techniques like packing, polymorphism, and self-modifying code.
High-Level (Managed)	Python, C#, Java	Abuse of powerful libraries and frameworks for dynamic code loading; high-level constructs for rapid development.
Modern (Systems)	Rust, Go	Binaries are inherently difficult to reverse engineer; memory safety features (Rust) and cross-platform capabilities (Go) are leveraged by APTs.
Frameworks/Scripting	PowerShell, .NET	Heavily used for in-memory execution, fileless attacks, and rapid development of ransomware and targeted attack tools.
Advanced Obfuscation (Payload-Focused)	(Used in C# Loaders)	Advanced strategies to fragment payloads (Fountain Codes) or randomize shellcode (Chaotic Encryption) to bypass static analysis.

Table 2. Comparison of antivirus detection for various programming languages.

Vendors	C	C#	Fortran	COBOL
AdAware	Generic.ShellCode.Marte.4.A297A5B3	Generic.ShellCode.Marte.4.DD28A0DB	DeepScan:Generic.ShellCode.Marte.4.36EC5B27	-
Arcabit	Generic.ShellCode.Marte.4.A297A5B3	Generic.ShellCode.Marte.4.DD28A0DB	Dump:Generic.ShellCode.Marte.4.36EC5B27	-
Avast	Win32:MsfShell-V	Win32:MsfShell-V	-	-
AVG	Win32:MsfShell-V	Win32:MsfShell-V	-	-
Avira	-	TR/Rozena.Gen	-	-
Bullguard	Win32:MsfShell-V	Win32:MsfShell-V	-	-
ClamAV	-	Win.Malware.Metasploit-10022275-0	-	-
Emsisoft	Generic.ShellCode.Marte.4.A297A5B3	Generic.ShellCode.Marte.4.74DE52F2	Dump:Generic.ShellCode.Marte.4.36EC5B27	-
G Data	Generic.ShellCode.Marte.4.A297A5B3	Generic.ShellCode.Marte.4.74DE52F2	Dump:Generic.ShellCode.Marte.4.36EC5B27	-
Immunet	-	Win.Malware.Metasploit-10022275-0	-	-
Microsoft Defender	Trojan:Win64/Meterpreter.B	Trojan:Win64/Meterpreter.B	-	-
NOD32	-	MSIL/Rozena.FW.gen trojan	-	-
Norman	Win32:MsfShell-V	Win32:MsfShell-V	-	-
SecureAge APEX	-	-	-	-
VirITExplorer	-	Trojan.Win32.MSIL_Heur.A	-	-
ZoneAlarm	HEUR:Trojan.Win32.Generic	HEUR:Trojan.Win32.Generic	HEUR:Trojan.Win32.Generic	-
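For a quick numerical summary of Table 2, the following sketch tallies how many of the 16 listed vendors flagged each sample. The flags are transcribed from the table (1 for any detection label, 0 for "-"); the names `TABLE_2` and `detections_per_language` are illustrative.

```python
from typing import Dict, Tuple

LANGUAGES = ("C", "C#", "Fortran", "COBOL")

# Detection flags transcribed from Table 2, in LANGUAGES column order.
TABLE_2: Dict[str, Tuple[int, int, int, int]] = {
    "AdAware": (1, 1, 1, 0),
    "Arcabit": (1, 1, 1, 0),
    "Avast": (1, 1, 0, 0),
    "AVG": (1, 1, 0, 0),
    "Avira": (0, 1, 0, 0),
    "Bullguard": (1, 1, 0, 0),
    "ClamAV": (0, 1, 0, 0),
    "Emsisoft": (1, 1, 1, 0),
    "G Data": (1, 1, 1, 0),
    "Immunet": (0, 1, 0, 0),
    "Microsoft Defender": (1, 1, 0, 0),
    "NOD32": (0, 1, 0, 0),
    "Norman": (1, 1, 0, 0),
    "SecureAge APEX": (0, 0, 0, 0),
    "VirITExplorer": (0, 1, 0, 0),
    "ZoneAlarm": (1, 1, 1, 0),
}


def detections_per_language(table: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Sum each column to get the number of vendors flagging each language."""
    totals = [sum(column) for column in zip(*table.values())]
    return dict(zip(LANGUAGES, totals))


if __name__ == "__main__":
    for language, count in detections_per_language(TABLE_2).items():
        print(f"{language}: {count}/{len(TABLE_2)}")
```

Run as a script, this prints 10/16 for C, 15/16 for C#, 5/16 for Fortran, and 0/16 for COBOL, which matches the detection gap visible in the table.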

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
