1. Introduction
The increasing complexity and invisibility of modern malware present serious difficulties for conventional security mechanisms. Traditional detection systems, which typically depend on recognizing known patterns or analyzing files, often struggle to address advanced threats that utilize memory and bypass files entirely. The growing gap has pushed the cybersecurity research community to explore new detection methods that address the weaknesses of traditional technologies and keep pace with the evolving nature of in-memory malware. Among these threats, “in-memory malware” (also known as fileless malware) poses a unique challenge because it can execute directly in volatile memory without leaving disk artifacts. This characteristic enables it to bypass signature-based detection mechanisms and complicates forensic investigations [
1,
2,
3,
4,
5].
In-memory malware operates by injecting malicious code into system memory, often leveraging legitimate tools such as “PowerShell” and “Windows Management Instrumentation (WMI)”. Since it leaves no traces in persistent storage, its detection requires specialized forensic techniques [
6,
7,
8]. In-memory attacks are a subtype of low-observable characteristics (LOC) attacks, a class of attacks that is difficult to detect. Many LOC attacks leverage PowerShell’s capabilities. PowerShell provides a command-line shell and an associated scripting language, and can grant users access to virtually everything in Windows [
9].
Figure 1 provides an overview of in-memory malware operation.
Despite the growing emphasis on detecting in-memory malware, most existing approaches rely on postmortem memory analysis. Tools such as Volatility and Rekall operate after acquiring a full or partial memory dump, which introduces delays and makes them unsuitable for detecting ephemeral or rapidly evolving threats [
10]. Recent frameworks like LIFT (Lightweight, Incremental, Federated Techniques) aim to detect threats in near real time by leveraging federated learning across devices. However, LIFT mainly focuses on system behavior features (such as CPU usage, process creation, and network activity) and does not directly analyze memory data, which makes it less effective at detecting advanced malware that leaves no files or exists only in memory. Additionally, some earlier studies [
11,
12] mainly look at finding malware by analyzing data after it has been captured, turning memory dumps into image or table formats before using machine learning. Similarly, other works [
13,
14] focus on examining fixed parts of programs or behavior from memory snapshots, a task that is not possible in real-time systems.
Recent surveys and evaluations indicate that memory forensics for Windows endpoints predominantly prioritizes post-acquisition procedures and resource-intensive learning pipelines, resulting in prolonged time-to-signal and increased operational costs [
15,
16,
17]. Dump-centric classifiers, such as models trained on curated memory images, demonstrate high accuracy; nevertheless, they presuppose the availability of a full RAM image and do not provide live, pre-dump triage on endpoints [
18]. Sequence-based detectors like GRU–GAN enhance dynamic analysis but fall outside the scope of RAM forensics, potentially overlooking exclusively in-memory code injections when behavior traces are insufficient [
19].
Multi-snapshot techniques, such as Multiple-Memory-Images (MMI), enhance detection accuracy for discrete samples; nevertheless, they impose significant time and storage overhead by necessitating the processing of multiple large VMEM files per case, rendering them unfeasible for prompt incident response [
20]. Automated toolchains frequently prioritize processes based on network activity and rely on external trust services, which might result in blind spots or labeling errors [
21]. Fileless-focused VAD feature pipelines, such as MemInspect, necessitate complete memory images and outdated tool stacks, resulting in increased latency and maintenance challenges [
22]. Backup-centric defenses focus on persistence or recovery rather than live, volatile artifacts [
23], while IoT-specific frameworks, although informative for lightweight detection, pertain to domains with limitations that do not completely correspond with enterprise Windows hosts [
24]. The research highlights a deficiency in lightweight, pre-dump, endpoint-friendly RAM inspection that limits scope prior to extensive acquisition and analysis [
15,
16,
17].
To address this limitation, we present MemCatcher, a streamlined live-system technique that examines raw memory segments prior to any comprehensive dump, identifies Windows service processes directly from the Task Manager interface, and systematically prioritizes suspicious targets for targeted acquisition and systematic analysis using
volatility3 and
PEview. By advancing evaluation, our pipeline reduces storage and latency compared to dump-then-analyze methodologies [
15,
18,
20,
22], mitigates excessive dependence on network activity heuristics or external credibility [
21], and enhances the detection of fileless behavior that backup- or network-exclusive strategies might overlook [
23,
24]. All experiments were performed on a Windows 10 endpoint within an Oracle VirtualBox environment to simulate a realistic analysis configuration while assuring safety and repeatability for in-memory malware assessment.
Main Contributions
This study aims to address the critical gap described in Section Research Gap by proposing a lightweight model for a thorough analysis of in-memory malware, highlighting its diverse characteristics, exploring its underlying mechanisms, and showing the feasibility of automated scripts for detecting it. The primary contributions of this paper are summarized below:
We built MemCatcher, a tool that analyzes raw-memory regions on live systems before any dumping occurs and retrieves benign and malicious service processes from the Windows Task Manager.
After identifying suspicious processes, we examined the system’s memory dump and specific malicious processes to detect harmful traces.
Using the volatility3 and PEview tools, we performed an in-depth analysis, assessed the results, and identified suspicious processes.
We developed automated scripts to investigate suspicious processes and conducted experiments to validate the detection of suspicious service processes.
To the best of our knowledge, this technique represents a novel contribution to the field of memory forensics and malware detection on Windows 10. The rest of the paper is structured as follows: An outline of related works on memory malware detection is provided in
Section 2.
Section 3 covers the proposed approach details in depth and the working of MemCatcher along with the string matching. In
Section 4, the results and analysis are discussed. Finally, the paper is concluded and further research is highlighted in
Section 5.
2. Related Work
The effectiveness of conventional security measures is constantly being challenged in today’s closely connected digital environment. Among current threats, memory malware stands out as a particularly potent opponent, relying on stealthy strategies that evade conventional detection techniques. Recent studies have shed light on these advanced malware strains, revealing their complexities, strategies, and potential effects. The examination of volatile memory is fundamental to sophisticated malware detection methods because it provides insight into obfuscated and memory-based malware that circumvents conventional signature-based techniques. Real-time detection methodologies, although theoretically effective, may encounter obstacles related to resource allocation and scalability, so streamlining computational pipelines is crucial for large-scale deployment.
To detect in-memory malware, memory forensics techniques have been widely used. Naeem et al. [
25] employed hybrid feature descriptors and a stacked ensemble of CNNs and MLPs from multiple memory dumps, with an accuracy of up to 99.8%. The model is platform-agnostic and proficient in detecting obfuscated malware; yet, it necessitates considerable computational resources because it collects memory dumps periodically, constraining its implementation in resource-limited settings. Bozkir et al. [
26] converted memory dumps into RGB images and used manifold learning to identify patterns. Methods such as UMAP and Random Forest classifiers were employed, attaining 96.39% accuracy in malware detection. The visual method is resilient to obfuscation; nevertheless, its computational expense restricts scalability for real-time applications.
In order to enable in-guest monitoring and profiling through hypervisors, Hsiao et al. [
27] suggested a novel concept for a hardware-assisted memory redirection technique. This method reduces performance overhead and improves transparency, rendering it appropriate for extensive cloud settings. Nonetheless, its dependence on certain hardware and its intricate implementation may obstruct deployment. To identify anti-forensic malware that seeks to alter kernel structures and memory regions, Palutke et al. [
28] presented detection methods utilizing Rekall plugins to discover concealed memory areas. The research is innovative in revealing sophisticated evasion methods but necessitates manual operation of tools and manual analysis.
Additionally, some studies focused on detecting deviations from typical behavior. Lyles et al. [
29] used the Volatility framework to extract processes and classified them using a machine learning model. Their method effectively identifies fileless malware by focusing on behavioral anomalies. Nonetheless, its dependence on high-quality training data and its inability to generalize to novel behaviors present obstacles. Carrier et al. [
30] devised a system for the classification of malware utilizing memory dumps. The methodology achieves excellent classification accuracy with the extracted process features of Windows systems. The method is applicable to forensic investigations but depends on specific feature sets, limiting its adaptability to emerging threats. Subsequently, VolMemLyzer [
31] introduced a Python-based utility (Python version 3.11.7) for the extraction of essential kernel-level attributes, attaining high true positive rates in malware classification. The software enables effective feature engineering but is restricted by its reliance on predefined feature sets, hindering adaptability to novel malware variants.
Research Gap
The challenge that previous approaches have not addressed is identifying malware within a computer’s memory during active operation, using learning methodologies independent of external tools or intricate behavioral patterns.
Table 1 summarizes the prior methodologies proposed by the researchers and outlines the limitations of the work. Current real-time solutions can be divided into two main types: those that analyze past data, which require memory dumps and offline checks, and those that rely solely on basic knowledge of the host or general rules, using virtualization or external technologies. In other words, existing real-time solutions suffer from the following limitations:
High computational overheads: The conventional method for memory forensics is to dump the entire system’s memory before conducting an investigation. The problem arises when memory stores enormous amounts of data, making the investigation of each system more time-consuming.
Manual forensic efforts: No prior identification step exists to make it easier to find, among the running processes, the malicious Windows service processes created by in-memory malware. Identifying a malicious application that operates solely in virtual memory can be challenging.
Lack of real-time detection: When malicious software exists in the system, the malicious process can be difficult to identify because it disguises itself as a legitimate Windows service process.
Our proposed research aims to solve these challenges by examining the effectiveness of directly examining the system’s active processes to enhance the detection of in-memory malware.
3. Our Proposed Method
This section outlines our approach for detecting in-memory malware processes within the targeted Windows operating system. We propose two algorithms: MemCatcher, which identifies suspicious processes among all running processes, and StringMatcher, which performs an in-depth analysis of those suspects to reveal code-section artifacts associated with malicious in-memory activity. The framework operates in two phases (see
Figure 2 and
Figure 3). The purpose of Phase 1 (MemCatcher) is to surface suspicious processes for deeper inspection.
Within this framework, MemCatcher calculates a per-process suspicion factor utilizing lightweight intrinsic runtime data, such as execution from user-writable folders, PID/PPID abnormalities, atypical memory-thread profiles, irregularities in loaded modules, and I/O patterns suggestive of registry interaction. This evaluation highlights a brief list of potential issues for all currently running processes. Phase 2 aims to identify malicious code residing in a process’s virtual memory by locating hidden, injected data within the code or text sections of suspected processes. If a process is flagged in Phase 1, StringMatcher examines and compares discriminative strings from its code segment against clean baselines to detect section-level modifications.
More specifically, StringMatcher conducts section-aware verification by extracting distinctive strings and bytes from each code or text section of Windows services and comparing them with legitimate Windows system baselines to identify injected or altered regions. Significantly, our model includes registry-backed persistence: the malware retains a binary image in the Windows Registry, and a loader regenerates it at startup, thereby reinfecting the system even after reboot. StringMatcher correlates section anomalies with registry-access patterns, linking in-memory artifacts to this persistence mechanism and distinguishing genuine injections from benign counterparts.
Following a general overview of our framework, we will explain the design and implementation of Phase 1 and Phase 2, providing a comprehensive explanation of our proposed algorithm, which will be explored further in
Section 3.1 and
Section 3.2.
3.1. Phase 1: Suspicious Process Detection
The methodology employed consists of five steps, as illustrated in
Figure 2, where the analysis involves setting up a Windows virtual environment using VirtualBox. This environment is used to search for in-memory malware features in potentially malicious processes that persistently run on the system.
Step 1: During the initial round of analysis and detection, we booted a clean Windows system and captured a snapshot of the benign processes running in the system’s memory.
Step 2: After setting the environment to initiate the analysis, we installed in-memory malware executable file samples.
Step 3: To search for artifacts resulting from these malware samples in system memory, it is essential to reboot the system because the malware retains a binary image in the Windows Registry. Executable files engage in harmful operations within specific system services shortly after the system is rebooted.
Step 4: Detecting potentially harmful services and pinpointing their critical characteristics within in-memory malware requires considerable effort. The Python-based script MemCatcher, which is explained in
Section 3.1.1 MemCatcher Algorithm, starts iterating over the currently running processes in the task manager to look for suspicious processes initiated by the malicious samples. MemCatcher conducts iterative scans and retrievals of each service process occurrence to discern essential details or features such as its name, process identifier (PID), parent process identifier (PPID), status, threads, disk, and memory usage.
Step 5: After obtaining all active processes described in Step 4, a CSV file will be created to save the extracted characteristics of suspicious and benign services. This report is required for an in-depth analysis, as detailed in the StringMatcher
Section 3.2.1 StringMatcher Algorithm on the identified suspicious processes.
3.1.1. MemCatcher Algorithm
As previously discussed in steps 4 and 5 in
Section 3.1, this subsection specifies the detailed implementation of the MemCatcher technique. This algorithm’s primary objective is to gather and export specific features of all running processes on the operating system.
The MemCatcher script follows a rule-based approach to detecting suspicious processes. First, we identify suspicious processes based on parent/child relationships by extracting features such as the Parent Process ID (PPID) for each process. Second, the original system processes linked to Microsoft Windows, such as svchost.exe, must be preserved and cannot be deleted or stopped; therefore, a process whose status is stopped is considered a suspicious service process.
The third criterion emphasizes that the disk value, specifically the process input/output (PIO) count, must not be empty, except for the Windows system process known as the System Idle Process.
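The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function name `flag_suspicious`, the restriction of the parent check to svchost.exe under services.exe, and the sample rows (including PID 9001 and the parent PIDs 480 and 4242) are assumptions made for the example. The column names follow Algorithm 1.

```python
# Hypothetical sketch of the three MemCatcher rules; not the authors' code.
IDLE_NAME = "System Idle Process"


def flag_suspicious(rows):
    """Return the PIDs of rows that violate at least one of the three rules."""
    # PIDs of services.exe; a benign svchost.exe is expected to be its child.
    services_pids = {
        r["PID"] for r in rows if r["Image Name"].lower() == "services.exe"
    }
    flagged = []
    for r in rows:
        name = r["Image Name"].lower()
        # Rule 1 (assumed scope): svchost.exe must run under services.exe.
        bad_parent = name == "svchost.exe" and r["PPID"] not in services_pids
        # Rule 2: a stopped process is treated as suspicious.
        stopped = r["Status"] == "stopped"
        # Rule 3: the disk (pio) value must not be empty, except for the idle process.
        no_io = not r["Disk"] and r["Image Name"] != IDLE_NAME
        if bad_parent or stopped or no_io:
            flagged.append(r["PID"])
    return flagged


# Illustrative rows only; PID 9001 and PPIDs 480 and 4242 are made up.
sample_rows = [
    {"Image Name": "services.exe", "PID": 572, "PPID": 480, "Status": "running", "Disk": "pio"},
    {"Image Name": "svchost.exe", "PID": 700, "PPID": 572, "Status": "running", "Disk": "pio"},
    {"Image Name": "svchost.exe", "PID": 344, "PPID": 572, "Status": "running", "Disk": "pio"},
    {"Image Name": "svchost.exe", "PID": 9001, "PPID": 4242, "Status": "running", "Disk": "pio"},
    {"Image Name": "System Idle Process", "PID": 0, "PPID": 0, "Status": "running", "Disk": ""},
]
suspicious_pids = flag_suspicious(sample_rows)
```

Keeping each rule as a separate boolean makes it easy to report which criterion a process violated, which may help during the later in-depth analysis.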
As displayed in Algorithm 1, the algorithm extracts significant characteristics, including the name, PID (process ID), PPID (parent process ID), status, memory usage, thread count, and disk label. We ran the MemCatcher script on the compromised Windows system to export a list of suspicious processes alongside benign processes. Below are the equations that explain the mathematical representation of the MemCatcher algorithm.
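Let the set of all processes on the system be:
$$P = \{p_1, p_2, \ldots, p_n\} \tag{1}$$
where $P$ is the set of all $n$ processes running on the Windows system.
For each process $p \in P$, extract the attributes:
$$\mathrm{info}(p) = (\mathrm{name}, \mathrm{pid}, \mathrm{ppid}, \mathrm{status}, \mathrm{mem\_usage}, \mathrm{threads}, \mathrm{disk}) \tag{2}$$
where $\mathrm{info}(p)$ is the set of seven important feature attributes extracted from each running Windows process on the system.
Let $E$ be the set of processes that raise exceptions:
$$E = \{p \in P \mid \mathrm{info}(p)\ \text{raises an exception}\} \tag{3}$$
where $E$ is the set of processes that fall into the exception phase while extracting the running processes from the system.
Then the final set of valid process information is:
$$V = \{\mathrm{info}(p) \mid p \in P \setminus E\} \tag{4}$$
Write the list to a CSV file:
$$\mathrm{WriteCSV}(V, \texttt{processes.csv}) \tag{5}$$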
Equation (1) defines the complete set $P$ of processes running on the Windows system. Equation (2) models the extraction of relevant attributes from a given process $p$ into a structured tuple, including the name, PID, parent PID, status, memory usage, thread count, and a constant disk label (“pio”). Equation (3) defines the set $E$ containing processes for which attribute extraction fails due to access errors, such as NoSuchProcess or AccessDenied. Equation (4) filters out the failed processes by computing the set difference $P \setminus E$, retaining only accessible and valid processes. Finally, Equation (5) represents the step of writing the filtered list to a CSV file named processes.csv. Together, these equations represent a complete mathematical abstraction of the MemCatcher process extraction algorithm.
Initially, we ran the MemCatcher script directly in Jupyter Notebook version 7.2.2 by importing the necessary Windows APIs. However, this process was time-consuming and required the IDE to run the script on every Windows system. To overcome this limitation, we exported the script as MemCatcher.exe, as explained in
Section 4.1.
Algorithm 1: MemCatcher Processes Extraction for Windows OS
Data: Windows Operating System
Result: CSV file containing process information
Initialize an empty list processes;
For each running process do:
    Try:
        name ← process name;
        pid ← process ID (PID);
        ppid ← parent process ID (PPID);
        status ← stopped or running;
        mem_usage ← memory usage in kilobytes;
        disk ← ‘Programmed input/output (pio)’;
        threads ← number of threads running within the process;
        processes.append({ ‘Image Name’: name, ‘PID’: pid, ‘Status’: status, ‘Disk’: disk, ‘Threads’: threads, ‘Memory Usage (KB)’: mem_usage, ‘PPID’: ppid });
    Catch NoSuchProcess, AccessDenied, ZombieProcess:
        continue with the next process;
Write the list processes to a CSV file ‘processes.csv’
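Algorithm 1 maps closely onto the psutil library named in the experimental setup. The following is a minimal sketch under that assumption, not the authors' MemCatcher script: the function names `collect_processes` and `write_csv` are hypothetical, and the disk column is filled with the constant label "pio" as stated in Algorithm 1. psutil is imported lazily so that the CSV helper can be used on its own.

```python
import csv

# Column order taken from the dictionary keys in Algorithm 1.
FIELDS = ["Image Name", "PID", "Status", "Disk", "Threads", "Memory Usage (KB)", "PPID"]


def collect_processes():
    """Enumerate running processes, skipping any that vanish or deny access."""
    import psutil  # third-party dependency (assumed installed: pip install psutil)

    rows = []
    for proc in psutil.process_iter():
        try:
            with proc.oneshot():  # batch the underlying system calls per process
                rows.append({
                    "Image Name": proc.name(),
                    "PID": proc.pid,
                    "Status": proc.status(),
                    "Disk": "pio",  # constant label, as in Algorithm 1
                    "Threads": proc.num_threads(),
                    "Memory Usage (KB)": proc.memory_info().rss // 1024,
                    "PPID": proc.ppid(),
                })
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            continue  # corresponds to the Catch branch of Algorithm 1
    return rows


def write_csv(rows, path="processes.csv"):
    """Write the collected rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

On a host with psutil installed, `write_csv(collect_processes())` would produce a `processes.csv` file in the working directory. Protected system processes may be skipped unless the script runs with administrative privileges.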
3.2. Phase 2: In-Depth Analysis
In phase 1, feature values for all active processes, both suspicious and benign, are extracted and stored in a CSV file. In phase 2, suspicious processes listed in the CSV file are analyzed in detail, as depicted in
Figure 3. The analysis was carried out using the conventional method of memory forensics, as recommended by multiple researchers [
33,
34,
35].
A full snapshot of the previously infected Windows system has been obtained in order to investigate potentially harmful processes. To acquire the virtual memory of the system named
Win10x64mal.mem, the Forensic Toolkit (FTK) imager tool is used [
36,
37]. Following the acquisition of the virtual memory dump, the subsequent phase in the identification of harmful actions, specifically in the field of memory forensics, involves the examination of individual programs within the memory dump. In our research, we used the Volatility3 tool, developed by the Volatility Foundation [
38], among other current approaches [
37,
39,
40,
41,
42] to examine the memory dump file of a system.
The subsequent phase involves using the volatility3 plugin command ‘windows.pslist’ to retrieve static characteristics of processes and to capture information about the services operating on a given system. Analyzing the list of active processes, which provides a hierarchical representation of the parent-child relationships between processes, may uncover potentially harmful services such as those flagged in phase 1. An example would be a svchost.exe process that is not running as a child of the services.exe process.
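As an illustration of this step, the sketch below builds the command line for the Volatility 3 `windows.pslist` plugin. It assumes that Volatility 3 is installed and exposes the `vol` entry point; the helper names `pslist_command` and `run` are hypothetical, and any PID or output directory passed in is a placeholder chosen by the analyst.

```python
import subprocess


def pslist_command(image, pid=None, dump_dir=None, vol="vol"):
    """Build the argument list for Volatility 3's windows.pslist plugin."""
    cmd = [vol, "-f", image]
    if dump_dir is not None:
        cmd += ["-o", dump_dir]      # global option: directory for dumped files
    cmd.append("windows.pslist")
    if pid is not None:
        cmd += ["--pid", str(pid)]   # restrict the listing to one process
    if dump_dir is not None:
        cmd.append("--dump")         # also extract the listed process images
    return cmd


def run(cmd):
    """Run the command and return its text output (requires Volatility 3)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

For example, `pslist_command("Win10x64mal.mem")` yields the arguments for a plain process listing, while passing a `pid` and a `dump_dir` requests a dump of that single process for later inspection.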
After suspicious processes have been identified in phase 1, our focus shifts to examining all the code sections within these infected service processes. For a comprehensive examination, it is crucial to consider memory segments that may retain injected code within a memory dump file. To this end, we captured the memory dump of each identified suspicious process and then investigated these process dumps using PEviewer version 3.5.0.25 [
43,
44,
45,
46]. We compare the code sections of the suspicious processes with those of known benign process dumps, as detailed in the upcoming
Section 3.2.1 StringMatcher Algorithm, to detect the presence of malicious code within the suspicious processes.
3.2.1. StringMatcher Algorithm
The concept of performing string matching on suspicious processes stems from the immutable nature of the stored code section in every benign service process on a Windows operating system, which consistently remains unchanged.
We have continued this analysis by implementing some of the previously mentioned actions in
Section 3.2, such as dumping the process’s memory and viewing its contents using the PEviewer version 3.5.0.25 tool.
We found that the initial indicator of a suspicious service process had a different file offset of entry point and a different parent process ID (PPID), as shown in
Figure 4 and
Figure 5.
First, we exported the first 512 bytes of the code section, as hexadecimal values, for every benign and suspicious service process and stored them in a CSV file for the string-matching classification. The exported CSV file contains two primary features for each service process: the PID and the hexadecimal values extracted from part of the code section. We have predefined certain parameters to facilitate similarity analysis. Below are the equations that explain the mathematical representation of the StringMatcher algorithm.
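Let the dataset be the following:
$$D = \{(s_i, \mathrm{pid}_i)\}_{i=1}^{N} \tag{6}$$
where $s_i$ is a string representing a process and $\mathrm{pid}_i$ is its process ID.
Let $\hat{s}_i = \mathrm{norm}(s_i)$ be the normalized string and $r$ the normalized reference string. Row-level classification is established as follows:
$$y_i = \begin{cases} \text{benign}, & \hat{s}_i = r \\ \text{malicious}, & \text{otherwise} \end{cases} \tag{7}$$
With $I_k = \{i \mid \mathrm{pid}_i = k\}$, the PID-level classification is established as follows:
$$Y_k = \begin{cases} \text{malicious}, & \exists\, i \in I_k : y_i = \text{malicious} \\ \text{benign}, & \text{otherwise} \end{cases} \tag{8}$$
The final output set is as follows:
$$O = \{(k, Y_k) \mid k \in \{\mathrm{pid}_1, \ldots, \mathrm{pid}_N\}\} \tag{9}$$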
Equation (6) defines the dataset $D$ as a collection of strings and their corresponding process IDs. Equation (7) applies a row-level classification to each string based on the normalized string-matching method. Equation (8) isolates the reference string and looks for exactly matching strings for each PID to perform the classification. Finally, Equation (9) forms the output as a set of PID-to-label mappings. These equations collectively express the string-classification logic for separating benign and malicious Windows service processes.
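The export step described at the start of this subsection, which takes the first 512 bytes of each code section, can be approximated with the standard library alone. The sketch below is an assumption-laden illustration rather than the authors' tooling: the function name `first_code_bytes` is hypothetical, the code section is assumed to be named `.text`, and the offsets follow the public PE/COFF layout for an on-disk image, which a raw process dump may not match exactly.

```python
import struct


def first_code_bytes(pe_bytes, count=512, section=b".text"):
    """Return up to `count` leading bytes of the named section as a hex string."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ header")
    (pe_off,) = struct.unpack_from("<I", pe_bytes, 0x3C)         # e_lfanew
    if pe_bytes[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    (n_sections,) = struct.unpack_from("<H", pe_bytes, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", pe_bytes, pe_off + 20)
    table = pe_off + 24 + opt_size                                # first section header
    for i in range(n_sections):
        hdr = table + 40 * i                                      # 40-byte section headers
        name = pe_bytes[hdr:hdr + 8].rstrip(b"\0")
        # SizeOfRawData and PointerToRawData sit at offsets 16 and 20 of the header.
        raw_size, raw_ptr = struct.unpack_from("<II", pe_bytes, hdr + 16)
        if name == section:
            return pe_bytes[raw_ptr:raw_ptr + min(count, raw_size)].hex()
    raise ValueError("section not found")
```

Each returned hex string can then be written next to the PID of its process to form the two-column CSV file used for string matching.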
We implemented the StringMatcher method using simple string matching in Python. The initial step normalizes the string designated for text embedding to ASCII format. Each raw string is first coerced to text and may optionally be stripped of leading and trailing whitespace. Missing values are mapped to the empty string to provide complete coverage.
The algorithm subsequently uses a standard reference for normalization if one is provided; otherwise, it automatically selects the most common normalized value from the dataset. In Step 2 of Algorithm 2, each row is designated as benign only if its normalized string exactly matches this reference; otherwise, it is classified as malicious. For PID-level evaluation (Step 3 of Algorithm 2), we employ moderate aggregation criteria: a PID is classified as malicious if any of its rows is malicious, and it is considered benign only if all rows are benign. The method is entirely deterministic, operates in linear time with respect to the number of rows, and uses memory proportional to the number of unique normalized strings, resulting in a straightforward and transparent string-matching pipeline.
Algorithm 2: StringMatcher Classification for Benign and Malicious Processes [algorithm listing image]
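The classification described in this subsection (normalization, reference selection, row-level labels, and PID-level aggregation) can be sketched as follows. This is a minimal reading of the prose, not the published Algorithm 2: the function names are hypothetical, "normalizing to ASCII" is interpreted as dropping non-ASCII characters, and the sample hex strings and PIDs 812 and 9001 are made up for illustration.

```python
from collections import Counter


def normalize(value, strip=True):
    """Coerce to text, map missing values to '', keep ASCII only, optionally strip."""
    text = "" if value is None else str(value)
    text = text.encode("ascii", "ignore").decode("ascii")
    return text.strip() if strip else text


def string_matcher(rows, reference=None):
    """rows: iterable of (pid, string). Returns {pid: 'benign' or 'malicious'}."""
    normalized = [(pid, normalize(s)) for pid, s in rows]
    if not normalized:
        return {}
    if reference is None:
        # No reference supplied: use the most common normalized string.
        reference = Counter(s for _, s in normalized).most_common(1)[0][0]
    else:
        reference = normalize(reference)
    labels = {}
    for pid, s in normalized:
        row_label = "benign" if s == reference else "malicious"
        # PID-level rule: one malicious row is enough to mark the PID malicious.
        if row_label == "malicious" or pid not in labels:
            labels[pid] = row_label
    return labels


# Illustrative data only; the hex strings are not taken from the experiments.
sample = [
    (700, "4883ec28"),
    (344, " 4883ec28 "),
    (9001, "e8000000"),
    (9001, "4883ec28"),
    (812, None),
]
labels = string_matcher(sample)
```

A single pass over the rows with a dictionary keyed by PID keeps the running time linear in the number of rows, in line with the complexity stated above.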
4. Results and Discussion
This section describes the experiments and their results.
Section 4.1 discusses the experimental setup, and the dataset details are given in
Section 4.2.
Section 4.3 provides a detailed explanation of the evaluation results, which are based on the following three research questions:
RQ1: Is the proposed method capable of detecting suspicious svchost processes? In this RQ, we aim to explore whether the proposed method, MemCatcher, can detect all suspicious processes running on Windows that exhibit known in-memory malware characteristics.
RQ2: Do the code sections of processes differ between malicious processes and benign svchost processes?
The objective of this RQ is to analyze code sections across various service processes using the StringMatcher algorithm and classify them as either malicious or benign.
RQ3: Is the proposed method attainable in real environments? This RQ discusses our proposed method and its significance in detecting in-memory malware among various running processes on the Windows operating system.
4.1. Experimental Setup
We conducted experimental investigations using Windows 10 64-bit on a VirtualBox virtual machine. The workstation is powered by an Intel(R) Core(TM) i5-4460 CPU with a processor speed of 3.20 GHz and 10 gigabytes of RAM. We used the Anaconda Python distribution and Jupyter Notebook for all experiments, incorporating essential libraries such as psutil and pywin32 for the implementation of MemCatcher. After designing the MemCatcher script, we focused on optimizing its efficiency and ensuring that it can run on any Windows operating system. To eliminate the constraints of platform and IDE dependencies, we exported the MemCatcher script as an .exe file using the “Auto Py To Exe” tool. MemCatcher.exe is able to extract and store all the running Windows system processes, as shown in
Figure A1,
Figure A2 and
Figure A3. Refer to
Table 2 for comprehensive hardware and software specifications for phase 1.
We conducted the phase 2 experiment using the same Windows 10 specifications. Before starting the string-matching process, we used FTK Imager to capture the infected system’s virtual memory. We then used Volatility 3 to dump the binary file of a specific process and verified the dump using PEViewer. We implemented the StringMatcher design in Python and used the Anaconda distribution for all research, incorporating essential libraries such as NumPy, Pandas, and Scikit-Learn. The precise hardware and software specifications for phase 2 are detailed in
Table 3.
MemCatcher systematically enumerates all active Windows processes and persists rich extracted features, such as PID, process name, executable path, integrity level, full command line, parent PID, and disk I/O counts, via a self-contained executable. To eliminate interpreter and IDE dependencies and ensure portability across Windows hosts, we packaged the implementation with Auto-Py-to-Exe, enabling execution without a local Python installation.
On the other hand, the StringMatcher phase centered on forensic acquisition and analysis: we captured volatile memory from an infected VM, extracted process-resident binaries for inspection, verified Portable Executable (PE) integrity, and evaluated both deterministic string-matching and learned-similarity baselines implemented in Python, NumPy, Pandas, and scikit-learn for model-based comparisons.
4.2. Dataset Details
The most recent memory malware dataset is CIC-MalMem-2022, an open-source dataset from the Canadian Institute for Cybersecurity (UNB) [
30]. The dataset includes three main categories of malware: Trojan horses, ransomware, and spyware. However, the dataset only includes the feature records of 58,596 samples, with an equal distribution of 29,298 benign and 29,298 malicious samples, in a CSV file, which is not helpful for our experiments for the following reasons:
The actual binary files are not provided.
Only part of the provided information is generated from in-memory malware. The dataset includes all malware samples that exploit memory-based vulnerabilities, so some of the samples are not fileless malware.
This paper utilizes a dataset of malicious files obtained from the AhnLab-V3 report, as detailed in
Table 4. These files relate to additional malicious attacks and in-memory malware processes.
Our primary objective is to identify the characteristics of in-memory malware across all provided data. Our proposed method, MemCatcher, identified two significant cases of in-memory malware, both of which deceive the Windows operating system by masquerading as svchost.exe processes.
Each sample is uniquely identified by its SHA-256 hash and annotated with the file type of mostly PE executables, plus one DLL and one PowerShell script. The size of these files ranges from 570 bytes to 5 MB, and they are categorized into malware families such as Downloader and Spyware, which include the LokiBot botnet or spyware, the Rebhip spyware or worm, ransomware, backdoors, and their observed in-memory behaviors. The in-memory characteristics consistently include code injection into trusted system processes, creation of masquerading system binaries, and persistence via the Windows registry or malicious services. Several samples specifically target svchost.exe, dllhost.exe, explorer.exe, or processes labeled mci.exe and server.exe, reflecting a strategy of hiding within legitimate hosts to evade file-based detection.
Our analysis highlights two representative flows spanning the backdoor and spyware categories. In the first, a staged injector writes and launches a component named 932mci.exe, then injects it into the victim system’s Server.exe process. In the second, closely related pattern, the injector dispenses with staging: after the operating system restarts, it directly spawns a counterfeit svchost.exe and establishes persistence by modifying registry keys.
These malicious processes, listed in Table 4, fall into two separate malware categories: backdoors and spyware. The first malicious process writes its payload into process memory under the name 932mci.exe and injects it into the victim system’s Server.exe process. After injection, it creates a fake svchost.exe process on the victim machine when the operating system restarts. The second malicious process acts as an injector itself: once the operating system restarts, it directly creates the fake svchost.exe process and persists in the registry of the victim machine.
4.3. Evaluation Results
4.3.1. RQ1: MemCatcher Results of Detecting Suspicious Svchost Processes
We developed the MemCatcher algorithm to detect suspicious processes running on the Windows system, as detailed in the MemCatcher Algorithm section; the results presented for this RQ follow from that algorithm. We used its rule-based detection method to distinguish between benign and suspicious processes.
As shown in Table 5, the initial Windows service process, services.exe, is the parent process under which each legitimate svchost.exe process runs. For instance, the parent process ID of processes 700 and 344 is 572, which is exactly the process ID of services.exe. In addition, the status of these processes is running, and their disk parameter, which reflects the process input/output (pio), is not empty. Because they satisfy all criteria for benign processes, we classify them as benign.
However, processes 5388 and 6580 do not run under the parent ID 572. Process 5388 is a newly identified suspicious process whose parent is the malicious Server.exe, which has the process ID 4340. Its status is stopped and its disk input/output is empty, so it fails every criterion for a benign process. The parent process ID of process 6580 does not appear among the active processes on the system, and the process is likewise in a stopped state with no disk activity. We also examined the memory usage of the processes: even the suspicious ones that exhibit no disk activity occupy memory, which is itself questionable, as shown in Table 5.
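The three rules described above (parent is services.exe, status is running, disk I/O is not empty) can be illustrated with a minimal Python sketch. This is not the MemCatcher implementation: the record fields, the function name classify_svchost, the I/O placeholder strings, and the parent process IDs 500 and 9999 are assumptions made for illustration, while the process IDs 572, 700, 344, 4340, 5388, and 6580 are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProcessRecord:
    """One row of a process snapshot (field names are illustrative)."""
    pid: int
    ppid: int
    name: str
    status: str   # e.g. "running" or "stopped"
    disk_io: str  # raw I/O counters as text; "" when nothing was recorded


def classify_svchost(snapshot: List[ProcessRecord]) -> Dict[int, str]:
    """Label every svchost.exe entry as 'benign' or 'suspicious'."""
    by_pid = {p.pid: p for p in snapshot}
    labels: Dict[int, str] = {}
    for proc in snapshot:
        if proc.name.lower() != "svchost.exe":
            continue
        parent = by_pid.get(proc.ppid)  # None if the parent is not active
        parent_ok = parent is not None and parent.name.lower() == "services.exe"
        running = proc.status == "running"
        has_io = bool(proc.disk_io.strip())
        # A process is benign only if all three rules hold.
        ok = parent_ok and running and has_io
        labels[proc.pid] = "benign" if ok else "suspicious"
    return labels


# PIDs 572, 700, 344, 4340, 5388, 6580 follow the text; 500 and 9999 and
# the "pio(...)" strings are placeholders, not measured values.
snapshot = [
    ProcessRecord(572, 500, "services.exe", "running", "pio(...)"),
    ProcessRecord(700, 572, "svchost.exe", "running", "pio(...)"),
    ProcessRecord(344, 572, "svchost.exe", "running", "pio(...)"),
    ProcessRecord(4340, 500, "Server.exe", "running", "pio(...)"),
    ProcessRecord(5388, 4340, "svchost.exe", "stopped", ""),
    ProcessRecord(6580, 9999, "svchost.exe", "stopped", ""),  # parent not active
]

labels = classify_svchost(snapshot)
for pid, label in labels.items():
    print(pid, label)
```

Requiring all three rules to hold keeps the check conservative: a svchost.exe process with the correct parent but no I/O activity is still flagged for closer inspection.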
Additionally, we detected the malicious processes described in Section 4.2. These include Server.exe, discussed earlier in this section, which spawns new processes such as svchost.exe, as shown in Figure 6. Furthermore, we detected a malicious process named 932mci.exe that was active on the system. Table 5 lists the disk information for each process in detail. We have uploaded all the results and the MemCatcher.exe file to a GitHub repository [47] for verification.
4.3.2. RQ2: StringMatcher Evaluation Result to Analyze the Code Section for Suspicious Service Processes
To address our second research question, this subsection presents the results of our further investigation using the StringMatcher algorithm discussed in Section 3.2.1 (StringMatcher Algorithm). In this investigation, we classified each suspicious service process as either benign or malicious, in accordance with Windows operating system rules. The classification uses a simple string matching method: each string in the dataset is compared against the standard reference for that string. Figure 7 illustrates the classification of each process string for the Windows svchost.exe process.
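As a rough illustration of simple string matching against a reference, the sketch below normalizes each candidate string and tests it for equality with one normalized reference. The normalization steps (Unicode NFKC, whitespace collapsing, case folding), the function names, and the use of the standard System32 path of svchost.exe as the reference are assumptions chosen for this example; the reference strings actually used by StringMatcher are those described in Section 3.2.1.

```python
import unicodedata
from typing import Dict, Iterable


def normalize(text: str) -> str:
    """Deterministic normalization: NFKC, collapsed whitespace, case-folded."""
    text = unicodedata.normalize("NFKC", text)
    return " ".join(text.split()).casefold()


def match_strings(candidates: Iterable[str], reference: str) -> Dict[str, str]:
    """Label each candidate 'benign' if it equals the reference after
    normalization, and 'malicious' otherwise."""
    ref = normalize(reference)
    return {
        raw: ("benign" if normalize(raw) == ref else "malicious")
        for raw in candidates
    }


# Assumed reference: the standard location of the genuine svchost.exe.
REFERENCE = r"C:\Windows\System32\svchost.exe"

# Hypothetical candidate strings, e.g. image paths recovered from a dump.
candidates = [
    r"C:\Windows\System32\svchost.exe",
    r"c:\windows\system32\SVCHOST.EXE  ",
    r"C:\Users\Public\svchost.exe",
]

results = match_strings(candidates, REFERENCE)
for raw, label in results.items():
    print(f"{raw!r}: {label}")
```

Normalizing both sides before comparison means that differences in case or trailing whitespace do not cause a legitimate string to be flagged, while any difference in the path itself still does.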
4.3.3. RQ3: Is the Proposed Method Attainable in Real Environments? Discussion
The current research employs a Python-based MemCatcher script that uses a rule-based method to detect suspicious processes automatically. Executed directly on the Windows operating system, the script exposes the characteristics of in-memory malware processes before any memory dump is taken. Doing this manually is challenging, because analyzing each process individually takes significant time. In the proposed approach, whenever suspicious activity is detected, the memory is dumped, and the resulting dump data is assessed using the StringMatcher technique to determine whether it matches the specific attributes of a benign service process. The final report provides crucial forensic details about any suspicious in-memory malware processes identified in memory, helping to assess the risk and mitigate its potential impact.
Our detection pipeline runs in linear time, because each input element is processed exactly once with constant work per item: MemCatcher makes a single pass over the processes with fixed metadata queries and streamed writes, while StringMatcher normalizes each string and performs a constant-time equality check against a single reference. Thus, the overall runtime is O(N) in the number of processes N and O(T) in the total number of characters T of string data, with constant working memory required beyond the data container and a minimal hash map for mode selection.
We validate this claim through scaling experiments: runtime versus N and total characters T, linear regressions with a high goodness of fit, bootstrapped confidence intervals for runtime and throughput, and peak-memory curves that remain stable under chunked streaming. The measurements (empirical runtime: 3.51 ms; empirical memory: current = 0.013 MB, peak = 0.031 MB; data frame memory: 0.507 MB) illustrate predictable and efficient performance on large-scale memory data.
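A scaling experiment of this kind can be sketched with the Python standard library alone. The example below times a stand-in single-pass workload at several input sizes, records current and peak memory with tracemalloc, and fits a least-squares line to runtime versus N. The workload single_pass, the chosen sizes, and the helper names are hypothetical; this sketch does not reproduce the measurements reported above.

```python
import time
import tracemalloc
from typing import Callable, Sequence, Tuple


def measure(task: Callable[[], object]) -> Tuple[float, int, int]:
    """Return (elapsed seconds, current bytes, peak bytes) for one call.

    Note: tracing allocations adds overhead, so absolute times are inflated.
    """
    tracemalloc.start()
    start = time.perf_counter()
    task()
    elapsed = time.perf_counter() - start
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, current, peak


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares for y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r2


def single_pass(n: int) -> int:
    """Stand-in workload: constant work per item, streamed (no list built)."""
    return sum(1 for i in range(n) if i % 2 == 0)


sizes = [10_000, 20_000, 40_000, 80_000]
runtimes = [measure(lambda n=n: single_pass(n))[0] for n in sizes]
slope, intercept, r2 = fit_line(sizes, runtimes)
print(f"seconds per item ~ {slope:.3e}, R^2 = {r2:.3f}")
```

An R^2 close to 1 on such a fit is consistent with linear scaling, although timings on a shared machine are noisy and are better averaged over repeated runs.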
In general, as shown in Table 6, most conventional techniques attain satisfactory accuracy only after a complete memory dump and at a medium to high resource cost [25, 26, 29, 30, 31]. They are effective for post-hoc forensics but do not identify threats in real time while the system is running. The one real-time solution in the list [27] relies on hardware/VMI support, which constrains its deployability despite its minimal latency, and the studies in [32] are predominantly qualitative.
In comparison, the Proposed (Ours) row is the only one that combines high effectiveness, real-time capability (pre-dump), and low resource consumption on standard Windows systems. This stems from the design: MemCatcher takes a single, efficient snapshot of active processes and transmits only critical metadata, thereby minimizing CPU and RAM usage, while the simple StringMatcher applies deterministic normalization and precise classification of suspicious and legitimate service processes. The pipeline operates in linear time with respect to the volume of data processed, sustains a nearly constant working set through streaming, and is delivered as a single executable without the need for specialized hardware, a Python runtime, or extended acquisition periods.
In summary, MemCatcher combined with StringMatcher strikes a balance between two existing families of approaches: offline machine learning and forensics, which offer high accuracy but lack real-time capabilities, and hardware-assisted monitors, which provide real-time functionality but are cumbersome and intricate. This combination enables actionable, pre-dump detection while keeping latency and resource usage low enough for ongoing operation.
5. Conclusions and Future Work
In this paper, we introduce MemCatcher, a highly efficient rule-based in-memory malware detection technique. By focusing on process injection attacks, we demonstrate that automated analysis is a more effective means of identifying such threats than analyzing each process individually. Malicious service processes execute entirely in the system’s memory via an active parent-child structure, making them difficult to detect with traditional methods. MemCatcher addresses this challenge with a solution that we evaluated on the Windows operating system and that requires no specialized hardware. This approach significantly reduces the time and resources required for detection, and its rule-based design is intended to remain compatible with other operating systems such as Linux, iOS, and other mobile platforms.
Our research lab has provided a comprehensive dataset to validate the effectiveness of our proposed method. The results were evaluated using relevant metrics, confirming MemCatcher’s high accuracy and efficiency. This research marks a significant step towards developing a new automated system capable of addressing in-memory malware challenges. Moreover, the suggested methodology can be expanded to incorporate additional tools, such as StringMatcher, to create a unified automated strategy for examining active processes on the system. This will further enhance the overall effectiveness of our detection techniques.