An NDIR System with a Synergistic CNN-SVM Model for Discriminating CH₄ in Complex Alkane Mixtures

Zhang, Zhaoliang; Zhu, Juxiang; Pan, Fei

doi:10.3390/pr13123948

by Zhaoliang Zhang ¹,², Juxiang Zhu ¹,* and Fei Pan

¹ School of Transportation and Vehicle Engineering, Wuxi University, Wuxi 214105, China

² Jiangsu Provincial Engineering Research Center for Monitoring and Assessment of Industrial Environmental Hazardous Factors, Wuxi University, Wuxi 214105, China

* Author to whom correspondence should be addressed.

Processes 2025, 13(12), 3948; https://doi.org/10.3390/pr13123948

Submission received: 28 October 2025 / Revised: 30 November 2025 / Accepted: 3 December 2025 / Published: 6 December 2025

(This article belongs to the Section Chemical Processes and Systems)

Abstract

The selective identification of CH₄ in alkane gas mixtures remains challenging due to overlapping infrared absorption spectra among alkane species. This study introduces an algorithmic-filter approach that moves selectivity in Nondispersive Infrared (NDIR) sensing from hardware to software. Instead of relying on costly, fixed-wavelength optical filters, we employ a simplified four-source NDIR platform that deliberately captures composite spectral signals from mixed gases. A CNN-SVM hybrid model then serves as the algorithmic filter: the Convolutional Neural Network extracts discriminative features from overlapping spectra, while the Support Vector Machine performs robust classification. In tests on CH₄–ethane mixtures, the integrated system achieved 89% accuracy in identifying whether CH₄ is the principal component. By replacing expensive optical components with algorithms, this work demonstrates a cost-effective, flexible, and scalable approach to gas discrimination in spectrally congested mixtures.

Keywords:

NDIR; CH₄ identification; reference detection; discrimination system; neural network

1. Introduction

CH₄, the primary component of natural gas, is a vital energy carrier in modern industrial processes but also poses significant challenges for process safety and environmental monitoring [1,2]. Owing to its flammability, reliable CH₄ monitoring is required in industrial plants, energy infrastructure, and residential environments [3,4]. Beyond safety hazards, CH₄ is also a potent greenhouse gas and the second-largest contributor to anthropogenic warming after carbon dioxide [5,6,7,8]. In many practical process streams, CH₄ coexists with other light alkanes such as ethane and propane, which exhibit very similar physicochemical properties and, critically, strongly overlapping infrared absorption spectra [9,10]. This spectral congestion makes it difficult for conventional gas analyzers to selectively identify CH₄ in real time, yet accurate composition information is essential for process optimization, safety management, and emission control.

Beyond detecting single target gases, there is a growing need to discriminate the dominant component in complex alkane mixtures [11]. Such capability is important for monitoring reaction pathways, optimizing natural gas processing, and diagnosing early faults in oil-immersed power transformers via dissolved gas analysis [12,13,14]. While intelligent sensing and machine-learning-assisted process control have advanced rapidly in recent years [15], most gas sensing studies still focus on improving the detection accuracy of individual gases under relatively simple conditions. The selective identification of CH₄ as the principal component in multicomponent alkane environments remains comparatively underexplored.

Among available techniques for industrial gas monitoring, nondispersive infrared (NDIR) spectroscopy is particularly attractive because of its wide dynamic range, long-term stability, and independence from ambient oxygen, which distinguish it from electrochemical or semiconductor-based sensors [16]. However, the selectivity of traditional NDIR sensors relies on spectrally separated absorption bands that are isolated by narrowband optical filters, with the gas concentration then inferred from the measured attenuation via the Beer–Lambert law [17]. When target gases have closely spaced or overlapping absorption features, as in the case of CH₄ and other light alkanes, single-wavelength NDIR configurations suffer from severe cross-interference and often require expensive, application-specific interference filters. These constraints limit both the performance and scalability of conventional NDIR systems for component discrimination in complex industrial gas mixtures.
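For reference, the cross-interference problem can be read directly from the Beer–Lambert law written for a mixture. The symbols below are introduced here for illustration and are not taken from the paper's notation: $I_0(\lambda)$ is the incident intensity, $L$ the absorption path length, and $\alpha_j(\lambda)$ and $c_j$ the absorption coefficient and concentration of species $j$.

```latex
I(\lambda) = I_0(\lambda)\,
  \exp\!\Big(-L \sum_{j} \alpha_j(\lambda)\, c_j\Big)
\quad\Longrightarrow\quad
-\frac{1}{L}\ln\frac{I(\lambda)}{I_0(\lambda)}
  = \alpha_{\mathrm{CH_4}}(\lambda)\, c_{\mathrm{CH_4}}
  + \alpha_{\mathrm{C_2H_6}}(\lambda)\, c_{\mathrm{C_2H_6}}
```

A measurement at a single wavelength yields only the weighted sum on the right-hand side, so when both absorption coefficients are non-zero at that wavelength, the two concentrations cannot be separated from one reading alone.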

To address this limitation, we propose in this work a software-defined algorithmic filter for CH₄ discrimination in alkane mixtures. Instead of relying on fixed narrowband optical filters, a four-source, single-detector NDIR architecture is employed to generate a rich composite absorption signal. Three measurement wavelengths probe different parts of the alkane absorption band, while a fourth reference wavelength with minimal gas absorption provides a baseline for compensating non-selective effects. The resulting multi-wavelength time-domain signals are transformed into compact feature vectors using Fast Fourier Transform (FFT) and reference-channel normalization. These features are then processed by a hybrid Convolutional Neural Network–Support Vector Machine (CNN–SVM) model, where the CNN automatically learns high-level representations and the SVM performs robust classification of the dominant component.

Using CH₄–ethane mixtures as a representative and challenging case, we demonstrate that the proposed system can reliably discriminate whether CH₄ is the principal component of the mixture, achieving an overall identification accuracy of 89%. This study thus shifts the burden of selectivity from costly optical hardware to flexible computational algorithms and illustrates a generalizable approach for algorithmic selectivity in spectrally congested gas sensing problems. The main contributions of this work can be summarized as follows:

1. A custom hardware architecture was developed, employing a four-source, single-detector configuration. This multi-wavelength strategy produces a rich, composite absorption signal without the need for optical filters.

2. A hybrid intelligent model was proposed, combining a Convolutional Neural Network (CNN) for automated spectral feature extraction with a Support Vector Machine (SVM) for robust classification. This model demonstrated superior performance compared to conventional machine learning approaches.

3. Fast Fourier Transform (FFT) was applied to raw sensor outputs to extract discriminative frequency-domain amplitude features, providing a robust representation for downstream classification tasks.

4. A comprehensive dataset of mixed alkane gases was collected, and experiments confirmed that the proposed system achieves a CH₄ identification accuracy of 89%, thereby validating its effectiveness for real-world sensing applications.

2. Proposed Detection System

Conventional single-wavelength NDIR sensors perform well when the absorption bands of the target and background gases are well separated, but they suffer from severe cross-interference in alkane mixtures such as methane–ethane, where the infrared spectra are strongly overlapping. As a result, commercial analyzers are generally not able to identify the principal component of an unknown alkane mixture without prior compositional information.

To address this limitation, we propose a multi-wavelength differential NDIR scheme combined with a software-defined algorithmic filter. The core of the system is a four-source, single-detector configuration. Three infrared LEDs are selected to probe different regions of the alkane absorption band, whereas a fourth reference LED operates at a wavelength with negligible gas absorption. The three measurement channels provide complementary sensitivity to variations in mixture composition, while the reference channel supplies a baseline for correcting non-selective effects such as source drift or particulate scattering. Instead of aiming at direct spectral interpretation, the system is designed to encode the gas composition into a rich composite signal that can be decoded by machine-learning algorithms.

The optical components share a common-path configuration, i.e., all four beams pass through the same gas chamber and are detected by the same photodetector. This arrangement minimizes systematic errors due to hardware mismatches among channels and ensures that the recorded signals are highly correlated and thus well suited for differential analysis. The overall concept of the proposed NDIR platform is illustrated in Figure 1.

In operation, the four infrared LEDs are sequentially driven by a microcontroller at a fixed modulation frequency. For each active source, the transmitted radiation passes through the mixed-gas chamber and is converted into an electrical signal by the photodetector. The signal is then conditioned by an analog front-end circuit (amplification and anti-alias filtering) and digitized by the on-board ADC. The digitized samples are finally transferred to a host computer, where signal processing and feature extraction are performed, and the resulting feature vectors are fed into the CNN–SVM classifier described in Section 5.

This architecture is intentionally kept hardware-minimal: it uses inexpensive broadband LEDs, a single detector, and simple electronics, while shifting the burden of selectivity from narrowband optical filters to the software-defined algorithmic filter implemented in the data-processing and classification pipeline. The system architecture is shown in Figure 2.

3. Hardware System

The hardware was designed with a singular goal: to reliably produce a high-fidelity, multi-wavelength absorption signal optimized for subsequent analysis by our neural network model. Every design choice serves to enhance the quality and discriminatory power of the data fed into the algorithm.

3.1. Optical Subsystem

The selection of the detection elements is primarily determined by the infrared absorption characteristics of the target gas. CH₄ exhibits a strong absorption band near 3.4 µm, which partially overlaps with the absorption bands of other alkane gases (e.g., ethane, propane) that often coexist with CH₄ [18,19]. Consequently, the optical design must ensure high sensitivity within this spectral region, which directly informs the choice of both the infrared (IR) light source and the detector.

Light Source and Detector Selection: To capture the nuances of this spectral region, we selected an LED array integrating four distinct chips (Lms34LED, Lms35LED, Lms36LED, and Lms37LED) and a corresponding photodiode (Lms36PD-05) with a sensitivity cutoff around 3.6 µm. This specific combination ensures that our system is highly sensitive to the characteristic absorption features of CH₄ while also gathering spectral information from adjacent bands. This multi-point spectral sampling is precisely what provides the rich input required for the CNN to extract subtle discriminatory features that a single-wavelength system would miss.

Four-Source, Single-Detector Design: A key design choice was the “four-source, single-detector” common-path configuration. This approach is instrumental for the algorithm’s success because it ensures all four wavelength signals travel through the identical optical path and are processed by the same electronics. This inherently minimizes inter-channel inconsistencies and cancels out systematic errors, guaranteeing that the subtle differences in the signal are due to gas absorption, not hardware variability. This provides a clean, consistent, and reliable dataset for the algorithm.

Gas Chamber: The effective absorption path length, determined by the gas chamber, directly impacts the signal-to-noise ratio. Through experimental optimization, a chamber length of 80 mm was found to provide a sufficient absorption signal for robust detection without significant attenuation. The chamber was fabricated using 3D-printed resin for its chemical inertness and ease of prototyping.

3.2. Circuit Subsystem

The circuit subsystem was engineered to precisely control the light sources and to amplify and filter the weak photodetector signal with maximum fidelity, ensuring the signal fed to the ADC is a clean representation of the optical absorption.

System Control: A microcontroller (MCU) (STM32F407ZET6) serves as the control core, orchestrating the timing of the LEDs and the acquisition of the detector signal.

Light Source Drive and Signal Modulation: To enable subsequent lock-in amplification—a crucial technique for extracting weak signals from noise—the MCU generates a 10 Hz Pulse Width Modulation (PWM) signal. This signal, via a simple driver circuit, sequentially drives the four infrared LEDs. The 10 Hz modulation frequency was deliberately chosen as an optimal trade-off: it is low enough to allow for effective sampling and to avoid high-frequency harmonic distortions, yet high enough to be well above the typical low-frequency (1/f) noise floor of the electronics. This modulation strategy is a fundamental prerequisite for achieving the high signal-to-noise ratio needed by the subsequent analytical algorithm.

Signal Conditioning: The raw output from the photodetector is a minuscule current signal (on the order of µA or nA) that must be carefully processed. The signal conditioning chain was designed with several key functions in mind:

Current-to-Voltage Conversion: A transimpedance amplifier (TIA) performs the initial conversion of the weak photocurrent into a usable voltage signal.

Gain Adjustment: To accommodate a wide dynamic range of gas concentrations, a multi-stage amplifier with a digitally controlled potentiometer provides adjustable gain. This ensures the signal properly fills the input range of the ADC without clipping.

Noise Filtering: To maximize the signal-to-noise ratio, a bandpass filter centered at the 10 Hz modulation frequency was implemented. This filter is critical as it selectively passes the desired signal frequency while rejecting broadband noise, effectively isolating the absorption information from unwanted interference.

The final conditioned analog signal is then sampled by the MCU’s integrated ADC for digital processing. The efficacy of this entire signal chain, particularly the filtering, was validated through AC analysis. This multi-step conditioning process ensures that the final digital data is a clean and accurate representation of the gas absorption, ready for feature extraction and classification.

4. Signal Processing and Feature Vector Construction

4.1. Raw Signal Pre-Processing and Denoising

The digitized output of the NDIR system is a periodic waveform whose fundamental frequency corresponds to the modulation frequency of the infrared LEDs. In addition to the useful signal component, the waveform contains broadband electronic noise and low-frequency drift originating from environmental and hardware variations. Direct time-domain analysis is therefore sub-optimal for robust pattern recognition.

To enhance the signal-to-noise ratio and isolate the gas-related information, we adopt a frequency-domain approach based on the Fast Fourier Transform (FFT) [20,21]. For each channel, an FFT is applied to the sampled time series, and the amplitude at the modulation frequency is extracted as the primary feature. This operation effectively suppresses aperiodic noise and DC drift, providing a more stable and discriminative input for the subsequent classifier.
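The extraction step described above can be illustrated with a minimal sketch. It evaluates a single DFT bin at the modulation frequency, which gives the same value as reading that bin from a full FFT. The sampling rate, record length, and synthetic waveform are assumptions chosen for illustration; only the 10 Hz modulation frequency comes from Section 3.2.

```python
import cmath
import math


def amplitude_at(samples, fs, f0):
    """Amplitude of the f0 component of `samples`, via a single-bin DFT."""
    n = len(samples)
    # Correlate the signal with a complex exponential at f0.
    acc = sum(
        x * cmath.exp(-2j * math.pi * f0 * k / fs)
        for k, x in enumerate(samples)
    )
    # Scale by 2/N so that a pure sinusoid A*sin(2*pi*f0*t) returns A.
    return 2.0 * abs(acc) / n


# Hypothetical acquisition settings (not stated in the paper).
FS = 1000.0   # sampling rate in Hz
F_MOD = 10.0  # LED modulation frequency in Hz (Section 3.2)
N = 1000      # one second of data, i.e. 10 full modulation periods

# Synthetic channel: 0.8 V component at 10 Hz, plus a DC offset and slow drift.
signal = [
    0.8 * math.sin(2 * math.pi * F_MOD * k / FS) + 0.3 + 0.05 * k / N
    for k in range(N)
]

amp = amplitude_at(signal, FS, F_MOD)  # close to 0.8 despite offset and drift
```

Choosing a record length that spans a whole number of modulation periods keeps the DC offset out of the 10 Hz bin, which is why the offset barely affects the recovered amplitude in this sketch.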

4.2. Feature Vector Construction

For each gas sample, the control system sequentially excites the three measurement LEDs and the reference LED, yielding four time-domain signals that are processed as described above. After FFT, the amplitudes at the modulation frequency from the four channels form a raw four-dimensional feature vector. To compensate for common-mode variations such as changes in overall source intensity, detector aging, or losses in the optical path, the amplitudes of the three measurement channels are normalized by the amplitude of the reference channel. This step results in a final three-dimensional feature vector that is more robust against non-selective disturbances and more sensitive to changes in the relative composition of the gas mixture.

These normalized feature vectors constitute the input to the CNN–SVM model introduced in Section 5.
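A minimal sketch of the reference-channel normalization described in this section is given below. The amplitude values are made up for illustration, and placing the reference channel last (index 3) is an assumption based on the text calling it the fourth LED.

```python
def normalize_by_reference(amplitudes, ref_index=3):
    """Divide each measurement-channel amplitude by the reference amplitude."""
    ref = amplitudes[ref_index]
    if ref <= 0:
        raise ValueError("reference amplitude must be positive")
    # Keep only the measurement channels; the reference itself is dropped.
    return [a / ref for i, a in enumerate(amplitudes) if i != ref_index]


# Hypothetical amplitudes (V): three measurement LEDs followed by the reference LED.
raw = [0.42, 0.35, 0.50, 0.70]
features = normalize_by_reference(raw)

# A 10 % common-mode drop in intensity scales all four channels equally,
# so the normalized three-dimensional feature vector is unchanged.
dimmed = [0.9 * a for a in raw]
features_dimmed = normalize_by_reference(dimmed)
```

The second call shows the purpose of the reference channel: a disturbance that affects all channels by the same factor cancels in the ratio.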

5. CNN-SVM Model

To effectively discriminate the principal component in mixed alkane gases from the processed spectral data, a hybrid intelligent model that synergistically combines a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM) was developed. This section details the theoretical underpinnings of each component and the architecture of the integrated CNN-SVM model.

5.1. Convolutional Neural Network

Convolutional Neural Networks (CNNs) are a class of deep neural networks particularly renowned for their ability to automatically and adaptively learn spatial hierarchies of features from data [22,23,24,25]. While traditionally applied to image processing, their architecture is highly effective for extracting discriminative patterns from sequential or multi-channel data, such as the feature vectors generated by our NDIR system.

A typical CNN architecture consists of an input layer, one or more convolutional layers, pooling layers, and an output layer [26]. The key innovation of CNNs lies in the convolutional layers, which employ local connectivity and shared weights. Neurons in a convolutional layer are connected only to a local region of the input, allowing the network to learn localized patterns. By sharing weights across the entire input field, the network significantly reduces the number of trainable parameters (w, b), which accelerates training and mitigates the risk of overfitting. The structure of a fundamental CNN model is illustrated in Figure 3.

5.2. Support Vector Machine

To benchmark and complement the neural network’s performance, the Support Vector Machine (SVM), a powerful and widely used supervised learning algorithm, was employed. SVM is particularly effective for classification tasks, demonstrating high accuracy, robust performance with small-to-medium sized datasets, and excellent generalization capabilities by being less prone to overfitting than other models like decision trees.

The core principle of SVM is to find an optimal hyperplane that maximally separates data points of different classes in a high-dimensional feature space. This hyperplane is determined by identifying the support vectors—the data points closest to the decision boundary. The objective of the SVM algorithm is to maximize the margin, which is the distance between the hyperplane and the nearest support vector from either class. A schematic of SVM classification is shown in Figure 4.

For a linearly separable dataset, the distance from a point $x$ to the decision boundary $w^{T} x + b = 0$ is given by:

$$\mathrm{dis}(x, b, w) = \left| \frac{w^{T}}{\|w\|} (x - x') \right| = \frac{1}{\|w\|} \left| w^{T} x + b \right|,$$

(1)

where $x'$ is any point on the boundary, so that $w^{T} x' = -b$. For a correctly classified sample $x_{i}$ with label $y_{i} \in \{-1, +1\}$ and feature mapping $\phi$, the absolute value can be removed, which yields:

$$\mathrm{dis} = \frac{y_{i} \left( w^{T} \phi(x_{i}) + b \right)}{\|w\|},$$

(2)

The optimization objective is therefore to maximize the margin, i.e., the distance from the boundary to the nearest sample:

$$\underset{w, b}{\arg\max} \left\{ \frac{1}{\|w\|} \underset{i}{\min} \left[ y_{i} \left( w^{T} \phi(x_{i}) + b \right) \right] \right\},$$

(3)

Because rescaling $w$ and $b$ by a common factor does not change the hyperplane, they can be scaled so that the constraint $y_{i} \left( w^{T} \phi(x_{i}) + b \right) \geq 1$ holds for every sample. With this constraint, Equation (3) reduces to $\underset{w, b}{\arg\max} \frac{1}{\|w\|}$, so it is only necessary to choose $w$ and $b$ such that $\frac{1}{\|w\|}$ reaches its maximum. To simplify the optimization, maximizing $\frac{1}{\|w\|}$ is transformed into minimizing $\|w\|$, which is finally written as $\underset{w, b}{\min} \frac{1}{2} \|w\|^{2}$.

To handle noisy data points that would otherwise degrade the decision boundary, slack variables $\xi_{i} \geq 0$ are introduced so that the constraint is relaxed to $y_{i} \left( w^{T} \phi(x_{i}) + b \right) \geq 1 - \xi_{i}$. This is called soft-margin optimization, and the objective function becomes $\underset{w, b}{\min} \frac{1}{2} \|w\|^{2} + C \sum_{i = 1}^{n} \xi_{i}$. When $C$ tends to infinity, the classification is strict and no errors are allowed; conversely, when $C$ is small, greater error tolerance is permitted. To handle data that are not linearly separable, a kernel function is introduced to map the samples into a higher-dimensional space, so that classes that cannot be separated in the original low-dimensional space become separable there.
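The role of the penalty parameter $C$ in the soft-margin objective can be made concrete with a small numerical sketch. It evaluates the objective $\frac{1}{2}\|w\|^{2} + C \sum_i \xi_i$ for a fixed linear boundary, using the smallest slack that satisfies each constraint, $\xi_i = \max(0, 1 - y_i(w^{T}x_i + b))$. The data points and the hyperplane are hypothetical, and the feature mapping is taken as the identity for simplicity.

```python
def soft_margin_objective(w, b, xs, ys, C):
    """Return (objective, slacks) for a linear soft-margin SVM."""
    slacks = []
    for x, y in zip(xs, ys):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        # Smallest slack satisfying y_i (w.x_i + b) >= 1 - xi_i with xi_i >= 0.
        slacks.append(max(0.0, 1.0 - margin))
    objective = 0.5 * sum(wi * wi for wi in w) + C * sum(slacks)
    return objective, slacks


# Toy 2-D data (hypothetical). The last point lies on the wrong side of the boundary.
xs = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (0.5, 0.0)]
ys = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0

obj_small_c, slacks = soft_margin_objective(w, b, xs, ys, C=0.1)
obj_large_c, _ = soft_margin_objective(w, b, xs, ys, C=100.0)
```

With the same misclassified point, a small $C$ adds little to the objective, so the violation is tolerated, whereas a large $C$ makes the same violation dominate the cost and pushes the optimizer toward a boundary with fewer errors.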

5.3. CNN-SVM

To capitalize on the distinct strengths of both models, this study proposes a hybrid CNN-SVM architecture. This approach leverages the CNN’s superior capability for automated feature learning and the SVM’s robustness in performing classification, particularly in high-dimensional feature spaces. It has been demonstrated that replacing the standard fully connected and softmax layers of a CNN with an SVM classifier can improve generalization and overall classification performance [27].

The architecture of our proposed hybrid model is depicted in Figure 5. In this configuration, the CNN component functions as a sophisticated feature extractor. The input feature vectors, derived from the NDIR sensor signals, are processed through a series of convolutional and ReLU activation layers. Instead of proceeding to a fully connected layer for classification, the output from the final convolutional or pooling layer is flattened into a one-dimensional vector. This vector, which represents a rich set of high-level features learned by the CNN, is then fed directly as input to an SVM model, which performs the final classification task.

The workflow of the hybrid model is outlined in Figure 6. The process involves two main stages:

Feature Extraction: A CNN is first trained on the dataset. Through this training, the network’s convolutional layers learn to identify and extract the most salient features that distinguish between the different gas mixture compositions.

Classification: The feature maps generated by the trained CNN are then used as the input dataset to train the SVM classifier. The SVM then learns the optimal decision boundary within this learned feature space to perform the final classification.
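The data flow of the two stages above can be sketched as follows. This is only an illustration of the forward pass (convolution, ReLU, flatten, then an SVM decision), not the authors' implementation. The kernels are untrained, the SVM is taken as linear for brevity, and all numerical values and the sign convention for the class labels are assumptions.

```python
def conv1d_valid(x, kernel, bias=0.0):
    """1-D 'valid' cross-correlation, as used in CNN convolutional layers."""
    k = len(kernel)
    return [
        sum(kernel[j] * x[i + j] for j in range(k)) + bias
        for i in range(len(x) - k + 1)
    ]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def extract_features(x, kernels):
    """Apply each kernel, then ReLU, and flatten all feature maps into one vector."""
    flat = []
    for kernel in kernels:
        flat.extend(relu(conv1d_valid(x, kernel)))
    return flat


def svm_decision(features, w, b):
    """Linear SVM decision: Class 1 (CH4-dominant) if w.f + b >= 0, else Class 2."""
    score = sum(wi * fi for wi, fi in zip(w, features)) + b
    return 1 if score >= 0 else 2


# Hypothetical values throughout: a normalized 3-D feature vector,
# two untrained kernels of width 2, and made-up SVM parameters.
x = [0.60, 0.50, 0.70]
kernels = [[1.0, -1.0], [0.5, 0.5]]
features = extract_features(x, kernels)  # flattened vector of length 4
label = svm_decision(features, w=[1.0, 1.0, 1.0, 1.0], b=-1.0)
```

In a trained system the kernels would come from the CNN training stage and the SVM parameters from fitting on the flattened feature maps; here they are fixed by hand only to show how the vector passes from one stage to the next.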

6. Results and Analysis

This section presents the experimental validation of the proposed NDIR system and the synergistic CNN-SVM model. It begins by detailing the composition of the dataset used for training and evaluation. Subsequently, it provides a comparative performance analysis of several machine learning models, starting with benchmark algorithms and culminating with the proposed hybrid model. The analysis confirms the superior discriminatory capability of the CNN-SVM architecture for identifying the principal component in mixed alkane gases.

6.1. Experimental Dataset and Pre-Processing

To validate the proposed system, a dataset was generated using mixtures of CH₄ and ethane. These were selected as representative alkanes due to their significant spectral overlap and industrial relevance, which pose a challenging discrimination problem.

The dataset encompasses six distinct mixture scenarios designed to thoroughly evaluate the model’s discriminatory capabilities. Pure gas samples include 80 measurements of 100% CH₄ (±0.5% uncertainty) and 80 measurements of 100% ethane (±0.5% uncertainty), establishing baseline performance benchmarks. For CH₄-dominant mixtures with large margins, we prepared 80 samples containing 80% CH₄ and 20% ethane (±1% uncertainty), while ethane-dominant mixtures with large margins consisted of 80 samples with 20% CH₄ and 80% ethane. To test the system’s sensitivity near decision boundaries, we included challenging near-parity mixtures: 90 samples of CH₄-dominant small margin mixtures (60% CH₄, 40% ethane) and 90 samples of ethane-dominant small margin mixtures (40% CH₄, 60% ethane). The absorption information of the four light sources under several different gas samples is shown in Figure 7.

Our experimental dataset consists of 500 measurements, carefully balanced with 250 CH₄-dominant samples and 250 ethane-dominant samples to prevent classification bias. The dataset was partitioned using an 80/20 split. The classification task was defined as a binary problem: to identify whether CH₄ (labeled as Class 1) or ethane (labeled as Class 2) was the principal component in the gas sample.
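The dataset arithmetic above can be checked with a short sketch that rebuilds the label list from the six scenarios and applies an 80/20 split. The scenario counts and class labels are taken from this section; the random shuffle and the fixed seed are assumptions, since the paper does not state how the split was drawn.

```python
import random

# (description, CH4 %, ethane %, number of samples), as listed in Section 6.1.
SCENARIOS = [
    ("pure CH4", 100, 0, 80),
    ("pure ethane", 0, 100, 80),
    ("CH4-dominant, large margin", 80, 20, 80),
    ("ethane-dominant, large margin", 20, 80, 80),
    ("CH4-dominant, small margin", 60, 40, 90),
    ("ethane-dominant, small margin", 40, 60, 90),
]

# Label 1: CH4 is the principal component. Label 2: ethane is.
samples = [
    (name, 1 if ch4 > c2h6 else 2)
    for name, ch4, c2h6, count in SCENARIOS
    for _ in range(count)
]

n_class1 = sum(1 for _, label in samples if label == 1)
n_class2 = len(samples) - n_class1

# 80/20 split; shuffling with a fixed seed is an assumption for reproducibility.
rng = random.Random(0)
rng.shuffle(samples)
split = int(0.8 * len(samples))
train, test = samples[:split], samples[split:]
```

The counts reproduce the stated totals: 500 samples, 250 per class, and 400 training versus 100 test samples under an 80/20 split.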

6.2. Performance of Benchmark Classification Models

To establish a performance baseline, the dataset was first used to train and evaluate several established machine learning algorithms.

6.2.1. Backpropagation Neural Network (BPNN)

A conventional Backpropagation Neural Network (BPNN) was implemented as an initial benchmark. Through empirical testing, the optimal network architecture was determined to comprise three hidden layers with 6, 3, and 3 neurons, respectively. The model was trained to classify whether CH₄ was the principal component.

The performance of the trained BPNN is summarized by the confusion matrices in Figure 8. The overall classification accuracy on the test set was 77.3%. The Receiver Operating Characteristic (ROC) curve analysis, shown in Figure 9, yields an Area Under the Curve (AUC) value significantly above 0.5, indicating that the classifier’s performance is substantially better than random chance and provides a valid benchmark.

6.2.2. Standalone CNN and SVM Models

Further benchmarking was conducted using standalone Convolutional Neural Network (CNN) and Support Vector Machine (SVM) models.

The CNN model, leveraging its inherent capability for automated feature extraction from structured data, demonstrated a notable improvement over the BPNN. As shown in the confusion matrix in Figure 10, the CNN achieved a classification accuracy of 87% on the test set. This result highlights the advantage of convolutional layers in identifying salient patterns within the multi-channel sensor data.

Similarly, an SVM classifier was trained on the same feature vectors. The SVM, a powerful algorithm known for its robust performance in high-dimensional spaces, yielded a competitive test accuracy of 86% (Figure 11). This underscores the effectiveness of kernel-based methods for this classification task and confirms the feature vectors contain sufficient discriminatory information.

6.3. Performance of the Proposed CNN-SVM Hybrid Model

The proposed hybrid CNN-SVM model, designed to synergistically combine the feature extraction prowess of a CNN with the robust classification of an SVM, was then evaluated. In this architecture, the trained convolutional layers serve as an adaptive feature extractor, and the resulting high-level feature maps are classified by the SVM.

As detailed in the confusion matrix in Figure 12, this hybrid model achieved a superior test accuracy of 89%. This result validates the central hypothesis of this study: decoupling the feature learning and classification stages into specialized components enhances overall discriminatory power.

6.4. Comparison with Other Methods

A comparative analysis against other traditional machine learning algorithms, including Logistic Regression, Bayesian classifiers, Decision Trees, and K-Nearest Neighbors (KNN), further solidified the superiority of the proposed model. As summarized in Table 1, the CNN-SVM model consistently outperformed all other tested algorithms, establishing its efficacy for this challenging gas discrimination task.

The above results show that the CNN-SVM model achieves near-perfect identification for pure gases and for mixtures with a clearly dominant component. However, the rate of misclassification increases when the concentrations of CH₄ and ethane approach parity. This observation suggests that while the learned features are highly discriminative for imbalanced mixtures, the subtle spectral differences in near-equal compositions represent the primary frontier of the classification challenge. Nonetheless, the 89% accuracy achieved represents a significant advancement in using NDIR systems for complex alkane mixture analysis.

7. Summary

To address the difficulty of identifying the principal component of mixed alkane gases, this work adopts nondispersive infrared (NDIR) gas detection and develops a reference-based detection scheme with four light sources and a single detector, together with the corresponding optical and circuit subsystems. Using CH₄ and ethane as the test gases, a hybrid model combining the CNN and SVM algorithms was built to determine whether CH₄ is the principal component. The model reaches a classification accuracy of 89%, which provides a useful reference for further work.

It is important to note that the method proposed in this article is intended as a supplemental monitoring tool or a first-stage screening device in a multi-layered safety system, not a standalone safety tool. In high-risk industrial environments, such as underground mining operations, petrochemical facilities, or confined spaces where CH₄ accumulation poses a risk of explosion, higher accuracy is required given the potentially catastrophic consequences.

To address the gaps in practical deployment in safety-critical scenarios, the following optimization strategies will be considered in the future:

a. Enhanced Training Strategy: Implementing data augmentation techniques and collecting larger and more diverse datasets encompassing extreme environmental conditions (temperature variations, humidity, and pressure changes) will improve model robustness.

b. Ensemble Approach: Developing a voting system that combines multiple machine learning models with different architectures to reduce classification uncertainty and provide a confidence metric for each prediction.

c. Hybrid Redundancy: Combining algorithmic approaches with selective hardware redundancy.

The value of this manuscript lies in demonstrating the feasibility of replacing expensive optical filters with intelligent algorithms. This represents a significant advancement, making NDIR technology more accessible to a wide range of industrial monitoring applications where cost-effectiveness is important and moderate accuracy is acceptable. Future work will extend the dataset to include additional alkane species such as propane and butane, as well as more complex multi-component mixtures, to further validate the scalability and robustness of the proposed algorithmic filter approach across a broader range of industrial gas analysis scenarios.

Author Contributions

J.Z.: Conception, visualization, modeling, writing, and revision of the manuscript. Z.Z.: Writing, modeling, and revision of the manuscript. F.P.: Visualization, computation and writing of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

Thank you to the reviewers for their suggestions on this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ijzermans, R.; Jones, M.; Weidmann, D.; van de Kerkhof, B.; Randell, D. Long-term continuous monitoring of CH₄ emissions at an oil and gas facility using a multi-open-path laser dispersion spectrometer. Sci. Rep. 2024, 14, 623.
2. Hu, K.; Liu, B.; Pang, Y.; Li, R.; Zhang, Q.; Shi, J.; Liu, H. H₂ and CH₄ adsorption on coal: Insights from experiment and mathematical model. Int. J. Hydrogen Energy 2025, 120, 542–557.
3. Selman, J.; Spickett, J.; Jansz, J.; Mullins, B. An investigation into the rate and mechanism of incident of work-related confined space fatalities. Saf. Sci. 2018, 109, 333–343.
4. Qiao, Z.; Li, Y.; Miao, Q.; Ma, H.; Xu, L.; Li, R. Experimental Investigation of Obstruction Effects on C₃H₈/H₂ Hybrid Fuel Explosion Dynamics in Semiconfined Pipelines. ACS Omega 2025, 10, 35954–35964.
5. Saunois, M.; Stavert, A.R.; Poulter, B.; Bousquet, P.; Canadell, J.G.; Jackson, R.B.; Raymond, P.A.; Dlugokencky, E.J.; Houweling, S.; Patra, P.K.; et al. The Global CH₄ Budget 2000–2017. Earth Syst. Sci. Data Discuss. 2020, 12, 1561–1623.
6. Ho, Q.D.; Rauls, E. Cavity Size Effects on the Adsorption of CO₂ on Pillar[n]arene Structures: A Density Functional Theory Study. ChemistrySelect 2023, 8, e202302266.
7. Ho, Q.D.; Rauls, E. Ab initio study: Investigating the adsorption behaviors of polarized greenhouse gas molecules on pillar[5]arenes. Mater. Today Commun. 2023, 36, 106875.
8. Ho, Q.D.; Rauls, E. Investigations of Functional Groups Effect on CO₂ Adsorption on Pillar[5]arenes Using Density Functional Theory Calculations. ChemistrySelect 2024, 9, e202401490.
9. Alrefae, M.; Es-sebbar, E.T.; Farooq, A. Absorption cross-section measurements of CH₄, ethane, ethylene and methanol at high temperatures. J. Mol. Spectrosc. 2014, 303, 8–14.
10. Li, X.; Wang, H.; He, Y.; Gao, Z.; Zhang, X.; Wang, Y. Active Thermography Nondestructive Testing Going Beyond Camera’s Resolution Limitation: A Heterogenous Dual-Band Single-Pixel Approach. IEEE Trans. Instrum. Meas. 2025, 74, 1–8.
11. Dehnaw, A.M.; Lu, Y.-J.; Shih, J.-H.; Yao, C.-K.; Bitew, M.A.; Peng, P.-C. Deep Neural Network Optimization for Efficient Gas Detection Systems in Edge Intelligence Environments. Processes 2024, 12, 2638.
12. Mharakurwa, E.T.; Nyakoe, G.N.; Akumu, A.O. Power Transformer Fault Severity Estimation Based on Dissolved Gas Analysis and Energy of Fault Formation Technique. J. Electr. Comput. Eng. 2019, 2019, 9674054.
13. Yu, Q.; Tang, Z.; Zhang, X.; Fan, B.; Zhang, Z.; Chen, Z. Seismic Performance of Fully Prefabricated L-Shaped Shear Walls with Grouted Sleeve Lapping Connectors under High Axial Compression Ratio. Appl. Sci. 2023, 13, 2301.
14. Ye, W.; Tu, Z.; Xiao, X.; Simeone, A.; Yan, J.; Wu, T.; Wu, F.; Zheng, C.; Tittel, F.K. A NDIR Mid-Infrared CH₄ Sensor with a Compact Pentahedron Gas-Cell. Sensors 2020, 20, 5461.
15. Wang, C.-C.; Chien, C.-H. Machine Learning for Industrial Optimization and Predictive Control: A Patent-Based Perspective with a Focus on Taiwan’s High-Tech Manufacturing. Processes 2025, 13, 2256.
16. Jha, R.K. Non-Dispersive Infrared Gas Sensing Technology: A Review. IEEE Sens. J. 2022, 22, 6–15.
17. Shen, C.-H.; Wu, J.-J. A New Electro-Optical-Thermal Modelling for Non-Dispersive IR Sensing Technique of Gas Concentration. Appl. Sci. 2022, 12, 7772.
18. Zheng, C.; Pi, M.; Song, F.; Li, Y.; Peng, Z.; Guan, G.; Zhang, L.; Ma, Y.; Min, Y.; Ye, W.; et al. Recent Progress in Infrared Absorption Spectroscopy for Gas Sensing with Discrete Optics, Hollow-Core Fibers and On-Chip Waveguides. J. Light. Technol. 2023, 41, 4079–4096.
19. Romanini, D.; Chenevier, M.; Kassi, S.; Schmidt, M.; Valant, C.; Ramonet, M.; Lopez, J.; Jost, H.J. Optical-feedback cavity-enhanced absorption: A compact spectrometer for real-time measurement of atmospheric CH₄. Appl. Phys. B 2006, 83, 659–667.
20. Jiang, W.; Li, P.; Chang, Z.; Yuan, L.; Bai, Y. Research on the Application of Fast Fourier Transform in Harmonic and Inter-Harmonic Detection of New Power System. In Proceedings of the 2024 International Conference on the Frontiers of Electronic, Electrical and Information Engineering (ICFEEIE), Huangshan, China, 21–23 June 2024; pp. 139–144.
21. Zhang, Q.; Jeong, W.; Kang, D.J. Lock-in amplifiers as a platform for weak signal measurements: Development and applications. Curr. Appl. Phys. 2024, 66, 95–109.
22. Xu, J.; Li, X.; Lu, W.; Wei, X.; Chen, G.; Li, Y. A heterogeneous two-layer graph convolution model for turning traffic prediction with missing data. Transp. B Transp. Dyn. 2025, 13, 2497941.
23. Gao, J.; Wang, C.; Hao, Y.; Liang, X.; Zhao, K. Prediction of TC11 single-track geometry in laser metal deposition based on back propagation neural network and random forest. J. Mech. Sci. Technol. 2022, 36, 1417–1425.
24. Xu, J.; Lu, W.; Li, Y.; Zhu, C.; Li, Y. A multi-directional recurrent graph convolutional network model for reconstructing traffic spatiotemporal diagram. Transp. Lett. 2023, 16, 405–415.
25. Jin, L.; Song, E.; Zhang, W. Denoising Color Images Based on Local Orientation Estimation and CNN Classifier. J. Math. Imaging Vis. 2020, 62, 505–531.
26. Xu, J.; Li, X.; Lu, W.; Rakotonirainy, A.; Li, Y. A trajectory-conditional generative adversarial network model for missing vehicle trajectory imputation. Phys. A Stat. Mech. Its Appl. 2025, 676, 130881.
27. Gnanamalar, A.J.; Bhavani, R.; Arulini, A.S.; Veerraju, M.S. CNN–SVM Based Fault Detection, Classification and Location of Multi-terminal VSC–HVDC System. J. Electr. Eng. Technol. 2023, 18, 3335–3347.
28. de Menezes, F.S.; Liska, G.R.; Cirillo, M.A.; Vivanco, M.J.F. Data classification with binary response through the Boosting algorithm and logistic regression. Expert Syst. Appl. 2017, 69, 62–73.
29. Bielza, C.; Li, G.; Larrañaga, P. Multi-dimensional classification with Bayesian networks. Int. J. Approx. Reason. 2011, 52, 705–727.
30. Gupta, B.; Rawat, A.; Jain, A.; Arora, A.; Dhami, N. Analysis of various decision tree algorithms for classification in data mining. Int. J. Comput. Appl. 2017, 163, 15–19.
31. Zhang, S.; Li, X.; Zong, M.; Zhu, X.; Wang, R. Efficient kNN Classification with Different Numbers of Nearest Neighbors. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 1774–1785.
32. Zhang, D.; Lou, S. The application research of neural network and BP algorithm in stock price pattern classification and prediction. Future Gener. Comput. Syst. 2021, 115, 872–879.
33. Vaidya, J.; Yu, H.; Jiang, X. Privacy-preserving SVM classification. Knowl. Inf. Syst. 2008, 14, 161–178.
34. Lu, J.; Tan, L.; Jiang, H. Review on Convolutional Neural Network (CNN) Applied to Plant Leaf Disease Classification. Agriculture 2021, 11, 707.

Figure 1. Structure of NDIR gas reference detection.

Figure 2. System architecture diagram.

Figure 3. The structure of a fundamental CNN.

Figure 4. Schematic of SVM classification.

Figure 5. Network structure.

Figure 6. Algorithm flow chart.

Figure 7. Detection results of different gas samples using different light sources.

Figure 8. Confusion matrix of training data.

Figure 9. The results of ROC.

Figure 10. Confusion matrix results for CNN classification.

Figure 11. The confusion matrix of prediction results.

Figure 12. Classification confusion matrix.

Table 1. Algorithm classification accuracy.

Machine Learning Algorithms	Classification Accuracy
Logistic regression [28]	0.8009
Bayesian [29]	0.8405
Decision Trees [30]	0.7118
KNN [31]	0.8009
BPNN [32]	0.7730
SVM [33]	0.8600
CNN [34]	0.8700
CNN-SVM	0.8900
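For readers who want to reproduce this kind of comparison, overall classification accuracy is the share of correctly classified samples, which equals the sum of the confusion-matrix diagonal divided by the total sample count. The short sketch below shows that calculation; the counts are hypothetical and are not data from this study.

```python
from typing import List


def accuracy(confusion: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples divided by all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts for illustration only (not results from this study).
# Rows are true classes, columns are predicted classes, ordered [CH4, other].
example = [
    [40, 10],
    [5, 45],
]

acc = accuracy(example)  # 85 correct out of 100 samples
```

The same function applies to a multi-class confusion matrix of any size, since only the diagonal and the grand total are used.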

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
