1. Introduction
Symmetry plays a fundamental role in signal representation and analysis. Many industrial signals exhibit symmetrical properties in the time or frequency domains, and such structures are often preserved in classical linear spectrogram representations. However, the presence of noise, operational variability, or transient anomalies can disrupt these symmetries. In this work, modifications for the symmetric representation of power spectrograms using nonlinear quantization strategies are proposed. By changing the uniformity (symmetry) of the quantization levels, selected signal components can be emphasized, allowing a better view of the underlying processes.
A spectrogram represents the changes in the signal spectrum over time and forms the basis of time–frequency signal analysis. The types and methods of obtaining spectrograms are well studied in [
1]. They are commonly used for examining short-term spectra of non-stationary or transient signals. Frequency-domain signal processing algorithms can be found in [
2]. In other words, when analyzing signals whose characteristics vary rapidly over time, it is often useful to consider the frequency content of short segments of the signal. This approach allows the formulation of a more general spectrum as a two-dimensional function dependent not only on frequency but also on time position.
While the classical Fourier transform operates with signals of infinite duration, practical analysis always relies on finite-length signal segments.
The objective here differs from that of classical frequency analysis, where the goal is to determine the amplitude and phase of harmonic components with unlimited duration. Instead, the aim is to achieve the most accurate localization of the occurrence of signal components in both the frequency and time domains. This different objective also implies a different methodology for designing observation windows, or in other words, selecting the length of the signal segment.
In classical analysis, it is advantageous to extend the segment length to minimize distortion. However, in time–frequency analysis, the observation window length is determined by a compromise between two conflicting requirements: sufficient frequency resolution (the smallest discernible frequency difference is inversely proportional to the signal length) and high temporal resolution (the smallest discernible time difference is directly proportional to the signal length). Spectrograms effectively capture signals’ time and frequency characteristics [
3,
4].
In practice, one of these requirements usually dominates, and the signal length, or more specifically the number of samples N at a given sampling period $T_s$, must be adjusted accordingly. By applying the discrete Fourier transform (DFT) to such defined signal segments, short-time spectra are obtained, characterizing the frequency content and phase relationships of the signal in the respective time window [
5,
6,
7].
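To make the windowing trade-off concrete, the following minimal sketch computes a short-time power spectrogram by sliding a window over the signal and taking the squared magnitude of each segment's DFT. It is written in Python with NumPy; the function name, window choice, and parameters are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of a short-time power spectrogram (illustrative, not the
# authors' implementation). The window length n_window trades frequency
# resolution against time resolution, as discussed in the text.
import numpy as np

def power_spectrogram(x, n_window=512, hop=256, fs=1.0):
    """Return times, frequencies, and the power spectra of windowed segments."""
    window = np.hanning(n_window)
    n_frames = 1 + (len(x) - n_window) // hop
    spec = np.empty((n_window // 2 + 1, n_frames))
    for j in range(n_frames):
        segment = x[j * hop : j * hop + n_window] * window
        X = np.fft.rfft(segment)        # short-time DFT of one segment
        spec[:, j] = np.abs(X) ** 2     # power of each frequency component
    freqs = np.fft.rfftfreq(n_window, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + n_window / 2) / fs
    return times, freqs, spec

# Example: two tones at 518 Hz and 1487 Hz, sampled at 8000 Hz
fs = 8000
t = np.arange(0, 2.0, 1 / fs)
x = np.sin(2 * np.pi * 518 * t) + 0.5 * np.sin(2 * np.pi * 1487 * t)
times, freqs, S = power_spectrogram(x, fs=fs)
print(S.shape)  # (frequency bins, time positions)
```

Doubling n_window halves the smallest discernible frequency difference but doubles the time span over which each spectrum is smeared, which is exactly the compromise described above.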
In the evaluation of processes that generate signals of varying characteristics, time–frequency analysis can be beneficial in examining changes in spectral properties. It reveals the influence of process parameter changes, nonlinearities, or inconsistencies. A significant advantage is that the spectrogram provides visual information, enabling the use of specific evaluation techniques, such as geometric and morphological methods.
Within the scope of this paper, two signal spectrogram processing algorithms were investigated: linear and nonlinear quantization of the power spectra.
The objective of experimenting with quantizers was to explore visualization strategies for spectrograms that would highlight certain specific properties of industrial signals. These properties are often indicative of the current operational modes of complex processes.
The main objective of this research is to design and analyze nonlinear quantizers for spectrogram visualization, aiming to enhance the sensitivity to spectral components relevant for industrial diagnostics. The performance is evaluated both qualitatively and quantitatively, using entropy-based metrics and histograms.
2. Review of Related Literature
Signal processing research has witnessed substantial progress in recent years, particularly in the areas of time–frequency analysis, spectral representation, and machine learning-based interpretation. Numerous studies have explored advanced transformations, deep learning models, and quantization techniques to improve the accuracy, efficiency, and interpretability of signal representations in various domains such as radar sensing, biomedical diagnostics, audio classification, and industrial monitoring. This section provides an overview of recent contributions relevant to our work, highlighting key methodological innovations and positioning our study within the broader context of spectral signal analysis.
Ma and Chen [
1] proposed an innovative approach to radar target recognition based on micro-Doppler signals. Their method employs a one-dimensional cascaded denoising architecture comprising two main components. First, a Denoising Autoencoder (DAE) is used to suppress noise in the micro-Doppler signals, enhancing the quality of input data for subsequent classification. Second, a dedicated classification module identifies target types based on features extracted from the denoised signals. Their system was validated using both simulated and real radar measurements and achieved high accuracy in recognizing various target types. In contrast, our study focuses on the quantization of power spectra and the design of nonlinear quantizers, aimed at improving the visualization and interpretability of spectral data. Specifically, we model quantizer transfer functions using linear and quadratic functions to tailor resolution across different spectral regions, and we analyze quantization errors to assess their impact on spectral fidelity. Unlike the deep learning-based approach of [
1], which processes time-domain signals for classification, our work emphasizes mathematical modeling and the analytical study of quantization in the context of signal processing. Both studies contribute to the advancement of the field but approach the problem from complementary angles—machine learning-driven recognition versus analytical enhancement of signal representation.
Recently, Ghobber et al. [
2] introduced an extension of the classical Gabor transform by integrating it with the Dunkl–Bessel transform. This hybrid approach enables the analysis of signals while preserving inherent symmetries found in certain physical and mathematical systems. The authors formulated a new version of the Gabor transform associated with the Dunkl–Bessel framework, effectively combining the local time–frequency analysis of the Gabor transform with the symmetry-aware structure of the Dunkl–Bessel operator. They conducted a detailed theoretical investigation of the new transform’s properties, including linearity, time–frequency resolution, and reconstruction capabilities—features critical to signal processing applications. Furthermore, they applied the transform to generate spectrograms that visually represent the time–frequency content of signals, with a specific emphasis on preserving symmetrical structures. In contrast to [
2], our study focuses on optimizing quantization processes to enhance the visual representation of spectral information. While both approaches contribute to improved time–frequency analysis, our work concentrates on quantizer design and evaluation rather than transform theory and symmetry exploitation.
In another recent study, Orozco et al. [
3] proposed an innovative approach to heart sound classification by combining deep learning with joint time–frequency representations. The work highlights the effectiveness of deep learning in bioacoustic analysis and explores advanced feature extraction techniques to enhance classification accuracy. Specifically, the authors employed multiple time–frequency transformations, including the short-time Fourier transform (STFT) and wavelet transform, to extract relevant features from heart sound recordings. These features were then processed using convolutional neural networks (CNNs) to identify different types of heart sounds. To further improve model performance, preprocessing techniques such as normalization and segmentation were applied to the raw audio signals. The proposed method was evaluated using publicly available heart sound databases and demonstrated high classification accuracy. While this study focuses on biomedical audio classification using deep learning, our research aims to improve the visual interpretability of spectral data through optimized quantization techniques. Both works underscore the importance of high-quality spectral representations but pursue this goal from different methodological perspectives.
Scarpiniti et al. [
4] presented an innovative method for arrhythmia detection based on the fusion of time–frequency representations of electrocardiogram (ECG) signals. Their approach leverages both scalograms (the magnitude of the continuous wavelet transform; CWT) and phasograms (the phase of the same transform) to enhance classification performance. The core methodology involves applying CWT to ECG signals to generate complementary time–frequency images, which are then analyzed using convolutional neural networks (CNNs). The study explores multiple fusion strategies—at the input level, intermediate network layers, and the output layer—to effectively integrate magnitude and phase information. Experimental evaluation on the MIT-BIH Arrhythmia Database showed that the combination of scalogram and phasogram data significantly improved classification accuracy compared to using magnitude-only inputs. While this research highlights the power of deep learning and feature fusion in biomedical signal classification, our work pursues a different objective: optimizing quantization processes to enhance the spectral representation and visual interpretability of industrial signals. Nevertheless, both studies underscore the importance of robust time–frequency representations in modern signal processing applications.
Liu et al. [
5] proposed a method for accurately estimating the frequency of real-valued sinusoidal signals using the discrete Fourier transform (DFT). Their work focuses on enhancing frequency resolution without increasing computational burden. The method exploits the relationship between the amplitudes of adjacent DFT coefficients to analytically derive frequency estimates, accounting for the effects of windowing and spectral interpolation. By avoiding traditional techniques such as zero-padding or time-domain interpolation, the approach achieves high accuracy with significantly reduced computational complexity. This makes it particularly suitable for applications requiring real-time or resource-constrained signal analysis. While Liu et al. address precision frequency estimation in narrowband signals, our study centers on improving spectral data representation through optimized quantization methods. The two approaches highlight different but complementary directions in spectral signal processing—accurate estimation versus visual and structural enhancement.
Zhang et al. [
7] proposed a novel approach to channel estimation in wireless communication systems by combining the discrete Fourier transform (DFT) and discrete wavelet transform (DWT). Their algorithm integrates the strengths of both transforms to improve estimation accuracy and computational efficiency. Specifically, DFT is applied to transform the received signal into the frequency domain, enabling the identification of dominant spectral components and facilitating the initial estimation of the channel’s characteristics. Subsequently, DWT is used to denoise the estimated channel impulse response, enhancing precision by suppressing unwanted noise. Simulation results demonstrate that the proposed method outperforms conventional techniques such as least squares (LS) and minimum mean square error (MMSE) estimators, achieving higher accuracy with lower computational complexity. While their work targets communication channel modeling, our research addresses the visual and analytical enhancement of spectral representations through quantization. Both approaches reflect the growing role of hybrid transformation techniques in modern signal processing.
Kim et al. [
8] investigated the detection of abnormal symptoms using deep learning techniques applied to acoustic spectrograms. The proposed system employs short-time Fourier transform (STFT) to convert audio signals into spectrograms, which are then analyzed using convolutional neural networks (CNNs). The process involves three key stages: data acquisition and preprocessing, where recorded audio signals are transformed into time–frequency representations; deep learning-based classification, in which CNNs are trained to distinguish between normal and abnormal spectrogram patterns; and performance evaluation using metrics such as accuracy, sensitivity, and specificity. This approach enables automated detection of acoustic anomalies and is applicable in domains such as health monitoring and industrial inspection. While Kim et al. focus on anomaly detection through supervised learning on spectrogram images, our study aims to enhance the interpretability of spectrograms themselves through nonlinear quantization, thereby supporting improved visual analysis and potential integration with diagnostic systems.
In their recent study, Chen et al. [
9] proposed a classification framework based on multimodal processing of audiovisual information. The system integrates audio and visual data streams to enhance classification accuracy in complex scenarios. The methodology consists of four main stages: data acquisition and synchronization of audio and video signals; feature extraction using techniques such as Mel-frequency cepstral coefficients (MFCCs) for audio and visual descriptors for video frames; data fusion through convolutional neural networks (CNNs) to combine complementary information from both modalities; and model training and evaluation using standard performance metrics including accuracy, sensitivity, and specificity. This approach is particularly effective in applications where both sound and visual cues are available, such as behavior monitoring or health diagnostics. While Chen et al. [
9] leverage multimodal fusion to improve classification, our work focuses on the enhancement of spectral representations through quantization techniques, particularly targeting improved interpretability and resolution in signal visualization tasks.
Chen et al. [
10] explored a novel deep learning-based approach for predicting the remaining useful life (RUL) of rolling element bearings using time–frequency image features. The authors proposed an end-to-end adaptive framework that leverages time–frequency representations of vibration signals to enhance predictive performance. The process begins with signal transformation using methods such as short-time Fourier transform (STFT) to convert raw vibration data into time–frequency images. These images are then analyzed using convolutional neural networks (CNNs) to extract salient features relevant to bearing degradation. Finally, regression models are applied to estimate the RUL based on the extracted features. The approach enables more accurate and efficient RUL prediction, which is essential for predictive maintenance strategies and reducing repair costs. While this work focuses on prognostics in mechanical systems through deep learning and time–frequency feature engineering, our study emphasizes enhancing the quality and interpretability of spectral representations through quantization, supporting broader diagnostics and visualization tasks in industrial signal processing.
In their recent study, Chen et al. [
11] introduced an innovative approach to sound classification in noisy environments by combining advanced deep learning techniques. The proposed method integrates an enhanced Patch-Mix Transformer architecture with contrastive learning to improve model robustness under high-noise conditions. The Patch-Mix Transformer processes audio signals by segmenting them into smaller patches and encoding them through a Transformer-based network, enabling the model to effectively capture relevant information even in the presence of significant background noise. Contrastive learning is employed to help the model distinguish between sound classes by emphasizing the differences between similar and dissimilar samples, thereby enhancing generalization and discriminative performance. The model was trained and evaluated on benchmark audio data sets with varying noise levels, achieving superior classification accuracy compared to existing approaches. While Chen et al. focus on improving auditory classification under adverse acoustic conditions, our work targets the optimization of spectrogram-based representations through quantization techniques, which can complement classification tasks by providing clearer, more structured spectral input data.
Liu et al. [
12] investigated the optimization of Detection Transformer (DETR) models through efficient integer quantization. Their method focuses on compressing DETR-based object detection models by converting both weights and activations from floating-point to integer representations. This transformation significantly reduces memory usage and inference time, making the models more suitable for deployment on resource-constrained platforms such as mobile devices and edge computing systems. The approach maintains competitive detection accuracy while substantially decreasing model size, enabling practical application of Transformer-based detectors outside of high-performance computing environments. In contrast to their focus on integer quantization for deep learning model deployment, our study centers on the mathematical design of quantizers to enhance the representation of spectral information in signal processing. While both works explore the benefits of quantization, Liu et al. apply it to improve runtime efficiency, whereas our research leverages it to improve interpretability and fidelity in spectral data visualization.
Yao et al. [
13] proposed a novel approach to nonlinear quantization of synthetic-aperture radar (SAR) images, aiming to improve the signal-to-noise ratio (SNR) and enhance segmentation performance. Their method introduces quantization functions specifically adapted to the statistical characteristics of SAR data, effectively reducing quantization noise and increasing the SNR. The quantization parameters are optimized to maximize the SNR, resulting in improved image clarity and detail preservation. In addition, a segmentation strategy is incorporated to identify homogeneous regions within SAR images, enabling locally adaptive quantization tailored to different image areas. This combination of nonlinear quantization and spatial segmentation enhances both the objective and perceptual quality of SAR image processing. While Yao et al. focus on radar image enhancement and segmentation through customized quantization, our work addresses spectral data visualization in the context of signal processing, with an emphasis on the mathematical modeling and evaluation of quantizer designs. Both studies underscore the value of nonlinear quantization for improving interpretability in high-dimensional signal representations.
In summary, the reviewed literature demonstrates the growing importance of combining mathematical modeling, signal transformation, and deep learning to enhance signal analysis and interpretation. While several studies focus on classification, detection, and prediction tasks using learned features from time–frequency representations, our work emphasizes the optimization of quantization processes to improve the visual clarity and structural fidelity of spectral data. This complementary perspective contributes to a more robust foundation for signal diagnostics, especially in industrial and monitoring applications.
4. Proposal of Methods
The spectrogram of a signal is a sequence of spectra calculated consecutively. In signal processing, power spectra are typically computed. Therefore, the spectrogram of a general signal can be expressed as a time sequence of its power spectra (20):

$$S = \{P_0, P_1, \ldots, P_{J-1}\} \qquad (20)$$

These power spectra have the structure of vector elements (21):

$$P_j = [p_{0,j}, p_{1,j}, \ldots, p_{n-1,j}]^{T} \qquad (21)$$

The components $p_{k,j}$ of the power spectrum represent ensemble mean values associated with the j-th time instant. These values are estimated based on the average over a finite number N of signal realizations (22):

$$p_{k,j} = E\{|X_{k,j}|^2\} \approx \frac{1}{N}\sum_{i=1}^{N} \left|X_{k,j}^{(i)}\right|^2 \qquad (22)$$

where $E\{\cdot\}$ is the mean value (mathematical expectation), taking statistical averages into account; $|X_{k,j}|^2$ is the power of the complex Fourier component, i.e., the spectral energy at frequency $f_k$; and N is the number of realizations. The result represents the time–frequency content of the signal for the frequency component $f_k$.
Here, $X_{k,j}^{(i)}$ denotes the k-th Fourier coefficient of the spectrum corresponding to the i-th realization of the signal at the time step j, and n is the number of samples in that realization.
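As an illustration of the ensemble estimate in Equation (22), the following sketch averages the squared DFT magnitudes over N realizations; the synthetic realizations and all names are assumptions for demonstration only.

```python
# Hedged sketch of Equation (22): the power at frequency bin k is estimated
# as the average of |X_k|^2 over N realizations of the signal segment.
import numpy as np

def averaged_power_spectrum(realizations):
    """realizations: array of shape (N, n) -- N segments of n samples each."""
    X = np.fft.rfft(realizations, axis=1)    # DFT coefficients per realization
    return np.mean(np.abs(X) ** 2, axis=0)   # ensemble mean of spectral power

rng = np.random.default_rng(0)
N, n = 50, 256
t = np.arange(n)
# N noisy realizations of the same underlying tone at 0.1 cycles/sample
realizations = np.sin(2 * np.pi * 0.1 * t) + 0.5 * rng.standard_normal((N, n))
p = averaged_power_spectrum(realizations)
print(p.argmax())  # ~ bin 26, i.e. 0.1 * 256: averaging suppresses the noise
```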
The spectrogram can be understood as a highly integrative source of information; in [8,9], spectrograms were presented as a powerful information source for analysis.
The graphical representation of a spectrogram takes the form of a matrix of image elements $g_{k,j}$, where the brightness level corresponds to the power level of the k-th frequency component of the signal at the j-th time instant.
The conversion of the spectrogram into a matrix of image elements with appropriate brightness levels is performed by a converter, implemented as a specialized algorithm. They are used in diagnostics for classifying machine equipment failures [
10,
14]. They are a tool in the processing of speech and sound [
11,
15,
16]. It is important to note that this involves the conversion of a spectrogram with finite resolution in time, frequency, and amplitude (i.e., a result of processing a time- and amplitude-discrete signal) into a 2D image with finite spatial and brightness resolution (i.e., a binary-coded raster image) [
17,
18,
19]. The converter, in this sense, consists of two subsystems: a quantizer and a coder (see
Figure 4).
Figure 4 shows a block diagram of a converter that transforms a set of spectral data into coded outputs. This process is divided into two main stages: quantization and coding.
Step 1: Quantization—“quantizer” block:
The input data set $P$ is quantized, i.e., each input value is assigned a discrete representative $q_i$.
The quantization is based on the training set $R$ of decision levels, which determines the quantization intervals.
The result of quantization is a representative from the quantized data set $Q$.
Step 2: Coding—“coder” block:
The quantized value is further processed by the coder, which transforms it into a discrete code.
The output set $B$ contains discrete codes, which can be binary, hexadecimal, or another format depending on the application.
The overall function of the converter block performs the following:
Transforms spectral inputs into a compact encoded form suitable for further processing in classification, data transmission, or storage.
Reduces the size of input data, simplifies representation, and increases efficiency.
This type of scheme is typical in the field of cluster analysis or automatic classification systems, where complex signals such as spectrograms are first quantized and then encoded.
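A minimal sketch of this quantize-then-encode chain is given below; the uniformly spaced decision levels and the 8-bit grayscale output are illustrative assumptions, not the specific converter of Figure 4.

```python
# Sketch of the two-stage converter: a quantizer maps continuous spectral
# power to one of M discrete levels, and a coder maps each level index to
# a brightness code (0 = black ... 255 = white).
import numpy as np

def converter(power, decision_levels, n_bits=8):
    """Quantize power values, then encode each level index as a gray value."""
    # Step 1 (quantizer): index of the nearest decision level
    idx = np.argmin(np.abs(power[..., None] - decision_levels), axis=-1)
    # Step 2 (coder): injective map from level index to brightness code
    M = len(decision_levels)
    brightness = (idx * (2 ** n_bits - 1)) // (M - 1)
    return idx, brightness.astype(np.uint8)

levels = np.linspace(0.0, 1.0, 16)              # 16 uniform decision levels
power = np.random.default_rng(1).random((4, 4)) # stand-in spectral data
idx, img = converter(power, levels)
print(img)  # encoded brightness matrix, ready for rendering or storage
```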
The elements $p_{k,j}$ of the power spectrum of a general signal take real values within the interval $[0, \infty)$. This infinite set of real numbers can be denoted by the symbol $P$.
Based on a predefined transfer characteristic (see Figure 5), the quantizer assigns a corresponding quantized power level to the theoretically continuous value $p_{k,j}$, which represents the power of the k-th frequency component at the j-th time instant.
This assigned quantization level is selected by the quantizer from an $M$-element discrete set $Q = \{q_1, q_2, \ldots, q_M\}$ containing all distinguishable quantization levels. The assignment is based on a set of suitably defined decision levels $R = \{r_1, r_2, \ldots, r_M\}$ [12,13].
Each quantization level (and, accordingly, its corresponding brightness level) is associated with a specific range of measured power values on the transfer characteristic. This range is always defined by a triplet of decision levels: $r_{i-1}$, $r_i$, and $r_{i+1}$.
The quantizer algorithm then selects the decision level $r_i$ that is closest to the measured power value $p_{k,j}$, according to condition (23):

$$|p_{k,j} - r_i| = \min_{m} |p_{k,j} - r_m| \qquad (23)$$
Subsequently, the corresponding quantization level is assigned. The purpose of this power quantization is to convert the amplitude-continuous power values into amplitude-discrete values with a given number of resolution levels.
The described quantizer algorithm can be better understood based on the graphical representation of its transfer function
(see
Figure 5). The properties of the quantizer are determined by the distances (differences) between the individual decision levels. In general, these distances may vary across the entire range of the transfer function.
If the distance between two neighboring decision levels $r_i$ and $r_{i+1}$ is denoted by the symbol $\Delta_i$, referred to as the quantization step, then the following relation holds (24):

$$\Delta_i = r_{i+1} - r_i \qquad (24)$$
The condition (23) for selecting the nearest decision level can then also be expressed in the form of (25):

$$r_i - \frac{\Delta_{i-1}}{2} \le p_{k,j} < r_i + \frac{\Delta_i}{2} \qquad (25)$$

Alternatively, the same condition can be reformulated as follows (26):

$$-\frac{\Delta_{i-1}}{2} \le p_{k,j} - r_i < \frac{\Delta_i}{2} \qquad (26)$$
This equation represents the quantization level assignment criterion for nonlinear quantization, where $p_{k,j}$ is the actual power value, $r_i$ are the decision levels of the quantizer, and $\Delta_i$ are quantization steps that can vary; this variation is what makes the quantizer nonlinear.
If the signal value lies between the two boundaries around $r_i$, this decision level is selected and a quantization level is assigned to it, which corresponds to a brightness level in the image.
A quantizer with nonlinear steps allows the resolution to be adjusted in different parts of the power range, for example, to enhance details at higher powers or to suppress weak noise components.
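The following sketch implements such a nonlinear quantizer in the sense of condition (23); the power-law spacing of the decision levels is one assumed choice among many possible nonlinear characteristics.

```python
# Sketch of a nonlinear quantizer: decision levels r_i with non-uniform
# steps Delta_i. A power-law spacing (an assumption) with exponent > 1
# concentrates levels at low powers (fine resolution for weak components);
# exponent < 1 concentrates them at high powers instead.
import numpy as np

def nonlinear_decision_levels(p_max, M=256, exponent=2.0):
    """Decision levels r_i = p_max * (i / (M - 1))**exponent."""
    i = np.arange(M)
    return p_max * (i / (M - 1)) ** exponent

def quantize(power, r):
    """Assign each power value the nearest decision level (condition (23))."""
    edges = (r[:-1] + r[1:]) / 2.0        # boundaries halfway between levels
    idx = np.searchsorted(edges, power)   # index of the nearest r_i
    return idx, r[idx]

r = nonlinear_decision_levels(p_max=1.0, M=8, exponent=2.0)
idx, q = quantize(np.array([0.02, 0.3, 0.95]), r)
print(np.diff(r))  # the steps Delta_i grow across the range
print(idx, q)
```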
The second subsystem of the spectrogram-to-image converter is the coder. Its task is to assign a corresponding brightness level to the quantized power $q_{k,j}$ of the k-th frequency component at time instant j for the respective spectrogram image element.
Let $B = \{b_1, b_2, \ldots, b_M\}$ be the set of all $M$ possible brightness levels (i.e., the grayscale palette) used to render the spectrogram. Then, the coder mapping K can be defined as an injective function from $Q$ to $B$ (27),

$$K: Q \to B, \qquad K(q_i) = b_i \qquad (27)$$

which assigns to each distinguishable quantized power level $q_i$ exactly one brightness level $b_i$.
The lowest quantization level is assigned the minimum brightness level (i.e., black), while the highest level is assigned the maximum brightness level (i.e., white). Thus, the role of the coder is to map the quantized power level $q_{k,j}$ produced by the quantizer to a corresponding brightness value $b_{k,j}$ for the image element.
The number of distinguishable brightness levels (and therefore quantization levels) is determined by the expression $M = 2^b$, where $b$ denotes the number of bits used to encode the brightness level of an image element, also referred to as the resolution.
From the described spectrogram quantization algorithm, it follows that one of its key parameters is the quantization step $\Delta_i$, which defines the spacing between two adjacent decision levels of the quantizer's transfer function.
In industrial practice, a linear quantizer is typically used for spectrogram rendering. In this case, the quantization step $\Delta$ is constant across the entire dynamic range of the quantizer, and the following relation holds (28):

$$\Delta_i = \Delta = \mathrm{const.} \qquad (28)$$
This constant step applies to both the decision axis and the quantization axis of the quantizer (see
Figure 6).
A constant quantization step in a linear quantizer ensures uniform sensitivity when displaying spectrograms, both at high and low power levels. However, the relative error of the display becomes greater at low signal-power levels compared to higher ones.
On the other hand, a nonlinear quantizer does not maintain a constant quantization step
across the entire range of its decision axis. This allows for selective sensitivity: certain signal power levels can be displayed with higher precision, and others with lower sensitivity (see
Figure 7).
If the graphical representation of the spectrogram of the signal is defined as a matrix $G$ of size $n \times m$ (Figure 8), then the elements $g_{k,j}$ of matrix $G$ represent the brightness levels assigned to the corresponding image elements based on the previously described converter algorithm. Thus, for the matrix elements (29),

$$g_{k,j} = K(q_{k,j}) \qquad (29)$$
Figure 8 shows a time–frequency representation of a signal, namely a spectral matrix or spectrogram, which visualizes how the energy of a signal varies with time and frequency. This type of visualization is typically used to analyze nonlinear, non-stationary, or multi-component signals.
Brighter levels indicate higher power at a given frequency and time. Visible horizontal bands indicate the presence of dominant frequencies that are stable in time (e.g., in the bands around 1 Hz, 4 Hz, 8 Hz, 13 Hz, and 18 Hz).
The dominant frequencies can represent fundamental harmonic signals or the system’s resonant frequencies.
Such a spectrogram can be transformed (e.g., by quantization) and encoded as input to classification algorithms (as indicated in the previous diagram).
This spectrogram visualizes the spectral matrix $G$. It highlights how the signal’s frequency content varies with time and serves as a basis for compression, quantization, and signal classification algorithms.
Just like in analog-to-digital converters, it is also possible to define quantization error parameters for the spectrogram-to-image converter.
During the quantization of power $p_{k,j}$, an absolute quantization error is introduced, defined as the difference between the assigned decision level $r_i$ and the true (ideal) value $p_{k,j}$ (Figure 9). Thus, the absolute quantization error is (30):

$$\epsilon = r_i - p_{k,j} \qquad (30)$$

It can be concluded that the maximum possible value of this error depends on which decision level the quantized value is closest to, since the corresponding quantization steps $\Delta_i$ may vary. The maximum absolute quantization error that may occur is known as the quantization error $\epsilon_{\max}$ of the quantizer. This error can be either positive or negative. Its magnitude is given by (31):

$$\epsilon_{\max} = \pm \frac{\Delta_i}{2} \qquad (31)$$
Around each assigned decision level $r_i$, an uncertainty band of width $2\epsilon_{\max}$ emerges. This is generally referred to as a source of quantization noise.
The relative quantization error $\delta$ is defined as the ratio between the quantization error and the corresponding decision level. Its value is given by (32):

$$\delta_i = \frac{\epsilon_{\max}}{r_i} = \frac{\Delta_i}{2 r_i} \qquad (32)$$
This equation defines the relative quantization error, where $\epsilon$ is the absolute quantization error, i.e., the difference between the true value and the assigned decision level, and $\delta$ is the relative error that expresses how significant this difference is relative to the magnitude of the value itself.
Relative error is essential in visualization: the same absolute difference at a small power value can visually indicate a significant distortion. Therefore, for some applications, keeping the relative error constant is advantageous, which leads to a logarithmic or inverse distribution of quantization steps.
It is important to note that the magnitude of the quantization error depends on the corresponding quantization step $\Delta_i$. Similarly, the value of the relative quantization error depends on the ratio between the quantization step and the decision level.
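A short numerical sketch makes the difference between the two error measures visible; the linearly and logarithmically spaced level sets below are illustrative, but they show why a logarithmic step distribution keeps the relative error constant, as noted above.

```python
# Sketch of Equations (31)-(32): maximum absolute error Delta_i / 2 and
# relative error Delta_i / (2 r_i), evaluated for two level distributions.
import numpy as np

def error_profile(r):
    delta = np.diff(r)          # quantization steps Delta_i
    abs_err = delta / 2.0       # maximum absolute error per interval
    rel_err = abs_err / r[1:]   # error relative to the decision level
    return abs_err, rel_err

linear = np.linspace(1e-3, 1.0, 64)     # constant step (linear quantizer)
logarithmic = np.logspace(-3, 0, 64)    # constant ratio between levels
for name, r in (("linear", linear), ("logarithmic", logarithmic)):
    _, rel = error_profile(r)
    print(f"{name:12s} relative error: low power {rel[0]:.3f}, "
          f"high power {rel[-1]:.5f}")
# linear spacing: the relative error explodes at low powers;
# logarithmic spacing: the relative error is the same in every interval.
```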
In a computer’s memory, the spectrogram is stored as a binary-coded image. This opens the possibility of evaluating the spectrogram as an image using digital image processing techniques.
6. Experiments and Results
When analyzed as images, spectrograms can be characterized by two parameters: brightness resolution and spatial resolution.
The brightness resolution of multi-level static images is generally $b$ bits/pixel (number of bits per picture element), where $b > 1$, while the number of brightness levels is $M$. Thus, $M = 2^b$. Standard multi-level images have a brightness resolution of 8 bits/pixel, i.e., the number of brightness levels is 256. In special applications, $b < 8$ can be used, e.g., in spectrograms. If a high quality of the reconstructed image is to be achieved, it is necessary to choose $b > 8$. The influence of brightness resolution on image quality is shown in the individual spectrograms from industrial processes generating signals.
For the number of bits $b = 1$, this is the case of a binary image, and therefore a spectrogram with $M = 2$ brightness levels. If the number of bits is $b = 8$, the image has 256 brightness levels.
The spatial resolution of multi-level static images is most often given as the size of the matrix $m \times n$, where $m$ is the number of image elements in the horizontal direction and $n$ is the number of lines or image elements in the vertical direction. The standard resolution used in industrial practice is 256 × 256 or 512 × 512. However, images with other resolutions are also used in exceptional cases. Deterioration of image quality is already observable at a resolution of 128 × 128. At a resolution of 16 × 16, image recognition is lost.
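The effect of spatial resolution can be reproduced with a short block-averaging sketch; the test pattern and sizes are illustrative stand-ins for a measured spectrogram.

```python
# Sketch of the spatial-resolution experiment: reduce an image to coarser
# grids by averaging non-overlapping factor x factor blocks.
import numpy as np

def downsample(img, factor):
    """Reduce resolution by averaging factor x factor blocks."""
    h, w = img.shape
    img = img[: h - h % factor, : w - w % factor]
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

img = np.add.outer(np.arange(256), np.arange(256)) % 256  # 256 x 256 pattern
for size in (256, 128, 16):
    print(size, downsample(img, 256 // size).shape)
# At 128 x 128 the degradation becomes observable; at 16 x 16 the image
# content is no longer recognizable, as stated above.
```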
One option for influencing or emphasizing individual amplitudes in the spectrogram is the choice of brightness levels, specifically the width or range of the individual levels, as characterized in the previous section. The range of brightness levels can be chosen based on the transfer characteristics. This study presents the results of processing a spectrogram with linear and nonlinear transfer characteristics. The individual characteristics, spectrograms, and results with applications are described in more detail below.
6.1. Linear and Nonlinear Characteristics of the Spectrogram of a Deterministic Signal
The time course of the deterministic signal is shown in
Figure 12a, and the original-color unprocessed spectrogram in
Figure 12b. Its color palette is based on the RGB color model. This spectrogram was converted to a spectrogram with 256 brightness levels of gray for the experiments. Brightness levels with equal range present a linear characteristic. The shape of such a linear characteristic is shown in
Figure 13a, and the resulting processed spectrogram in
Figure 13b. At the same time, it is possible to identify its dominant frequencies:
f = 158, 518, 1487, 2569, and 3850 Hz, with a sampling frequency of $f_s$ = 8000 Hz.
Figure 12 contains a pair of graphs illustrating a signal and its spectral representation. This is a classic example of time–frequency signal analysis.
The time course of the signal (a) shows its evolution over time. The signal has a complicated waveform, which indicates complex system behavior (e.g., a bioelectric or acoustic signal). Therefore, advanced analysis using spectral methods, which results in spectrograms, is needed.
The signal’s spectrogram (b) shows the time–frequency spectrum obtained by the short-time Fourier transform (STFT). The time range is 0–10 s, and the frequency range is 0–4000 Hz. The color scale represents the frequency components’ intensity (e.g., power or amplitude) over time. In the spectrogram, horizontal frequency bands at approximately 500 Hz, 1000 Hz, 1500 Hz, 2800 Hz, and 4000 Hz are clearly visible, indicating a deterministic non-harmonic signal with multiple dominant frequencies.
The spectrogram (b) allows us to understand when and which frequencies dominate, which cannot be deduced from the time course of the signal itself (a).
Figure 12a shows the signal’s time course, while
Figure 12b visualizes its spectral structure over time. Together, they provide a complete view of the signal dynamics. Such an analysis forms the basis for further steps such as quantization, feature extraction, machine learning, and diagnostics.
Figure 13 contains two parts, labeled (a) and (b). Together, they visualize the quantization process of spectral quantities and the resulting spectral representation after processing.
Figure 13a shows a scheme for quantizing the original spectral values into discrete quantization levels. The line segment shows an ideal linear transformation between the unquantized and quantized spaces. Quantization reduces the number of values the spectrum can take, which is essential for processing (i.e., data compression, preparation for coding or classification) the spectrogram and images in general.
Figure 13b shows the signal’s spectral representation after quantization processing. This figure presents shades of gray in individual frequency bands (the lighter the shade, the higher the intensity). Several distinct frequency bands persist over time, clearly visible in the spectrogram. After quantization, subtle “band” structures can be observed, especially in the lower frequencies, resulting from reduced intensity levels (i.e., data discreteness).
Figure 13 demonstrates the chain of processing a spectral signal, where the original continuous spectrum is mapped to discrete classes. The result is a quantized spectrogram ready for compact storage or processing in machine learning systems. This is particularly suitable for fault detection in technical systems and pattern recognition in bioengineering or mechanical engineering.
For the initial spectrogram processing experiments, the linear characteristic was kept unchanged, while the brightness resolution, i.e., the number of individual colors, was varied. The spatial resolution was not changed either. The results of this experiment are presented in the following figures (see
Figure 14a–f).
It is clear from
Figure 14a–c that the individual dominant frequencies from the spectrogram can be easily identified. They are pronounced. It is less clear which are the dominant frequencies in
Figure 14d,e. The sought frequencies are next to each other. Their recognition is more complicated. If two brightness levels are used (see
Figure 14f), the frequencies close to each other completely merge or overlap. Identification is not possible. However, isolated frequency components are also clearly visible in such a spectrogram. It can therefore be stated that there is a significant loss of information. Binary quantization significantly reduces the visibility of adjacent frequency components, resulting in a loss of detail in the spectrogram.
These images (see
Figure 14a–f) are spectrograms of the original image, where the number of brightness levels used to represent it is reduced. From image (a) to (f), one can see the image’s simplification, increasing the contrast between different areas while at the same time losing fine details.
Detailed analysis of individual images:
(a) Almost smooth brightness transition; high brightness levels. Delicate structures and details are preserved, and even weak transitions are visible. This image partially resembles the original (complete 256 levels).
(b) Subtle reduction in the number of brightness levels. The transitions are slightly coarser, but details are still well recognizable.
(c) Visible loss of fine transitions; flat areas of equal brightness appear. Textures are less pronounced; the image is more segmented.
(d) Quantization is pronounced. The image is divided into clearly separated levels or areas of different brightness. Fine details are almost suppressed, and only the main large structures remain.
(e) Very low number of brightness levels. The image looks like it is composed of several homogeneous areas. Delicate structures and small details are entirely lost.
(f) Binary image: Black and white areas with two brightness levels. The image is maximally simplified; only very coarse structures are preserved.
Overall, reducing the number of gray levels leads to a gradual loss of information, with the most pronounced degradation occurring below 64 levels. These results show the impact of quantization on spectrograms’ visual quality and information content.
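This degradation can be reproduced and scored with the histogram entropy mentioned in the introduction; the sketch below requantizes an image to fewer gray levels and reports the remaining information per pixel (the random test image is a stand-in for a real spectrogram).

```python
# Sketch: requantize a 256-level image to n_levels gray values and measure
# the information content of the result via its brightness histogram.
import numpy as np

def requantize(img, n_levels):
    """Map a uint8 image (0..255) onto n_levels uniformly spaced gray values."""
    step = 256 // n_levels
    return (img // step) * step + step // 2

def histogram_entropy(img):
    """Shannon entropy of the brightness histogram, in bits per pixel."""
    counts = np.bincount(img.ravel(), minlength=256)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(2)
img = (rng.random((128, 128)) * 256).astype(np.uint8)  # stand-in spectrogram
for n in (256, 64, 16, 4, 2):
    q = requantize(img, n)
    print(f"{n:3d} levels: entropy = {histogram_entropy(q):.2f} bits/pixel")
```

On this test image the entropy drops from about 8 bits/pixel to 1 bit/pixel as the number of levels falls from 256 to 2, mirroring the visual loss of detail described above.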
With a nonlinear characteristic of brightness levels, the curve's steepness changes with the exponent of the characteristic, which is related to the ranges of the individual levels. For a steeper characteristic with an exponent of 3, the amplitudes in the spectrogram that have a higher frequency are emphasized. The brightness level range differs in that the spacing between individual intervals decreases.
Figure 15a,b shows the courses of two options with different variants of the exponent.
As illustrated in
Figure 12,
Figure 13 and
Figure 14, nonlinear quantization significantly enhances the visual discrimination of transient features, particularly in low-SNR regimes. For example, in
Figure 14d–f, linear quantization fails to separate closely spaced frequency components due to insufficient contrast in low-power regions. Conversely, the nonlinear characteristic with increased sensitivity in high-power zones (as shown in
Figure 15) improves the separation of these components, making transient events more visually prominent. This practical advantage underscores the usefulness of nonlinear gain allocation strategies in industrial diagnostics, where rapid anomaly detection is critical.
The presented results for the nonlinear spectrogram characteristic are for 256 brightness levels, because a spectrogram with these levels allows for more experimentation possibilities. The proposed nonlinear characteristic can also have the opposite shape (see Figure 16a,b).
An interesting case occurs when the nonlinear quantization characteristic is inverted, emphasizing weak spectral components instead of strong ones. This inversion modifies the resolution profile so that low-power components are visualized with higher brightness precision, while high-power components are suppressed. Two possible configurations of inverted slope quantizers are illustrated in
Figure 16a,b.
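A sketch of how such an inverted characteristic can be generated from the same power-law rule used earlier (the generator and its parameters are assumptions for illustration):

```python
# Sketch of an inverted nonlinear characteristic: mirroring the power-law
# level spacing moves the fine resolution to the opposite end of the range.
import numpy as np

def levels(p_max, M=256, exponent=3.0, inverted=False):
    u = (np.arange(M) / (M - 1)) ** exponent
    if inverted:
        u = 1.0 - u[::-1]   # mirror: dense spacing at the opposite end
    return p_max * u

fwd = levels(1.0, M=8)
inv = levels(1.0, M=8, inverted=True)
print(np.diff(fwd))  # steps grow: fine resolution for low-power components
print(np.diff(inv))  # steps shrink: fine resolution for high-power components
```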
Figure 17 presents four variants of multi-level images created by different processing methods applied to the original data. Image (a) represents the original image with medium contrast, with visible horizontal structures and fine periodic textures. The inverted version of image (b) increases the visibility of darker details on a light background, which increases the contrast of delicate structures. Image (c) represents an image with normalized brightness, where the balance of light and dark areas is achieved, thus preserving the fine texture at lower contrast. Finally, image (d) demonstrates significant thresholding, which emphasizes only the brightest horizontal bands while suppressing fine details in other parts of the image.
This series of images allows for a more comprehensive analysis of the structural properties of the processed data from different perspectives and supports a better visual interpretation of dominant phenomena.
In
Figure 17a–c, the frequencies and the different influence of the choice of the brightness level characteristic are sufficiently recognizable. With steeper characteristics (see
Figure 17d), the dominant frequencies are strongly visible. This spectrogram appears to be binary, but it is not. The influence of the proposed characteristic has reduced the insignificant brightness levels. In many industrial processes and subsequent processing of spectrograms, this is the primary and fundamental goal of processing the measured signal.
Figure 17a–d present four variants of multi-level images in which horizontal structures, i.e., frequencies, are prominent. Each image emphasizes different properties of the brightness levels. Individual analysis:
(a) Dominantly dark image with bright frequencies or frequency bands. Medium-to-high contrast between bands. Periodic elements are visible as small oscillations. The original image appears to retain much detail, with the dark intensity of the image elements dominating.
(b) Overall bright image compared to (a), with darker details retained. High contrast in darker areas. Fine structures are more apparent due to the higher background brightness. This is probably a negative (inverse image) of the original image (a).
(c) Brightness-balanced image: medium shades of gray, without a strong dominance of white or black. The contrast is lower than in (a) or (b), and the image is smoother. Fine details and periodic structures are retained. This image may be the result of contrast normalization.
(d) Image with a dark background and prominent bright horizontal bands. Extreme brightness levels between the bands and the background. Dominant frequencies outside the bright bands are practically suppressed. This is an example where only the brightest areas of the image are highlighted.
These four images present different processing of the same or very similar data:
(a) and (b) appear to be inverse versions.
(c) represents contrast optimization for better analysis of fine structures.
(d) serves to extract the most prominent features, probably to analyze the dynamic change and width of the horizontal bands.
Another option that the research has addressed in the processing of spectrograms is the design of a nonlinear but partially linear brightness characteristic. The proposed characteristics are presented in
Figure 18a,b.
The resulting spectrograms are shown in
Figure 18a,b. The processed spectrograms clearly identify the dominant frequencies. It can be said that only the parts of the spectrogram where the sensitivity of brightness levels is higher are highlighted.
When analyzing spectrograms as multi-level static images, it can be stated that the design of the individual characteristics is essential (see Figure 19a,b). Only some results of the characteristics used are given, namely those in which the brightness levels differ significantly. These are linear and nonlinear characteristics. Spectrograms with the individual brightness level characteristics were constructed based on measurements of signals from industrial practice and on modeling in laboratory conditions. They were processed using a linear transfer characteristic with a constant step over the entire range.
The following is a general description of images (a) and (b) in Figure 19.
(a) Multi-level image—left side of
Figure 19:
This image is characterized by coarse quantization into several brightness levels (probably 4 to 8).
Large areas of almost homogeneous intensity are clearly visible in the image, between which there are jumps—discontinuities caused by a limited number of levels.
Texture on the darker parts (especially in the upper and lower bands) indicates that fine details of the original image have been suppressed and “hidden” in a few discrete values.
Pronounced transitions between bands indicate dominant low-frequency components in the original image.
Fine details and noise are partially preserved, but significantly distorted by quantization.
(b) Multi-level image—right side of
Figure 19:
This image is processed differently from (a); it appears as an “inverse” or “complementary” version of the same original image.
The brightness levels are distributed from bright to dark bands, unlike (a), where dark areas dominated.
Details in the bands are more emphasized, suggesting the use of a contrast enhancement or adaptive quantization method.
The structure and periodicity of the individual bands are preserved, but with a different dynamic range of brightness.
In the brighter areas (e.g., in the upper band), a finer structure is visible, suggesting less aggressive quantization than in (a).
Comparison (a) vs. (b):
(a) has strongly suppressed dynamic changes and is characterized by pronounced edges between brightness levels.
(b) shows better-preserved intensity variability and better visible fine details in brightness bands.
(b) is more suitable for further analysis of structures and textures, while (a) might be better suited for binary sorting or region segmentation.
Figures (a) and (b) are multi-level representations of the signal image after quantization transformation. Figure (a) shows an image with strong quantization, where discontinuities between individual intensity levels are clearly visible. Due to the reduction in the number of levels, fine details have been significantly suppressed and dominant band structures with low-frequency characteristics have been emphasized. In contrast, figure (b) applies a finer or adaptive quantization, which results in better preservation of structures and fine intensity variations within individual bands.
Both images show regular horizontal periodicity, with (b) revealing more clearly the textural details of the original signal. These results document the different behavior of quantization processes and their impact on the visual and structural integrity of multi-level images.
6.2. Linear and Nonlinear Characteristics of the Spectrogram of a Stochastic Signal
A spectrogram of a stochastic signal was processed similarly, with a sampling frequency of $f_s$ = 22,050 Hz.
Figure 20a shows a segment of the time course of the stochastic signal, and
Figure 20b shows its entire original spectrogram. It should be noted that such a spectrogram is more complex than a spectrogram from a deterministic signal. It often lacks distinct dominant frequencies or frequency bands. Therefore, its processing often does not yield the expected results. However, designing linear and nonlinear brightness characteristics can make them more efficient and increase information content.
The results of the experiment with the spectrogram of the stochastic signal are presented in
Figure 21a–f.
In
Figure 21a–d, some dominant frequencies and frequency bands are clearly visible. A linear characteristic was used. Despite the fact that it is a spectrogram of a stochastic signal, it also has several significant frequencies, although in some time periods, they are lost. Information about the process or its state cannot be determined from
Figure 21e,f. If the process is strongly stochastic, the number of brightness levels must be kept at its maximum value.
The following is a scientific analysis of the multi-level images (a)–(f) in
Figure 21.
These images show multi-level spectrograms or time–frequency representations of signals, with partially distinct harmonic structures and band components. Several frequency components (labeled $f_1$, $f_2$, $f_3$, $f_4$) and gradual degradations are shown here, similar to the previous set of images.
Detailed analysis of the individual images:
(a) The four dominant frequency components ($f_1$, $f_2$, $f_3$, and $f_4$) are clearly marked with red arrows. The intensity of the individual frequencies is visible against a slightly noisy background. Some frequencies and the band are slightly interrupted, indicating a change in frequency over time (a frequency drift or chirp effect). The overall background is noisy, but the bands stand out.
(b) The structure remains preserved: frequency bands are visible without enhancement. Slightly increased visual noise compared to (a).
(c) Again, the frequency components are highlighted. Higher noise level compared to (a) and (b). The frequency components are still identifiable, but their contrast with the background is weaker.
(d) Strong noise background; more challenging to identify bands; characteristic of a stochastic signal. Clear contours between frequencies and noise are lost.
(e) Significant change in visual quality. Only two main frequencies and a frequency band are visible. The spectrogram is extremely quantized, which is caused by the low number of bits representing amplitudes.
(f) It presents a completely binary image with white and black picture elements. There is almost complete loss of visual information about dominant frequencies and bands. The spectrogram has a noise character.
As the number of brightness levels decreases, delicate structures are lost, and only significant parts of the spectrogram remain. Such multi-level processing is often used in data compression, image segmentation, and preprocessing for extracting basic shapes. The detail of the information visualization decreases rapidly, especially when going below 64 brightness levels.
A similar conclusion applies to the nonlinear characteristic: In
Figure 22a,b, dynamic changes in the process are difficult to recognize. The situation is different in
Figure 22c,d, where the course of the process and its character can be seen.
Figure 22a–d show the processing of spectral data with clearly visible harmonic components, labeled $f_1$, $f_2$, $f_3$, and $f_4$ (in figures (a) and (d) only).
(a) Original or basic spectral image with harmonic frequencies highlighted. Red arrows indicate a progressive increase in frequencies up to $f_4$, reflecting dynamic changes in the time and frequency of the harmonic series of the signal. The overall contrast is high, allowing for clear identification of individual frequency bands.
(b) Modified image variant (probably normalization or filtering to reduce noise). Reduced contrast and slightly higher background homogeneity. Harmonic structures are still present, but less pronounced than in (a).
(c) Further image modification, probably applying filtering with higher background suppression. Reducing the amplitudes of weaker components and highlighting the structure of the main harmonic bands. Acts as an intermediate step between the original and binary representation.
(d) Extracted main harmonic components on a dark background. The image highlights the frequency structures up to $f_4$ under conditions of minimal noise. This type of processing is often used to detect fundamental and higher harmonics in signal processing.
Figures (a) to (d) present the different stages of spectral image processing, with the aim of successively achieving the following:
Highlighting relevant frequency structures (harmonics).
Suppressing noise and reducing unwanted artifacts.
Enabling unambiguous identification of key frequency components.
Such processing is commonly used in signal processing, especially in the analysis of modulated signals, vibrations, acoustic signals, and technical diagnostics.
The process properties are also difficult to recognize in the case of a nonlinear, partially linear characteristic (see
Figure 23a,b). Here, some really basic parts of the spectrogram can be identified. However, the frequencies or frequency band must be sufficiently strong in the basic properties of the signal.
The following is an analysis of the multi-level images (a)–(b) in
Figure 23.
Image (a):
This multi-level image shows a spectrogram in which two dominant frequency components, labeled $f_1$ and $f_2$, are prominently marked.
The red arrows show their gradual increase in frequency over time, which indicates the dynamic nature of the signal (for example, accelerating motion or changing environmental conditions).
Another red label indicates the bandwidth of a certain frequency range, which is prominent and can be considered as the fundamental carrier component of the system or device.
The background contains noise, but the prominent frequency bands remain clearly visible, which indicates a suitable extraction of the main components.
The spectrogram is probably created by a method that captures time–frequency changes well even with high noise levels (e.g., using thresholding or localized transformation methods).
Image (b):
This image is a variant or processed version of image (a), where the noise level is clearly reduced and only the dominant frequency components are highlighted.
The $f_1$ and $f_2$ frequency bands remain present, but background noise details are suppressed, which improves the interpretability of the data.
Traces of other weaker harmonics or sidebands are also visible, indicating a more complex signal structure (e.g., modulations or frequency multiplications).
The processing method includes contrast enhancement, low-amplitude filtering, or image binarization.
This type of analysis is suitable for applications such as device diagnostics, radar processing, or ultrasonic measurements, where it is important to detect dominant frequency changes over time.
6.3. Linear and Nonlinear Characteristics of the Spectrogram of a Pulsed Signal
Another type of signal whose spectrogram has been the subject of research is a pulsed signal with a stochastic component and a given sampling frequency.
Figure 24a is its time course, and
Figure 24b is a color spectrogram. Such a type of signal is often found in industrial practice. It represents a signal periodically repeated over a short period. The results of the experiment with the spectrogram of a pulsed signal are presented in
Figure 25a–f.
In
Figure 25a–e, individual pulses and their periodicity are visible. The conclusion concerning
Figure 25f is the same as in the previous experiments with spectrograms from deterministic and stochastic signals.
The following is an analysis of the multi-level images (a)–(f) in
Figure 25.
Images (a) to (f) are multi-level spectral visualizations of repetitive signal patterns with distinct periods in the time–frequency domain. Each image shows structures that indicate the regular repetition of pulsed and periodic events.
In some images, regions of interest are marked with red rectangles that highlight specific signal segments.
The descriptions of the individual images are as follows:
(a) This image shows a series of regular pulses with characteristic structures. The red rectangle highlights a specific group of pulses in the middle part of the spectrum. Vertically oriented structures indicate repeated-frequency pulses or sharp changes in time: pulses with high-frequency content.
This allows for a clearer assessment of the spectrum’s global structure. The pulses’ regular periodicity is visible, with approximately equal spacing between the structures.
This image is very similar to (a), with the red rectangle highlighting the same area again. It differs in the detailed processing, i.e., slightly different time–frequency analysis parameters.
Image (d) corresponds to (c). It shows slightly higher noise, but the main pulse structures remain well defined. The visible vertical lines indicate that the signal contains stable, repetitive, short-frequency pulses.
This image has noticeably coarser quantization of the brightness level and reduced resolution. There is a loss of fine background details—the spectrum appears blocky. The pulses are still distinguishable, but the background is less homogeneous, with pronounced clusters of bright and dark areas.
Image (f) represents a highly binary version of the spectral image. Sharp contrasts dominate: there are black and white areas without intermediate shades. The vertical pulses are still distinct, but the background is filled with random points resembling noise. This image simulates signal degradation and pulse detection.
The series of images (a)–(d) captures regularly repeating frequency pulses in the time–frequency domain. Image (e) illustrates the effects of reduced quantization, while image (f) presents a binary version that emphasizes the main pulses at the expense of fine background structure. The overall periodicity of the pulses indicates that the analyzed signal consists of short, broadband events repeating at almost equal time intervals.
The following is an analysis of the images (a)–(d) in
Figure 26.
All four images show periodic structures (probably pulsed or pulsating phenomena) with strong vertical lines (indicating a repeating frequency spectrum).
The red highlighting in (c) and (d) marks the specific area where these repeating pulses are analyzed in more detail (perhaps to extract finer details or for localized statistical analysis).
The sequential processing (from (a) to (d)) improves the visibility of fine frequency features, especially in the weaker parts of the signal.
Multi-level display of spectrograms showing the time–frequency structure of a periodic signal:
(a) Original spectrogram showing prominent repeating transitions.
(b) Modified spectrogram with improved contrast of these transition structures.
(c) Another version with the middle components highlighted, where a red rectangle marks the selected region of interest for more detailed analysis.
(d) A high-contrast spectrogram that highlights fine structures within the highlighted region, making it easier to identify weaker time–frequency features.
When the nonlinear characteristic is applied to the impulse signal’s spectrogram, the experimental results are quite satisfactory. It can be stated that the impulse signal of the process is observable in all spectrograms (see
Figure 26a–d). Similar conclusions were reached with the other spectrogram processing (see
Figure 27a,b). The visualization shows that the nonlinear characteristic described above suppressed the low-power components of the signal (see
Figure 27a) and the high-power components of the signal (see
Figure 27b).
Scientific analysis of images (a)–(b) in
Figure 27:
The multi-level image shown is the result of quantizing the spectrogram. It shows a comparison of two visualization techniques labeled (a) and (b), each highlighting a specific area of interest with diagnostically significant features.
Image (a)—multi-level visualization:
Uses a wider palette of grayscale shades (probably more quantization levels).
Preserves relatively detailed contrast between individual signal components.
From the red box, it can be concluded that the captured shapes have higher visual sharpness and dynamics, which makes it easier to follow time–frequency changes (e.g., pulses).
Image (b)—simplified (binary or coarser quantization):
Contains significantly fewer shades, indicating a lower bit depth or binarization.
Contrast is increased, which can help with automatic processing (e.g., segmentation or shape recognition), but the amount of detail for visual inspection is reduced.
The structures in the red box are more clearly separated from the background, but less smoothly transitioned than in (a).
The visualization in (a) allows for the preservation of a rich level of detail and smooth transitions between intensities, which is advantageous for expert visual analysis. It allows for the observation of the morphology of the impulses, as well as slight variations in the time–frequency behavior of the signal, which can be critical in assessing the severity of the fault. On the contrary, image (b) provides higher contrast between significant and insignificant areas, making it more suitable for automated evaluation or segmentation using thresholding algorithms.
From the point of view of practical application in technical diagnostics, it is advisable to choose the visualization method according to the purpose, whether it is manual interpretation by an experienced diagnostician or input for machine learning algorithms. In the optimal case, visualizations can be part of a hybrid system. Multi-level visualization supports model learning and also serves as a visual feedback tool.
In the preceding text of this study, spectrograms were analyzed at different numbers of brightness levels, each defined by a specific brightness characteristic. The characteristics differed in the widths of the brightness levels. The experiments proved that there is a significant difference between linear and nonlinear characteristics, which manifested itself in the display of the individual spectrograms. The spectrograms were analyzed in detail by changing the brightness levels and increasing the range of their classes.
Reducing the number of brightness levels makes it possible to follow in more detail the development of spectrograms from the processes that generate the accompanying signal and represent an information source. The preceding figures show the gradual reduction in the number of brightness levels (from 256 through 128, 64, 32, 16, 8, and 4 down to 2) and the resulting spectrograms.
A linear characteristic, where the brightness levels are distributed evenly over the entire range, was used in the experiments. This characteristic is applied in the visualization and analysis of spectrograms. Its advantage is the effective identification of frequency components. A minor disadvantage is that in power spectra with a narrow range of components, many brightness levels are unused, which reduces the information yield of the overall spectrogram display.
The nonlinear characteristic, by contrast, uses an uneven distribution of brightness levels on the decision and quantization axes. Appropriately chosen differences between the decision levels produce a transfer characteristic whose shape ensures sufficiently sensitive visualization of the spectrogram of a specific type of signal.
The individual spectrograms make it clear that the nonlinear characteristic of brightness levels emphasizes higher amplitudes, together with their frequencies, more strongly than the linear characteristic does. With the nonlinear characteristic, the dominant frequencies are easier to distinguish in the spectrogram, while lower amplitudes, with their frequencies, are suppressed. With the linear characteristic, recognizing the dominant frequencies is more challenging.
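To make the mechanism concrete, the following minimal Python sketch implements one possible nonlinear brightness characteristic; the power-law form (parameter gamma) and the function name quantize_spectrogram are illustrative assumptions, not the exact transfer functions used in the experiments.

import numpy as np

def quantize_spectrogram(power_db, levels=16, gamma=2.0):
    """Illustrative sketch: map spectrogram power to brightness levels.

    gamma = 1.0 yields a linear (symmetric) characteristic; gamma > 1
    allocates finer resolution to high powers (emphasizing dominant
    components) and compresses low powers (suppressing them).
    """
    # Normalize power values to the unit interval.
    p = (power_db - power_db.min()) / (np.ptp(power_db) + 1e-12)
    # Nonlinear transfer characteristic (power-law as one possible shape).
    p_nl = p ** gamma
    # Uniform quantization of the transformed values into brightness levels.
    return np.round(p_nl * (levels - 1)).astype(np.uint8)

With gamma = 1, the same routine reproduces the linear characteristic, so the symmetry of the brightness mapping can be varied through a single parameter.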
For experimental analysis, signals were selected to show differences between spectrograms to the greatest extent possible.
When a spectrogram is visualized, its entropy changes in principle, because both the brightness resolution and the spatial resolution play a role in this regard. In these experiments, however, the spatial resolution was not changed.
The entropy of the visualized spectrogram can therefore be smaller than the entropy of its original, owing to the quantization of power in the converter using a finite set of quantization levels. The number of quantization levels corresponds to the brightness resolution used.
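As a reference for how such entropies can be computed, the following sketch evaluates the Shannon entropy of a quantized spectrogram image; the helper name image_entropy and the requantization loop are illustrative assumptions.

import numpy as np

def image_entropy(img):
    """Shannon entropy (in bits) of the brightness-level distribution."""
    _, counts = np.unique(img, return_counts=True)
    p = counts / counts.sum()          # relative frequency of each level
    return float(-np.sum(p * np.log2(p)))

# Entropy of an 8-bit spectrogram image `img` after requantization to
# progressively fewer brightness levels, as in the experiments:
# for levels in (256, 128, 64, 32, 16, 8, 4, 2):
#     q = np.floor(img.astype(float) / 256 * levels)
#     print(levels, image_entropy(q))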
For each type of signal spectrogram, the entropy was calculated for different numbers of colors (see
Table 1,
Table 2 and
Table 3).
The highest information content is found in spectrograms with 256 linear brightness levels, and the minimum entropy occurs when the number of levels is 2. This logically confirms the theoretical findings by experiment: as the number of brightness levels decreases, the entropy decreases as well (see
Table 1).
A similar conclusion can be drawn for nonlinear brightness levels as for linear ones, but the decrease in entropy is more moderate (see
Table 2). The information content for nonlinear and piecewise-linear brightness levels can be quasi-constant. This part of the experiments suggests that further research in this area is needed (see
Table 3).
In general, it can be stated that for the spectrogram of a deterministic signal, the entropy decreases with a decreasing number of colors. The highest value is at 256 colors, which indicates the most significant amount of information or resolution in spectrograms.
For the spectrogram of a stochastic signal, the entropy is relatively stable between 64 and 8 colors, which indicates a uniform distribution of information. At the extremes (256 and 2 colors), the entropy decreases.
In the spectrogram of a pulsed signal, there is a sharp decrease in entropy when the number of colors is reduced. This signal is very sensitive to a decrease in resolution—the information degrades quickly.
Figure 28a–c show the dependence of entropy on the number of brightness levels for three types of signals.
In all three experiments, the deterministic signal’s entropy decreased significantly with a decreasing number of brightness levels.
The entropy of the stochastic signal remains relatively stable in the middle range of the number of brightness levels in all three experiments.
For the pulsed signal, the entropy decreases sharply, which shows a high dependence on the brightness resolution.
The following can be concluded from the experiments:
The deterministic signal has the highest entropy at the highest resolution, which is expected—more detail in the spectrum means more information.
The stochastic signal shows the most stable entropy, probably due to its uniform distribution.
The impulse signal has the lowest entropy, which decreases quickly—its information is significantly dependent on a high resolution.
Based on entropy, the spectrograms were processed through histograms. This study presents only significant results from the histograms.
The spectrogram histogram graphically expresses the frequency of occurrence of the individual brightness levels over all image elements. From the histogram, several properties of the spectrogram can be determined, e.g., contrast; the dynamic range of its brightness levels; and modality, i.e., the number of local maxima of the histogram.
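The properties listed above can be extracted with a few lines of code. The sketch below is illustrative; in particular, the five-point smoothing used when counting local maxima is an arbitrary choice.

import numpy as np

def histogram_features(img, levels=256):
    """Histogram of brightness levels plus simple derived properties."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    used = np.nonzero(hist)[0]
    dynamic_range = (int(used.min()), int(used.max()))  # occupied span
    # Modality: local maxima of a lightly smoothed histogram
    # (the moving average suppresses spurious single-bin peaks).
    h = np.convolve(hist, np.ones(5) / 5, mode="same")
    modality = int(np.sum((h[1:-1] > h[:-2]) & (h[1:-1] > h[2:])))
    return hist, dynamic_range, modality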
The initial histograms are based on 256 brightness levels for all spectrograms and all three experimental brightness levels. The histograms of the deterministic signal spectrograms are shown in
Figure 29.
The distribution is partially asymmetric, with an increased concentration of brightness levels between approximately 50 and 150 and an approximately Gaussian maximum in the middle part of the range. The long tail of infrequent brightness levels on the right indicates a wide range of values; the center is clearly recognizable (see
Figure 29a).
The very asymmetric shape indicates an extremely high concentration of dark levels (0–30), which drops sharply to the right. This means that most of the image is dark, with a few brighter points. Few bright image elements appear on the right of the histogram; higher brightness levels occur with low probability (see
Figure 29b).
The histogram is multi-peaked with two to three dominant maxima, around brightness levels 50, 100, and 130. This indicates a periodic character, where some values occur very often, others rarely. This is a histogram of a pulsed signal with repeating intensities (see
Figure 29c).
Histograms of the stochastic signal spectrograms are shown in
Figure 30.
The range of brightness levels spans 0 to 255, but the main concentration lies in the interval 100–180. The dominant maximum is at a brightness level of around 140–150, and a separate, narrow peak appears in the low-brightness region around 30. The asymmetric distribution, with a steeper rise and a slower fall, resembles a slightly skewed Gaussian distribution. The spectrogram is brighter overall but contains local dark areas (see
Figure 30a).
The largest concentration of brightness levels is between 0 and 80. The dominant maximum is around 20–30, smoothly decreasing towards higher levels. The shape is a classic exponential decay, with a very steep start and a long tail. The spectrogram is dominantly dark, with few bright spots, which is typical of spectrograms with a dark background or low contrast (see
Figure 30b).
A very narrow peak around brightness level 60 indicates a high concentration of pixels with the same value, while other brightness levels are minimally represented. This histogram corresponds to very low entropy; it reveals minimal detail in the spectrogram and only a few brightness values (see
Figure 30c).
Histograms of the impulse signal spectrograms are shown in
Figure 31.
The range of brightness levels is between 100 and 180. The distribution is an asymmetric Gaussian with a peak around 140. A single narrow peak at the low value of 30 is noteworthy. The image has a dominantly bright background and some dark parts (see
Figure 31a).
Most values are concentrated between 0 and 80, with a gradual decrease towards 255. This exponential decrease is typical of images with a dark background; the spectrogram is dominantly dark, with residual bright noise (see
Figure 31b). The final histogram represents the dominance of a single brightness level. It corresponds to very low variability, indicating a spectrogram with a dominant background and few changes, e.g., a binary one (see
Figure 31c).
The histograms share certain common features: none of them is flat, and all exhibit significant steepness. They also contain local extremes, whether local maxima or minima, which support more detailed processing and interpretation of the histograms.
The experimental evaluation clearly demonstrates that the use of nonlinear quantizers introduces intentional asymmetry into the visual representation of signal spectrograms. While linear quantization ensures a symmetric and uniform treatment of all amplitude levels, it often fails to highlight subtle variations in the signal. In contrast, nonlinear and piecewise-linear quantizers modify the underlying symmetry of the brightness mapping, allowing enhanced contrast in selected regions of the power spectrum. This symmetry manipulation enables a more informative and visually distinct presentation of industrial signal features, particularly in cases where conventional methods offer limited insight.
6.4. Comparison with Other Quantization Techniques
In industrial signal and image processing, various quantization techniques are commonly used. Beyond the nonlinear quantization approach presented in this paper, several additional techniques are worth highlighting:
Adaptive quantization dynamically adjusts quantization steps based on signal characteristics. It improves representation quality under varying signal conditions.
Dead-zone quantization introduces an extended zero region around the origin, compressing small signal components to zero. This technique enhances compression efficiency by suppressing negligible components.
Nonuniform scalar quantization (NSQ) employs optimized thresholds and levels based on signal statistics, often using iterative algorithms like Lloyd-Max to minimize quantization error.
Dithering adds a small amount of noise prior to quantization to randomize quantization error and reduce visible artifacts in low-bit-depth representations.
To validate the performance of the proposed nonlinear quantization approach, we compared it with these alternative techniques using spectrograms derived from deterministic, stochastic, and pulsed signals. Selected results are shown in
Figure 32,
Figure 33,
Figure 34 and
Figure 35.
Nonlinear and adaptive quantization are techniques used to reduce the amount of data representing a signal or image. Although both use different approaches, their goal is to optimize the signal’s representation while maintaining the highest possible quality.
Nonlinear quantization:
Nonlinear quantization works with a non-uniform distribution of quantization levels. It is designed to better suit signals whose amplitude is not distributed uniformly, typically acoustic or vibration signals, where smaller values have a higher probability of occurring. Quantization levels are densely placed near zero and sparsely at the edge values.
This quantization is often implemented using logarithmic functions, which are standard in PCM encoding. Nonlinear quantization reduces the relative quantization error for small signal values, thereby improving the subjective perception of quality, especially in audio applications.
This method’s advantage is its simplicity and independence from a particular signal—it uses a fixed transformation without the need for data preprocessing. On the other hand, its disadvantage is that it may not be optimal for all types of data, especially those without a logarithmic amplitude distribution [
37].
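For reference, a minimal sketch of the logarithmic (mu-law) companding commonly used in PCM encoding; the input is assumed to be normalized to [-1, 1], and mu = 255 is the value standard in 8-bit PCM telephony.

import numpy as np

def mu_law_quantize(x, levels=256, mu=255.0):
    """Mu-law companding followed by uniform quantization and expansion.

    Small amplitudes receive proportionally finer quantization than
    large ones, reducing their relative quantization error.
    """
    # Compress: logarithmic characteristic, symmetric about zero.
    c = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Uniformly quantize the compressed signal.
    step = 2.0 / (levels - 1)
    c_q = np.round(c / step) * step
    # Expand back to the original amplitude domain.
    return np.sign(c_q) * np.expm1(np.abs(c_q) * np.log1p(mu)) / mu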
Adaptive quantization:
Adaptive quantization, on the other hand, adapts to a specific input signal. The quantization levels are distributed according to the signal’s statistical properties, such as the histogram or local contrast. This approach allows for more efficient use of the available quantization levels and reduces the mean square error.
Adaptive quantization is often used in image and video compression, where the dynamic range of brightness or color can vary significantly in different image areas. In this case, the quantizer is updated based on local analysis, achieving higher output quality with the same number of bits.
Its main advantage is higher efficiency and quality, especially for signals with uneven distribution. The disadvantages are higher computational complexity, the need for signal analysis, and greater variability of results, which can complicate decoding.
Both techniques have their place in various applications and are often used with lossy compression algorithms [
38,
39].
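As one illustrative flavor of adaptive quantization, the sketch below derives the decision thresholds from the empirical quantiles of the image itself, so that each output level covers roughly the same number of pixels; practical codecs use more elaborate local analyses.

import numpy as np

def adaptive_quantize(img, levels=8):
    """Histogram-driven quantization: thresholds placed at empirical
    quantiles of the data, so the available levels are used evenly."""
    probs = np.linspace(0, 1, levels + 1)[1:-1]   # interior quantiles
    thresholds = np.quantile(img, probs)          # data-dependent thresholds
    return np.digitize(img, thresholds).astype(np.uint8)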
Dead-zone quantization:
Dead-zone quantization is a special type of quantization in which the central interval of quantization levels—the zone around zero—is expanded. Small values that would otherwise be quantized to one of the lowest levels are rounded to zero. This principle is very effective in situations where it is necessary to suppress noise or negligible signal components that do not significantly affect the quality of the output.
Dead-zone quantization is again often used in compression algorithms, where it is applied after transformations (e.g., DCT or wavelet transform). In these cases, the number of coefficients tends to be small and close to zero, making them ideal candidates for suppression via the dead-zone mechanism.
While nonlinear quantization optimizes quantization accuracy in areas with a high probability of occurrence, dead-zone quantization purposefully ignores small values at the expense of better compression and noise reduction. Nonlinear quantization is more oriented towards subjective quality and fidelity, while dead-zone quantization is a tool for effectively suppressing and thresholding insignificant details.
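A minimal dead-zone quantizer can be sketched as a uniform quantizer whose zero bin is deliberately widened; the step and dead-zone width below are illustrative parameters.

import numpy as np

def dead_zone_quantize(x, step=0.1, dead_zone=0.15):
    """Uniform quantizer with an enlarged zero bin.

    Values with |x| <= dead_zone are forced to zero, suppressing
    noise-like small components; all other values are quantized
    uniformly with the given step.
    """
    return np.where(np.abs(x) <= dead_zone, 0.0,
                    np.round(x / step) * step)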
A non-uniform scalar quantizer (NSQ) is a specific type of quantizer that designs quantization levels and thresholds based on a known or estimated probability distribution of the signal. Optimization algorithms, such as the Lloyd or Lloyd-Max algorithms, are often used in NSQs. These algorithms iteratively search for a distribution of quantization levels that minimizes the mean square error (MSE) between the original and quantized signals. NSQs thus represent a practical tool for creating an efficient nonlinear quantizer tailored to a specific type of signal.
NSQs allow for adaptive or learned design of quantization thresholds, thereby achieving higher accuracy, especially for signals with non-standard amplitude distributions [
40].
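The Lloyd-Max iteration can be sketched in a few lines (it is the one-dimensional analogue of k-means): thresholds are placed midway between neighboring levels, and levels are updated to the centroids of their cells, which reduces the MSE at each step. The function name and initialization are illustrative.

import numpy as np

def lloyd_max(samples, levels=8, iters=50):
    """Lloyd-Max design of a non-uniform scalar quantizer, approximately
    MSE-optimal for the empirical distribution of `samples`."""
    # Initialize reconstruction levels from quantiles of the data.
    q = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        t = (q[:-1] + q[1:]) / 2        # thresholds: midpoints of levels
        idx = np.digitize(samples, t)   # assign each sample to a cell
        for k in range(levels):         # levels: centroids of the cells
            cell = samples[idx == k]
            if cell.size:
                q[k] = cell.mean()
    return q, (q[:-1] + q[1:]) / 2      # final levels and thresholds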
Dithering:
Dithering is a technique that adds a small amount of random noise to a signal before quantization to eliminate systematic errors that can lead to unwanted visual artifacts. By adding dithering, the quantization error becomes random (noise) and thus subjectively less disturbing. This method does not change the quantization levels themselves, but improves the quality of the output, especially at low bit depths or for signals with fine details. Dithering is often used in audio and video applications to suppress the “staircase” or banding effect.
From a theoretical point of view, nonlinear quantization optimizes the distribution of quantization levels to minimize the average quantization error with respect to the probability distribution of the signal. Dithering, by contrast, does not adjust the quantization levels themselves; it improves the subjective quality of the resulting signal by changing the nature of the error from deterministic to random.
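A minimal sketch of non-subtractive dithering: uniform noise of one quantization step (peak-to-peak) is added before rounding, which decorrelates the quantization error from the signal. The parameters are illustrative.

import numpy as np

def dithered_quantize(x, step=0.1, seed=0):
    """Add uniform dither before uniform quantization, so the error
    behaves like random noise instead of signal-correlated distortion."""
    rng = np.random.default_rng(seed)
    dither = rng.uniform(-step / 2, step / 2, size=np.shape(x))
    return np.round((x + dither) / step) * step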
The use of the presented methods is evidently determined by the specific signals or images at hand. Linear and nonlinear quantizers, in contrast, are universal and can be applied to various types of signals, whether deterministic, stochastic, or impulse, as presented in this manuscript.
To verify the nonlinear quantization method, the spectrograms were processed by other available techniques. Some of the results are shown in
Figure 32. The experimental results show that both the nonlinear and adaptive quantization methods can effectively identify the dominant frequency components of the spectrogram, although the adaptive quantization spectrograms contain more values from the darker part of the brightness range.
The nonlinear quantization method was also compared with the dead-zone quantization method. Some of the results are shown in
Figure 33. The experimental results show that both methods can confirm the dominant frequency components of the spectrogram. Again, the spectrograms resulting from dead-zone quantization have a darker character. However, this does not reduce the effectiveness of the method.
The nonlinear quantization method was compared with the non-uniform scalar quantizer technique.
Figure 34 shows selected results of this comparison. The experiments show that both methods can identify the dominant frequency components in the spectrogram. The spectrograms obtained with the non-uniform scalar quantizer have an overall brighter character.
Selected results are shown in
Figure 35. The experimental results indicate that the dithering method can also reliably capture the dominant frequency components of the spectrogram, even though it adds noise to the spectrogram, which is unsuitable for stochastic signals. Spectrograms obtained by the dithering method show a higher proportion of bright values.
A metric was also used to assess the effectiveness of the methods. The PSNR metric, expressed in decibels (dB), is often used to assess the quality of processed images and signals relative to the original. A higher PSNR value usually means higher quality (i.e., lower error). It is expressed by Equation (57):

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{\mathrm{MAX}^2}{\mathrm{MSE}} \right), \quad (57)$$

where MAX is the maximum possible signal value (e.g., 255 for an 8-bit image) and MSE is the mean squared error, i.e., the average squared difference between the original and the reconstruction or quantization.
The MSE is calculated according to the following equation:

$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ I(i,j) - K(i,j) \right]^2,$$

where $I(i,j)$ is the original image or signal of size $M \times N$ and $K(i,j)$ is the processed image.
The PSNR (peak signal-to-noise ratio) value is interpreted as a measure of the quality of the reconstructed signal compared to the original. If the PSNR is higher than 40 dB, the result has excellent quality, and the difference from the original is practically indistinguishable. Values in the 30 to 40 dB range represent good quality, with minimal loss of detail. If the PSNR drops to 20 to 30 dB, there is a noticeable loss of quality. Values below 20 dB indicate poor quality and significant distortion of the reconstructed signal.
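A direct implementation of Equation (57) and the MSE definition above, assuming 8-bit spectrogram images:

import numpy as np

def psnr(original, processed, max_val=255.0):
    """Peak signal-to-noise ratio in dB, per Equation (57)."""
    diff = original.astype(np.float64) - processed.astype(np.float64)
    mse = np.mean(diff ** 2)            # mean squared error
    if mse == 0:
        return float("inf")             # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)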
Table 4 shows the PSNR values for three types of signals processed by five different quantization techniques. The PSNR values were evaluated to measure the quality of the spectrogram reconstruction after quantization.
The results of nonlinear quantization indicate that it preserves frequency components well for deterministic and stochastic signals. However, the quality of reconstruction decreases for impulse signals, where the PSNR falls below the 20 dB threshold.
Adaptive quantization achieves significantly lower PSNR values in all cases, which indicates a low objective quality of reconstruction. The results may be affected by the specific setup or type of signal.
The dead-zone quantization technique exhibits consistently high PSNR values and achieves high-quality reconstruction of signals with a wide dynamic range. It is especially suitable for cases with dominant low-frequency components.
According to the experiments in this manuscript, the non-uniform scalar quantizer is the most efficient method in terms of the PSNR. The results confirm that a nonlinear distribution of quantization levels is advantageous for preserving the quality of various types of signals.
Dithering has lower PSNR values, but its benefit lies more in the subjective reduction of visual artifacts than in an objective quality metric. It is essential to evaluate it also by perceptual criteria.
Each method has advantages depending on the type of processed signal and the goal of the analysis. Therefore, the choice of quantization technique should reflect specific requirements for visual impression, practical use, or compression.
7. Discussion
The visualization of spectrograms using various types of quantizers reveals important insights into the capabilities and limitations of both traditional and advanced signal processing methods. The experiments demonstrated that linear quantization with uniformly distributed brightness levels enables easy identification of dominant frequencies in deterministic signals. However, for signals with lower amplitudes or in the presence of noise (e.g., stochastic signals), this approach may result in a loss of relevant information.
Nonlinear quantization, particularly in the form of piecewise-linear transfer characteristics, proved effective in enhancing specific frequency regions that are crucial for technical diagnostics and process monitoring. These methods were especially valuable in the analysis of impulse signals, where conventional quantization failed to distinguish short-term high-power components sufficiently. It was also confirmed that an appropriately designed nonlinear characteristic can suppress irrelevant portions of the signal and enhance the contrast of the spectrogram image.
From an information theory perspective, spectrograms generated using nonlinear quantizers demonstrated higher information density in regions of interest, as quantified by entropy. This enables more accurate process evaluation. However, it must be acknowledged that increasing sensitivity in one part of the spectrum necessarily reduces it elsewhere—therefore, the choice of quantization strategy should depend on the nature of the analyzed signal and the specific objectives of the analysis.
The presented results highlight the importance of considering not only the mathematical and technical aspects of quantization but also its impact on the interpretation of output data in image form. Spectrograms are increasingly being viewed not just as signal representations but as image objects that can be processed using computer vision techniques. This perspective opens new opportunities for applications in industrial diagnostics, predictive maintenance, and automated decision making.
Standard spectrograms in their original form often suffer from limited dynamic range, leading to the suppression of weaker signal components that are crucial for early fault detection.
As demonstrated by the analyzed images, multi-level processing enables selective highlighting of individual frequency bands and improves the contrast between fundamental and harmonic signal components. This facilitates more straightforward visual interpretation and faster identification of essential spectrogram changes that might be overlooked.
The practical implications of this visualization include the following:
Early fault detection: Enhanced display of weak signals allows for capturing initial damage stages, such as microcracks, rotating part imbalances, or early signs of bearing defects.
Reduced operator expertise requirements: Improved contrast and more transparent spectrogram structure lower the subjective complexity of analysis, minimizing reliance on highly experienced diagnosticians.
Faster decision making: Enhanced visualization supports quicker identification of potential issues without the need for deep numerical analysis of each spectrum.
Automation: Better-quality input spectrograms create the foundation for more effective deployment of automated diagnostic systems based on machine learning or computer vision.
The results show that appropriate visualization adjustments can reveal structures that are either barely visible or completely hidden in noise on conventional spectrogram displays. This is particularly important in applications where fault-induced signal changes have a low amplitude relative to the main operational load.
In conclusion, the proposed multi-level approach to spectrogram visualization directly enhances the reliability of technical diagnostics. It contributes to extending the lifespan of machinery and equipment through earlier and more accurate damage detection.
An important aspect revealed by this study is the implicit role of symmetry in the design and interpretation of spectrograms. Linear quantizers inherently assume a symmetric treatment of the signal’s dynamic range, allocating equal resolution across all power levels. This results in a uniform representation that may overlook subtle but diagnostically important variations. In contrast, nonlinear quantizers purposefully disrupt this symmetry, prioritizing certain frequency or power regions by allocating finer resolution where needed. This selective asymmetry allows for targeted enhancement of spectral features that correlate with faults, transitions, or irregularities in the process. Therefore, the act of symmetry manipulation—either preserving or breaking it—becomes a key tool in tailoring the visual output to the specific needs of technical diagnostics and process monitoring.
8. Conclusions
In the context of the journal’s focus, it is worth emphasizing that the entire methodology is grounded in the concept of symmetry and its manipulation. Linear quantizers preserve the natural symmetry of a signal’s amplitude or frequency distribution, whereas nonlinear quantizers introduce purposeful asymmetry to enhance interpretability. This manipulation of symmetry—breaking it in controlled ways—proves beneficial for highlighting signal features that would otherwise remain hidden. Hence, this work offers not only a signal processing technique but also a perspective on how exploiting and transforming symmetry can lead to more effective diagnostic tools.
This paper focused on the visualization of technical signal spectrograms using nonlinear quantizers. The foundation of the study was the design of quantization algorithms that allow adjustment of the sensitivity of power spectrum visualization through modifications of the quantizer’s transfer function. A comparison between linear and nonlinear approaches showed that nonlinear quantization enables more pronounced visualization of dominant frequency components, facilitating more effective analysis even in signals with low amplitude or high spectral noise.
The experimental section included analyses of spectrograms derived from deterministic, stochastic, and impulse-type signals. The results confirmed the hypothesis that a well-designed nonlinear transfer function can enhance the information content of a spectrogram while suppressing irrelevant signal components. Special attention was also given to evaluating spectrogram entropy as a quantitative metric of visual information density.
The findings demonstrated that nonlinear approaches—including piecewise-linear characteristics—allow selective enhancement of specific regions of the power spectrum. This could have a significant impact in industrial applications such as machine diagnostics, process monitoring, and fault detection.
While the proposed approach demonstrates clear benefits in spectrogram-based visualization of industrial signals, several limitations should be acknowledged. First, the design of nonlinear quantization profiles currently relies on manual selection of parameters such as the slope and resolution function, which may require expert knowledge and tuning for different types of signals. Furthermore, the evaluation of spectrogram quality is primarily based on entropy measures and visual inspection, lacking a universally accepted quantitative metric for interpretability or diagnostic accuracy.
Future work will focus on the automated optimization of quantizer parameters using data-driven techniques, including heuristic algorithms and machine learning. Additionally, the integration of this visualization framework with real-time monitoring systems and intelligent diagnostic platforms will be explored. Benchmarking against other existing visualization or feature extraction methods will also be conducted to assess comparative performance in various industrial scenarios. These extensions will support broader adoption of the method in practical applications.
The proposed spectrogram visualization methodology, using nonlinear and adaptive quantization, has potential societal and managerial benefits. From a managerial perspective, enhanced interpretability of signal features allows maintenance engineers and process supervisors to detect anomalies earlier and with greater confidence. This can lead to improved predictive maintenance strategies, reduced unplanned downtime, and overall cost savings in industrial operations.
On the societal level, more effective monitoring of technical systems contributes to safer and more reliable industrial infrastructure. In sectors such as energy, manufacturing, and transportation, early detection of faults or instabilities not only prevents equipment damage but also minimizes environmental risks and enhances worker safety. Thus, improved signal visualization supports both operational efficiency and broader goals of sustainable and responsible industry.
In conclusion, the proposed spectrogram visualization method represents a valuable contribution to the processing of technical signals, particularly in terms of interpreting them as image-based data. The suggested quantization strategies open new possibilities for intelligent evaluation of dynamic processes.