A New Spike Membership Function for the Recognition and Processing of Spatiotemporal Spike Patterns: Syllable-Based Speech Recognition Application

Ramírez-Mendoza, Abigail María Elena; Yu, Wen; Li, Xiaoou

doi:10.3390/math11112525


by Abigail María Elena Ramírez-Mendoza ¹, Wen Yu ²,* and Xiaoou Li ³

¹ Department of Autonomous Air and Underwater Navigation Systems, French Mexican Laboratory of Informatics and Automatic Control, Mixed International Unit (LAFMIA UMI), Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-IPN), National Council of Science and Technology (CONACYT), Av. Instituto Politécnico Nacional 2508, San Pedro Zacatenco, Mexico City CP 07360, Mexico

² Department of Automatic Control, CINVESTAV-IPN, Av. Instituto Politécnico Nacional 2508, San Pedro Zacatenco, Mexico City CP 07360, Mexico

³ Computer Department, CINVESTAV-IPN, Av. Instituto Politécnico Nacional 2508, San Pedro Zacatenco, Mexico City CP 07360, Mexico

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(11), 2525; https://doi.org/10.3390/math11112525

Submission received: 6 April 2023 / Revised: 9 May 2023 / Accepted: 11 May 2023 / Published: 31 May 2023

(This article belongs to the Special Issue Mathematical and Computational Neuroscience)


Abstract

This paper introduces a new spike activation function (SPKAF) or spike membership function for fuzzy adaptive neurons (FAN), developed for decoding spatiotemporal information with spikes, optimizing digital signal processing. A solution with the adaptive network-based fuzzy inference system (ANFIS) method is proposed and compared with that of the FAN-SPKAF model, obtaining very precise simulation results. Stability analysis of systems models is presented. An application to voice recognition using solfeggio syllables in Spanish is performed experimentally, comparing the methods of FAN-step activation function (STEPAF)-SPKAF, Augmented Spiking Neuron Model, and Augmented FAN-STEPAF-SPKAF, achieving very good results.

Keywords: fuzzy inference systems; fuzzy system models; neuro-fuzzy systems; learning; fuzzy adaptive neuron; membership function; activation function; spiking neuron; decoding; pattern recognition; syllable-based speech recognition

MSC: 37M10

1. Introduction

Cortical neurons communicate through spikes, which encode and decode the information transmitted for tasks such as speech and image recognition. Processing neural information from real-world stimuli and classifying spike patterns is one of the main objectives in the area of cell physiology. Recognizing spike pattern sequences is therefore central to the encoding and decoding of neural information; in this context, a fuzzy system is proposed for decoding or recognizing spike patterns that, upon exceeding a threshold, activate a neuronal response.

Biologically inspired neuron models have been proposed, including those by McCulloch and Pitts, Hodgkin and Huxley [1,2], Gerstner and Kistler, Izhikevich, and Ramírez-Mendoza [3,4,5]. The theory of fuzzy logic introduced by Zadeh [6] has been very successful for decades, and models of fuzzy neurons, such as that from Gupta [7,8,9], have emerged, some of them with a spike response, such as those by Ramírez-Mendoza et al. and Zhang [10,11,12].

Related Works

Some of the main characteristics of the human brain are learning and memory and recognizing voices, images, and objects. Classic and fuzzy neuron models have been configured in neural networks or systems for different applications, based on spiking neurons, some with axonal delays, which are mentioned below.

In [13], the authors present the firing neurons to extract perceptual information from a time series of images with firing patterns of the spiking neurons based on competitive learning, a method for the control of visual perception by predicting an associated robot. Spike patterns generated in a time domain based on temporal rules, spatiotemporal patterns, an advanced computational model, and spiking neurons for single spike encoded pattern recognition for neural information processing and learning applied to digit recognition are featured in [14].

Tapson [15] describes a neural network synthesis method to generate synaptic connectivity and process time-coded neural signals, allowing the user to define axonal and dendritic delays, as well as synaptic transfer functions, to compute dendritic weights. They used leaky integrator, nonlinear leaky integrator, alpha functions, resonant dendrites, and fixed axonal or dendritic delays for voice digit recognition. A multineural spike pattern detection configuration is presented in [16]. The structure is capable of autonomously implementing online learning and the recognition of parallel spike sequences, mapping a spatiotemporal stimulus in a multidimensional, temporal, and feature space. The scheme uses a leaky integrate-and-fire (LIF) with latency neuron model and a self-tuning mechanism with a time-amplitude conversion operated by the spike latency feature and is applied to experimental data obtained from a motor-inhibitory cognitive task as a method of classification.

Spiking Neural Networks (SNN) are designed for the recognition of hand gesture patterns using a surface electromyography signal [17]. In [18], the authors propose a curiosity-based learning method for SNN based on the LIF neuron model and an estimation of samples during the learning process to deal with spatiotemporal information and the time-consuming computation for the data set of hand-written digit recognition. A method with deep convolutional SNN training with input spike events and spike-based backpropagation learning is proposed in [19]. The SNN components are LIF neurons, which generate bipolar spikes [–1,1], for image recognition tasks. Verma [20] developed an algorithm for the future generation of image frames with ANFIS, a neural network based on the Takagi–Sugeno fuzzy inference system, on space–time framework, obtaining successful results.

Temporal spike learning is present in the auditory, visual, and motor control information processing of the brain. It is applied mainly in neuroprosthetics, along with applications that require fast, real-time recognition and control of sequences of related processes. Temporal coding of information in SNN uses the exact time of spikes: every spike and its timing are important [21]. Learning rules have been applied to train neurons to associate spatiotemporal spike patterns, that is, to perform spike pattern recognition (for example, optical character recognition), together with coding methods that obtain spike patterns from images. Spiking neurons, like biological neurons, process spikes or action potentials to transmit information and to support cognitive abilities such as recognition in neural systems. The representation of information from external stimuli by spikes is called neural coding. There is evidence that neural systems transmit information by the timing of the spikes, producing a temporal code. A complex external stimulus can cause different sensory neurons to fire at different times, creating a spatiotemporal representation of the stimulus. To process spatiotemporal spike patterns, learning algorithms have been developed: the tempotron rule, the perceptron rule, and the precise-spike-driven rule [22].

SNNs process the input information and produce fast and accurate recognition of a particular pattern. Two coding schemes can be distinguished: rate coding and temporal coding. In rate coding, the firing rate increases with the intensity of the stimulus. Temporal encoding in spike trains carries pattern information faster than rate encoding. In [23], an iris data set is classified effectively and efficiently using temporal encoding learning. A sparse temporal coding method for processing spatial and temporal information is developed in [24]. A spike-timing-dependent back-propagation-based SNN classifier achieves high efficiency of parallel computing for feature extraction. Image classification and speech recognition applications show that the SNN model yields state-of-the-art accuracy.

In [25], an Augmented Spiking Neuron Model and synaptic learning rules are proposed, with spike coefficients in addition to spike latencies to process and learn patterns of augmented spikes. The temporal-based approach effectively improves the accuracy of visual and acoustic pattern recognition tasks. Activation functions or membership functions of different forms, such as sigmoid, Gaussian, parabola, and sigmoid–Gaussian, among others, have been developed by Zhao and Bose, Basterretxea et al., Xie et al., and Ramirez-Mendoza et al. [26,27,28,29,30,31,32].

A new spike activation function (SPKAF) for fuzzy neurons, with dendritic inputs on the unipolar interval [0,1] and a bipolar spike response on [–1,1], is proposed here. This article proposes a solution for a fuzzy neural system that decodes a spatiotemporal spike pattern sequence arriving at a neural system and leading to the conduction of other neurons. If the encoded information corresponds to a given reference spatiotemporal spike pattern, then the threshold of the neuron is exceeded, and a spike is generated. SPKAF is applied to the system to generate spikes and decode the information.

Solution methods based on an adaptive network-based fuzzy inference system (ANFIS) and on the fuzzy adaptive neuron (FAN)-SPKAF are proposed for the pattern recognition problem. The ANFIS method consists of fuzzy inference rules and the FAN learning algorithm [33] to train the parameters or weights. Instead of sums at the last layers of the system, ANFIS uses a fuzzy logical operator, MAX. The FAN-based method includes the fuzzy logical operator MIN.

The main contribution of this work is the new SPKAF to perform pattern recognition and spike-time-encoded information. The FAN-SPKAF solution method comprises a FAN, axonal delays of previous neurons for the input channels of spike pattern sequences to be decoded, fuzzy logical operators, and the SPKAF. Among the advantages of the FAN-SPKAF are the processing of information with digital pulse trains, prior to the generation of analog spikes, and that the information is precisely encoded because the duration and shape of each analog spike generated based on the digital pulse trains is constant.

The problem of voice recognition using solfeggio syllables in Spanish is solved by comparing the methods of FAN-activation function step-type (STEPAF)-SPKAF, the Augmented Spiking Neuron Model, and Augmented FAN-STEPAF-SPKAF. This article is divided into the following sections: Section 2 presents the methods, a description of the FAN model with a Gupta Integrator [7,8,9], and the new SPKAF model for fuzzy neurons from Ramírez-Mendoza et al. [5,10,11,29,32,33,34,35]. Section 3 describes the models and algorithms of fuzzy systems based on ANFIS and FAN-SPKAF for the processing of spike-time-encoded information and pattern recognition. The FAN-STEPAF-SPKAF model is applied to voice recognition based on solfeggio syllables in Spanish and compared with the Augmented Spiking Neuron Model and Augmented FAN-STEPAF-SPKAF methods.

Section 4 develops the model analysis for systems of Section 3.1, Section 3.2 and Section 3.3, with nonlinear systems fuzzy modeling, fuzzy system training, and stability analysis. Section 5 presents the simulation results of the comparison of methods for the fuzzy systems models applied to the recognition of spatiotemporal spike patterns and voice recognition using solfeggio syllables in Spanish. Section 6 describes the conclusions and discussion.

2. Methods

2.1. Fuzzy Adaptive Neurons

A sigmoid activation function (SAF) was proposed for unipolar FAN, and another for bipolar FAN, in [5,10,11,33]. This article presents a new SPKAF, also for fuzzy neurons. The unipolar FAN model with SAF is briefly presented below.

For unipolar inputs on the interval [0,1], with the time variable $k$, the synaptic operation of inputs $z_{\mathrm{in}\,j}(k)$ and adaptive weights $w_{\mathrm{in}\,j}(k)$ is presented in (1), the aggregation operation in (2), and the nonlinear operation with threshold in (3) and (4). The somatic operations of the FAN and the error $e(k)$ are described in (1)–(5):

\tilde{V}_{\min j}(k) = \min(z_{\mathrm{in}\,j}(k), w_{\mathrm{in}\,j}(k))    (1)

\tilde{V}_{\max}(k) = \mathrm{MAX}_{j=1}^{N} \tilde{V}_{\min j}(k)    (2)

\tilde{V}_{\mathrm{out}}(k) = \max(\tilde{V}_{\max}(k), V_{\mathrm{threshold}}(k))    (3)

\tilde{y}_{\mathrm{SAF}}(k) = \frac{1}{1 + e^{(-\min(\gamma, \tilde{V}_{\mathrm{out}}(k)) \cdot a + b)}}    (4)

e(k) = \tilde{y}_{\mathrm{SAF\,ref}}(k) - \tilde{y}_{\mathrm{SAF}}(k)    (5)
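As an illustration, the forward pass in (1)–(5) can be sketched in Python. This is a minimal sketch rather than the authors' implementation: the function names `fan_saf` and `fan_error` and every numeric value below (inputs, weights, threshold, $a$, $b$, $\gamma$) are assumptions chosen only to exercise the equations.

```python
import math

def fan_saf(z, w, v_threshold, a, b, gamma):
    """Forward pass of a unipolar FAN with sigmoid activation, Eqs. (1)-(4)."""
    v_min = [min(z_j, w_j) for z_j, w_j in zip(z, w)]  # (1) synaptic MIN
    v_max = max(v_min)                                 # (2) aggregation MAX
    v_out = max(v_max, v_threshold)                    # (3) threshold
    v_sat = min(gamma, v_out)                          # saturation at gamma
    return 1.0 / (1.0 + math.exp(-v_sat * a + b))      # (4) sigmoid

def fan_error(y_ref, y):
    """Error between the reference and the actual FAN output, Eq. (5)."""
    return y_ref - y

# Hypothetical inputs, weights, and parameters (illustration only).
y = fan_saf(z=[0.2, 0.9, 0.5], w=[1.0, 0.7, 1.0],
            v_threshold=0.3, a=10.0, b=5.0, gamma=1.0)
e = fan_error(1.0, y)
```

With these values the MIN stage gives 0.2, 0.7, and 0.5, the MAX stage selects 0.7, and the sigmoid is evaluated at an exponent of −2.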

2.2. Spike Activation Function Model

Different membership functions and approximations of activation functions have been proposed for various applications, such as seismic modeling of accelerograms and control, among others [26,27,28,29,30,31,32]. A spike membership function for unipolar inputs $z_{\mathrm{in}\,j}$, shown in Figure 1, is developed below.

The spike membership function is as follows:

\tilde{V}_{\mathrm{out}\,\gamma}(k) = \min(\gamma, \tilde{V}_{\mathrm{out}}(k))    (6)

\tilde{y}_{\mathrm{SPKAF}}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{3}}    (7)
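A direct transcription of (6) and (7) into Python is sketched below. The function name `spkaf` and the parameter values used in the example calls are assumptions, not values taken from the article. The denominator of (7) can be read as $1 + e^{b}$ times a cubic polynomial in $a \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)$, so the sketch evaluates it in that factored form; it does not guard against parameter choices that drive the denominator to zero.

```python
import math

def spkaf(v_out, a, b, c1, c2, gamma=1.0):
    """Spike membership function, Eqs. (6)-(7)."""
    v = min(gamma, v_out)                        # (6) saturate at gamma
    u = a * v
    poly = 1.0 - u + u ** 2 - c2 * u ** 3        # cubic polynomial in a*v
    return c1 / (1.0 + math.exp(b) * poly)       # (7)

# Hypothetical parameter values (illustration only).
y_rest = spkaf(0.0, a=2.0, b=1.0, c1=1.0, c2=0.5)   # equals c1 / (1 + e^b)
y_full = spkaf(1.0, a=1.0, b=0.0, c1=1.0, c2=0.0)   # denominator 1 + (1 - 1 + 1) = 2
```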

3. Models for the Processing of Spike-Time-Encoded Information and Pattern Recognition

Models and theoretical analysis of fuzzy neural systems based on ANFIS and FAN-SPKAF methods for pattern recognition of spatiotemporal spike sequences are developed below. The spike pattern input reference and the input spike sequences are generated with FAN-SPKAF.

Memory is implemented by identifying axonal delays of the reference input spike sequences coming from other neurons, such that the maximum delay is that of the input spike with the maximum axonal delay and can be compared at the same time with the other input spikes of the other input channels.

One spike from each channel is presented to the system at a time. This process is repeated as many times as the largest number of successive spikes in any one channel, and the fuzzy system based on ANFIS, or on the FAN-SPKAF and the MIN operator, generates a spike only if the input matches the reference input spike sequence.

3.1. ANFIS Model

The fuzzy system consists of a stage with axonal delays for the reference spike sequences from the previous neurons and an ANFIS with SAF unipolar membership functions. A MAX operator is used instead of sums, and a gain is added (see Figure 2).

3.1.1. Algorithm

Given the reference spike pattern sequences of the input channels, the maximum spike delay $\Delta k_{\mathrm{ref\,input\,spike\,max}}$ of those spike sequences is determined. The fuzzy neural system cannot generate an output spike until after that maximum delay time, that is, $y_{\mathrm{ANFIS\,spike}}(k < \Delta k_{\mathrm{ref\,input\,spike\,max}}) = 0$.

All spikes of the input sequences to the neural system are processed at the instant $k + \Delta k_{\mathrm{ref\,input\,max}}$; therefore, an axonal delay is given to each of the spikes of the input sequences of the input channels, $\Delta k_{ij\,\mathrm{axondelay}} = \Delta k_{\mathrm{ref\,input\,max}} - \Delta k_{ij\,\mathrm{input\,spike}}$, where $i = 1, 2, \dots, m$; $m$ is the number of channels; $j = 1, 2, \dots, n$; and $n$ is the number of spikes in the reference input sequence of each channel.
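The delay assignment described above can be sketched as follows. The function name `axon_delays` and the nested-list layout of the reference spike times are assumptions made for illustration; the example uses one reference spike per channel, in hypothetical time steps.

```python
def axon_delays(ref_spike_delays):
    """Return the maximum reference delay and the axonal delay per spike.

    ref_spike_delays[i][j] is the delay of spike j in channel i.
    Each axonal delay is the maximum reference delay minus the spike's own
    delay, so that every spike reaches the system at the same instant.
    """
    dk_max = max(d for channel in ref_spike_delays for d in channel)
    delays = [[dk_max - d for d in channel] for channel in ref_spike_delays]
    return dk_max, delays

# Hypothetical reference pattern: three channels, one spike each.
dk_max, delays = axon_delays([[3], [7], [5]])
# The latest spike (channel 2) gets no extra delay; the others wait for it.
```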

3.1.2. Adaptive Network Based Fuzzy Inference System (ANFIS)

ANFIS is based on the Takagi–Sugeno fuzzy inference system [21]. In this inference system, fuzzy IF–THEN rules use a learning algorithm. There are five input channels with spike sequences, $x_{1}, x_{2}, x_{3}, x_{4}, x_{5}$, in the fuzzy inference system that describes a Sugeno fuzzy model with the following inference rules, $\forall k \geq \Delta k_{\mathrm{ref\,input\,max}}$:

Rule 1: If $x_{1}(k - \Delta k_{1j\,\mathrm{axondelay}})$ is $A_{1}$, then $y_{1}(k) = \mathrm{MAX}(p_{1} \cdot x_{1}(k - \Delta k_{1j\,\mathrm{axondelay}}),\ q_{1})$

Rule 2: If $x_{2}(k - \Delta k_{2j\,\mathrm{axondelay}})$ is $A_{2}$, then $y_{2}(k) = \mathrm{MAX}(p_{2} \cdot x_{2}(k - \Delta k_{2j\,\mathrm{axondelay}}),\ q_{2})$

Rule 3: If $x_{2}(k - \Delta k_{2j\,\mathrm{axondelay}})$ is $A_{3}$ and $x_{3}(k - \Delta k_{3j\,\mathrm{axondelay}})$ is $A_{4}$ and $x_{4}(k - \Delta k_{4j\,\mathrm{axondelay}})$ is $A_{5}$, then $y_{3}(k) = \mathrm{MAX}(p_{3} \cdot x_{2}(k - \Delta k_{2j\,\mathrm{axondelay}}),\ q_{3} \cdot x_{3}(k - \Delta k_{3j\,\mathrm{axondelay}}),\ r_{3} \cdot x_{4}(k - \Delta k_{4j\,\mathrm{axondelay}}),\ s_{3})$

Rule 4: If $x_{3}(k - \Delta k_{3j\,\mathrm{axondelay}})$ is $A_{6}$ and $x_{5}(k - \Delta k_{5j\,\mathrm{axondelay}})$ is $A_{7}$, then $y_{4}(k) = \mathrm{MAX}(p_{4} \cdot x_{3}(k - \Delta k_{3j\,\mathrm{axondelay}}),\ q_{4} \cdot x_{5}(k - \Delta k_{5j\,\mathrm{axondelay}}),\ r_{4})$

Rule 5: If $x_{3}(k - \Delta k_{3j\,\mathrm{axondelay}})$ is $A_{8}$ and $x_{5}(k - \Delta k_{5j\,\mathrm{axondelay}})$ is $A_{9}$, then $y_{5}(k) = \mathrm{MAX}(p_{5} \cdot x_{3}(k - \Delta k_{3j\,\mathrm{axondelay}}),\ q_{5} \cdot x_{5}(k - \Delta k_{5j\,\mathrm{axondelay}}),\ r_{5})$

Layer 1: Every node could be adaptive with a node membership function, from (4):

O_{l}^{1} = \mu_{A_{l}}(x_{i})    (8)

\mu_{A_{l}}(x_{i}) = \mu_{\mathrm{SAF}A_{l}}(x_{i})    (9)

\mu_{\mathrm{SAF}A_{l}}(x_{i}) = \frac{1}{1 + e^{(-\min(\gamma, \max(\mathrm{MAX}_{j=1}^{N} \min(x_{i}(k - \Delta k_{ij\,\mathrm{axondelay}}), w_{i}(k)), V_{\mathrm{threshold}}(k))) \cdot a + b)}}    (10)

where $w_{i}(k) = 1$; $l = 1, 2, \dots, 9$; $i = 1, 2, \dots, 5$.

Layer 2: Each node is fixed, and outputs are the product of all incoming inputs:

O_{i}^{2} = \omega_{i} = \mu_{A_{l}}(x_{i}) \cdot \mu_{A_{l+m}}(x_{i}) \cdot \dots \cdot \mu_{A_{l+m}}(x_{i})    (11)

where $m = 1, 2, \dots, 8$; $l = 1, 2, \dots, 9$; $i = 1, 2, \dots, 5$.

Layer 3: Each node is fixed, and outputs are the ratio:

O_{i}^{3} = \bar{\omega}_{i} = \frac{\omega_{i}}{\omega_{1} + \omega_{2} + \dots + \omega_{5}}    (12)

where $i = 1, 2, \dots, 5$.

Layer 4: Each node is adaptive with a node function:

O_{i}^{4} = \bar{\omega}_{i} \cdot y_{i}(k) = \bar{\omega}_{i} \cdot \mathrm{MAX}(p_{i} \cdot x_{i},\ q_{i} \cdot x_{i},\ r_{i} \cdot x_{i},\ s_{i})    (13)

where $i = 1, 2, \dots, 5$.

Layer 5: Each node is adaptive with a node function,

O_{\mathrm{spike}}^{5} = \text{overall output} = \sum \bar{\omega}_{i} \cdot y_{i}(k) = \frac{\sum \omega_{i} \cdot y_{i}(k)}{\sum \omega_{i}}    (14)

y_{\mathrm{spike\,ANFIS}}(k) = O_{\mathrm{spike}}^{5}    (15)

Adjusting the amplitude with a gain,

y_{\mathrm{spike\,ANFIS}}(k) = O_{\mathrm{spike}}^{5} \cdot g_{1}    (16)

where $i = 1, 2, \dots, 5$; $\bar{\omega}_{i}$ is a normalized firing strength and $p_{i}, q_{i}, r_{i}, s_{i}$ are adaptive parameters.
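The last three layers, together with the MAX-type rule consequents, can be sketched in Python as below. The sketch assumes that the firing strengths $\omega_{i}$ from Layer 2 are already available; the function names, the zero-output guard for an all-zero firing vector, and the numeric values are assumptions and are not taken from the article.

```python
def rule_consequent(params, inputs):
    """MAX-type consequent: MAX(coefficient * input, ..., constant).

    params holds one coefficient per input followed by the constant term,
    e.g. (p, q) for Rule 1 or (p, q, r, s) for Rule 3.
    """
    *coeffs, const = params
    return max([c * x for c, x in zip(coeffs, inputs)] + [const])

def anfis_output(omega, y_rules, gain=1.0):
    """Layers 3-5 and the output gain, Eqs. (12)-(16)."""
    total = sum(omega)
    if total == 0.0:                 # assumption: no rule fires, no spike
        return 0.0
    omega_bar = [w / total for w in omega]                   # (12)
    layer4 = [wb * y for wb, y in zip(omega_bar, y_rules)]   # (13)
    return gain * sum(layer4)                                # (14)-(16)

# Hypothetical values (illustration only).
y1 = rule_consequent((2.0, 0.5), (0.4,))          # MAX(2.0 * 0.4, 0.5)
out = anfis_output([1.0, 3.0], [2.0, 4.0], gain=2.0)
```

In the example, the normalized strengths are 0.25 and 0.75, so the weighted output is 3.5 before the gain and 7.0 after it.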

3.2. FAN-SPKAF Model

The model of a fuzzy system is presented with the new SPKAF developed in Section 2 for the recognition of spike-time-encoded sequence patterns. If the reference input spike pattern exceeds the SPKAF threshold, then the FAN output is activated, generating a spike.

The fuzzy system consists of a stage with axonal delays for the reference spike sequences from the previous neurons, a fuzzy logic operator MIN, a comparator, and a FAN-SPKAF with threshold; a gain and an adjustment constant are also added. If the dendrite input exceeds the threshold, the FAN-SPKAF activates with the input spike pattern, thereby generating an output spike, the FAN-SPKAF response, as shown in Figure 3.

After a spike is obtained from the fuzzy operator MIN, it is fuzzified or scaled from the bipolar interval [–1,1] to the unipolar interval [0,1], with a comparator taking the positive part of the spike. A unipolar ramp is generated, and then it is limited to the unipolar interval by means of a trimmer, thereby obtaining the input to the dendritic synapse of the fuzzy neuron. If the potential of the soma exceeds the threshold, the SPKAF of the FAN generates a spike, thus performing the recognition of spatiotemporal spike patterns.

3.2.1. Algorithm

Given the reference spike pattern sequences of the input channels, the maximum spike delay $\Delta k_{\mathrm{ref\,input\,spike\,max}}$ of those spike sequences is determined. The fuzzy neural system cannot generate an output spike until after the maximum delay time, that is, $y_{\mathrm{ANFIS\,spike}}(k < \Delta k_{\mathrm{ref\,input\,spike\,max}}) = 0$.

All spikes of the input sequences to the neural system are processed at the instant $k + \Delta k_{\mathrm{ref\,input\,max}}$; therefore, an axonal delay is given to each of the spikes of the input sequences of the input channels, $\Delta k_{ij\,\mathrm{axondelay}} = \Delta k_{\mathrm{ref\,input\,max}} - \Delta k_{ij\,\mathrm{input\,spike}}$, where $i = 1, 2, \dots, m$; $m$ is the number of channels; $j = 1, 2, \dots, n$; and $n$ is the number of spikes in the reference input sequence of each channel.

3.2.2. Fuzzy Adaptive Neuron with Spike Activation Function (FAN-SPKAF)

The FAN uses a nonlinear somatic threshold operation, an activation function, or a membership function. The development of the spike-shaped SPKAF was presented in Section 2.2. Once axonal delays are given to the input spike sequences, a MIN operator is applied. If all the spikes of the input spike pattern sequences are present, then the MIN operator responds with one spike. Subsequently, a comparator with zero is used to obtain the positive part of the spike or zero. This result is the input to the FAN-SPKAF with fixed weights. Every time the FAN-SPKAF receives the positive part of a spike, it emits a bipolar spike.

z(k) = \mathrm{MIN}_{i=1}^{N} x_{i}(k - \Delta k_{i\,\mathrm{axondelay}})    (17)

where $i = 1, 2, \dots, 5$.

If $z(k) > 0$, then

\tilde{V}_{\min}(k) = \min(z(k), w(k))    (18)

\tilde{V}_{\max}(k) = \mathrm{MAX}_{j=1}^{N} \tilde{V}_{\min j}(k)    (19)

\tilde{V}_{\mathrm{out}}(k) = \max(\tilde{V}_{\max}(k), V_{\mathrm{threshold}}(k))    (20)

\tilde{V}_{\mathrm{out}\,\gamma}(k) = \min(\gamma, \tilde{V}_{\mathrm{out}}(k))    (21)

\tilde{y}_{\text{spike FAN-SPKAF IDENTIFIED}}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{3}}    (22)

Adjusting the amplitude with a gain,

\tilde{y}_{\text{spike FAN-SPKAF IDENTIFIED}}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{3}} \cdot g_{2}    (23)

else

\tilde{y}_{\text{spike FAN-SPKAF IDENTIFIED}}(k) = 0.
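One time step of the coincidence detection in (17)–(23) can be sketched as follows. The sketch assumes that the channel values have already been delayed so that they are aligned at instant $k$, and it treats the neuron as having a single dendritic input, so the MAX in (19) is trivial. The function names and all parameter values are hypothetical.

```python
import math

def spkaf(v, a, b, c1, c2):
    """Spike membership function evaluated at a saturated potential v, Eq. (22)."""
    eb = math.exp(b)
    denom = 1.0 + eb - a * eb * v + a**2 * eb * v**2 - a**3 * eb * c2 * v**3
    return c1 / denom

def fan_spkaf_step(delayed_inputs, w, v_threshold, gamma, a, b, c1, c2, g2):
    """One time step of the FAN-SPKAF recognizer, Eqs. (17)-(23)."""
    z = min(delayed_inputs)              # (17) MIN over aligned channels
    if z <= 0:                           # comparator: keep only the positive part
        return 0.0
    v_min = min(z, w)                    # (18)
    v_max = v_min                        # (19) single dendritic input assumed
    v_out = max(v_max, v_threshold)      # (20)
    v_sat = min(gamma, v_out)            # (21)
    return g2 * spkaf(v_sat, a, b, c1, c2)   # (22)-(23)

# Hypothetical parameters (illustration only).
params = dict(w=1.0, v_threshold=0.0, gamma=1.0,
              a=1.0, b=0.0, c1=1.0, c2=0.0, g2=2.0)
hit = fan_spkaf_step([1.0, 1.0, 1.0, 1.0, 1.0], **params)   # all channels spike
miss = fan_spkaf_step([1.0, 1.0, 0.0, 1.0, 1.0], **params)  # one channel silent
```

Because MIN is zero as soon as one channel is silent, only a complete, correctly aligned pattern produces an output spike.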

3.3. FAN-STEPAF-SPKAF Model for Syllable-Based Speech Recognition Application

For speech pattern recognition, a differentiation can be made between the recognition of individual speakers and the recognition of the vowels, consonants, and syllables of one or more speakers. Here, we present the SPKAF applied to the recognition of voice patterns to identify a syllable or musical note, using the first musical notes of a classical melody.

DO, RE, MI, FA, SOL, LA, and SI are the names of the musical notes in Spanish; in English, they are C, D, E, F, G, A, and B. A voice sample was taken from a Spanish speaker who read the first musical notes of Friedrich von Schiller/Ludwig van Beethoven’s “Ode to Joy”.

The voice has different characteristics: articulation, tone, timbre, and volume. Algorithms of the speaker-dependent and speaker-independent type based on fuzzy logic have been presented for the recognition of speech patterns [36]. The main problems are (1) fluctuation of pattern length (intonation) caused by variance of utterance speed, and (2) frequency variation caused by differences in the vocal tract (timbre). Articulation is produced by changing the vocal tract, which adds the required information to the sound wave. In this article, the speaker-dependent type is presented: the pattern recognition method samples the voice of the operator syllable by syllable. An SPKAF application developed in this article is shown.

Other methods have been applied to pattern recognition, for example, the approximate reasoning method for pattern recognition, which consists of fuzzy implications and a composition inference rule [37].

Voice qualities are different for each speaker and use temporal structures, articulation precision, vocal effort, and type of phonation. Vocal effort has to do with sound quality. For phonation, it is necessary to separate the quality of the voice from the linguistic content; the study is based on the phonemes [38].

In [39], the authors study the types of syllables in a language, vowel–consonant, consonant–vowel, long vowel–vowel, consonant–consonant, and the frequency of the syllable patterns.

Speech pattern recognition [40] has been applied in speech conversion methods for people with articulation disorders, based on spectral conversion using non-negative matrix factorization with very good results.

Different approaches to membership functions have been proposed in recent years. A new membership function for uncertain functions is developed in [41], based on cuts determined by an integral membership function, and defined by a continuous auxiliary function design, depending on the uncertainty of the function. In [29], a new approach of the sigmoid membership function proposed for FAN and fuzzy systems is developed using the Newton binomial, with exponential series to obtain its Laplace transform, with application to control systems.

The syllable-based pattern recognition information, for the application proposed here, is contained in the magnitudes or articulation of the voice, rather than in frequency or timbre. The short-time Fourier transform (STFT) is obtained with a MATLAB® command. The maximum operator is applied to the magnitudes, followed by a FAN-STEPAF [10,11,29,33], a FAN-SPKAF, and a comparator with the syllable pattern of spikes, to obtain the recognized spike speech pattern, as shown in Figure 4.

From a speech sample $x(k)$, the STFT is obtained with magnitude and phase:

\mathrm{STFT}\{x(k)\} = X[n, k] = \sum_{m=k}^{k+N-1} x[m] \cdot w[m-k] \cdot e^{-i \cdot 2 \cdot \pi \cdot n \cdot (m-k)/N}    (24)

where $n = 0, 1, \dots, N-1$ is the frequency index.

From $X[n, k]$, the absolute value of the magnitude is obtained:

X_{\mathrm{mag}}[n, k] = |X[n, k]|    (25)

Then the MAX operator is applied to each segment:

X_{\mathrm{MAX\,mag}}[k] = \mathrm{MAX}(X_{\mathrm{mag}}[n, k])    (26)

Via membership or STEPAF of the FAN [10,11,29,33], square pulses at [0,1] are obtained:

z(k) = X_{\mathrm{MAX\,mag}}[k]    (27)

\tilde{V}_{\max}(k) = \mathrm{MAX}_{j=1}^{N} \min(z_{j}(k), w_{j}(k))    (28)

\tilde{y}_{\mathrm{step}}(k) = \begin{cases} 1 & \text{if } \tilde{V}_{\max}(k) \geq V_{\mathrm{threshold}}(k) \\ 0 & \text{if } \tilde{V}_{\max}(k) < V_{\mathrm{threshold}}(k) \end{cases}    (29)

where $N = 1$, $w(k) = 1$, and $V_{\mathrm{threshold}}(k)$ is the triangular shaped threshold input.
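The chain (24)–(29) can be sketched with NumPy as below. This is an illustrative sketch under several assumptions that do not come from the article: the STFT is computed by explicit framing with `np.fft.fft` instead of the MATLAB command, a Hann window and a hop equal to the frame length are used, the peak magnitudes are normalized to [0,1] before thresholding, and a constant threshold stands in for the triangular threshold input. The test signal (silence, a 500 Hz tone, silence) is synthetic.

```python
import numpy as np

def stft_peak_magnitude(x, n_fft, hop):
    """Per-frame maximum of |X[n, k]|, Eqs. (24)-(26)."""
    window = np.hanning(n_fft)                      # assumed window w[m - k]
    peaks = []
    for k in range(0, len(x) - n_fft + 1, hop):
        frame = x[k:k + n_fft] * window             # x[m] * w[m - k]
        spectrum = np.fft.fft(frame)                # (24) for n = 0..N-1
        peaks.append(np.abs(spectrum).max())        # (25)-(26)
    return np.array(peaks)

def step_activation(z, v_threshold):
    """FAN-STEPAF with N = 1 and w = 1, Eqs. (27)-(29)."""
    v_max = np.minimum(z, 1.0)                      # (28) min(z, w) with w = 1
    return (v_max >= v_threshold).astype(float)     # (29)

# Synthetic signal: 1024 samples of silence, tone, silence (hypothetical).
fs = 8000
t = np.arange(1024) / fs
tone = np.sin(2 * np.pi * 500 * t)
x = np.concatenate([np.zeros(1024), tone, np.zeros(1024)])

peaks = stft_peak_magnitude(x, n_fft=256, hop=256)
z = peaks / peaks.max()                             # assumed normalization to [0, 1]
pulses = step_activation(z, v_threshold=0.5)        # square pulses over the tone
```

Since `v_threshold` is compared elementwise, a time-varying threshold array of the same length as `z` can be passed in place of the constant.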

From (1)–(3), (6) and (7), we can obtain the following:

If $\tilde{y}_{\mathrm{step}}(k) > 0$ THEN

z_{\mathrm{in}}(k) = \tilde{y}_{\mathrm{step}}(k)    (30)

\tilde{V}_{\min j}(k) = \min(z_{\mathrm{in}\,j}(k), w_{\mathrm{in}\,j}(k))    (31)

\tilde{V}_{\max}(k) = \mathrm{MAX}_{j=1}^{N} \tilde{V}_{\min j}(k), \quad N = 1    (32)

\tilde{V}_{\mathrm{out}}(k) = \max(\tilde{V}_{\max}(k), V_{\mathrm{threshold}}(k)), \quad V_{\mathrm{threshold}}(k) = 0    (33)

\tilde{V}_{\mathrm{out}\,\gamma}(k) = \min(\gamma, \tilde{V}_{\mathrm{out}}(k)), \quad \gamma = 1    (34)

\tilde{y}_{\mathrm{SPKAF}}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{\mathrm{out}\,\gamma}(k)^{3}}    (35)

ELSE

\tilde{y}_{\mathrm{SPKAF}}(k) = 0

Note: all spikes have the same shape and period.

If $\tilde{y}_{\mathrm{SPKAF}}(k) = \tilde{y}_{\mathrm{SPKAF\,SI}}(k)$ THEN

\tilde{y}_{\mathrm{SPKAF\,IDENTIFIED}}(k) = \tilde{y}_{\mathrm{SPKAF}}(k)    (36)

ELSE

\tilde{y}_{\mathrm{SPKAF\,IDENTIFIED}}(k) = 0

where $\tilde{y}_{\mathrm{SPKAF\,SI}}(k)$ represents the spikes in the voice pattern of the syllable or musical note SI of solfeggio in Spanish.

To compare the FAN-STEPAF-SPKAF model, the methods of Section 3.4 and Section 3.5 are presented.

3.4. Augmented Spiking Neuron Model for Syllable-Based Speech Recognition Application

Syllable-based pattern recognition information is obtained based on the maximum of the STFT magnitudes and the Augmented Spiking Neuron Model method [25].

The reference spike pattern of the SI syllable, $\tilde{y}_{\mathrm{ASNM\,SI}}$, is compared with the first spikes of the melody, $\tilde{y}_{\mathrm{ASNM}}$. The input samples of the same interval of the SI syllable to be identified are stored in memory, and these are compared with the reference pattern. The mean squared error (MSE) is determined, and if the MSE is less than a user-defined MSE threshold $r$, then the syllable was identified and it continues with the next sample; if not, then the pattern recognition is null, and the same sequence is carried out, advancing one more sample until the pattern recognition of the whole melody, $\tilde{y}_{\mathrm{ASNM\,IDENTIFIED}}$, is concluded, similar to Figure 4.

From (26) and (29) and [25], we can determine the following:

\tilde{y}_{\mathrm{in}}(k) = \tilde{y}_{\mathrm{step}}(k) \cdot X_{\mathrm{MAX\,mag}}[k]    (37)

V_{\mathrm{Kernel}}(k - k_{i}^{j}) = c_{0} \cdot \left[\exp\left(-\frac{k - k_{i}^{j}}{\kappa_{m}}\right) - \exp\left(-\frac{k - k_{i}^{j}}{\kappa_{s}}\right)\right]    (38)

V_{\mathrm{ASNM}}(k) = \sum_{i=1}^{N} w_{i} \cdot \sum_{k_{i}^{j} < k} c_{i}^{j} \cdot V_{\mathrm{Kernel}}(k - k_{i}^{j}) - \vartheta \cdot \sum_{k_{s}^{j} < k} \exp\left(-\frac{k - k_{s}^{j}}{\kappa_{m}}\right)    (39)

\Delta w_{i} = \begin{cases} \eta \cdot \sum_{k_{i}^{j} < k_{\max}} c_{i}^{j} \cdot V_{\mathrm{Kernel}}(k - k_{i}^{j}), & \text{if } P^{+}\ \text{error} > 0 \\ -\eta \cdot \sum_{k_{i}^{j} < k_{\max}} c_{i}^{j} \cdot V_{\mathrm{Kernel}}(k - k_{i}^{j}), & \text{if } P^{-}\ \text{error} < 0 \\ 0, & \text{otherwise} \end{cases}    (40)

Substituting (37) in (38) and (39), and considering $w_{i} = 1$ and $\vartheta = 0$, gives (41) and (42):

V_{\mathrm{Kernel}}(\tilde{y}_{\mathrm{in}}(k)) = q_{0} \cdot \left[\exp\left(-\frac{\tilde{y}_{\mathrm{in}}(k)}{\kappa_{m}}\right) - \exp\left(-\frac{\tilde{y}_{\mathrm{in}}(k)}{\kappa_{s}}\right)\right]    (41)

\tilde{y}_{\mathrm{ASNM}}(k) = c_{i} \cdot V_{\mathrm{Kernel}}(\tilde{y}_{\mathrm{in}}(k))    (42)

i = 1    (43)

while $i \leq Q$

    \mathrm{MSE} = \frac{1}{M} \cdot \sum_{i=1}^{M} \left(\{\tilde{y}_{\mathrm{ASNM\,SI}}(k_{1}) \dots \tilde{y}_{\mathrm{ASNM\,SI}}(k_{M})\} - \{\tilde{y}_{\mathrm{ASNM}}(k_{i}) \dots \tilde{y}_{\mathrm{ASNM}}(k_{i+M-1})\}\right)^{2}    (44)

    IF $\mathrm{MSE} < r$ THEN

        \{\tilde{y}_{\mathrm{ASNM\,IDENTIFIED}}(k_{i}) \dots \tilde{y}_{\mathrm{ASNM\,IDENTIFIED}}(k_{i+M-1})\} = \{\tilde{y}_{\mathrm{ASNM}}(k_{i}) \dots \tilde{y}_{\mathrm{ASNM}}(k_{i+M-1})\}    (45)

        i = i + M - 1    (46)

    ELSE

        \tilde{y}_{\mathrm{ASNM\,IDENTIFIED}}(k_{i}) = 0    (47)

        i = i + 1    (48)

End
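The sliding-window comparison in (43)–(48) can be sketched in Python as follows. The sketch rests on two assumptions that should be checked against the intended procedure: a window counts as identified when its MSE falls below the tolerance $r$, and after a match the index advances by the full window length $M$ rather than by $M - 1$ as in (46), so that matched windows do not overlap. The function name `match_pattern` and the sample data are hypothetical.

```python
def match_pattern(signal, reference, r):
    """Copy every window of `signal` that matches `reference` (MSE < r).

    Samples outside matched windows are set to zero, as in Eq. (47).
    """
    m = len(reference)
    identified = [0.0] * len(signal)
    i = 0
    while i + m <= len(signal):
        window = signal[i:i + m]
        mse = sum((ref - s) ** 2 for ref, s in zip(reference, window)) / m  # (44)
        if mse < r:
            identified[i:i + m] = window    # (45) keep the matched samples
            i += m                          # assumption: skip the whole window
        else:
            i += 1                          # (48) slide by one sample
    return identified

# Hypothetical spike amplitudes: one true occurrence and one distractor.
reference = [1.0, 2.0, 1.0]
signal = [0.0, 3.0, 1.0, 2.0, 1.0, 0.0, 2.0, 2.0, 2.0, 0.0]
result = match_pattern(signal, reference, r=0.01)
```

Only the window starting at the third sample reproduces the reference, so it is the only part of the signal that survives in `result`.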

3.5. Augmented FAN-STEPAF-SPKAF Model for Syllable-Based Speech Recognition Application

Spike patterns recognition of the SI syllable in a melody is performed by obtaining the maximum of the STFT magnitudes and the Augmented FAN-STEPAF-SPKAF method [29,30,31,32,33,34,35].

The reference spike pattern

\tilde{y}_{SPKAFSI}

of the SI syllable is compared with the first spikes of the melody

\tilde{y}_{SPKAF}

. The input samples covering the same interval as the SI syllable to be identified are stored in memory and compared with the reference pattern. If the MSE is less than a user-defined value, the syllable is identified and the procedure continues with the next sample; otherwise, the pattern recognition output is null. The same sequence is then repeated, advancing one sample at a time, until the pattern recognition of the whole melody

\tilde{y}_{ASPKAFIDENTIFIED}

is complete, similar to Figure 4.

From (26) and (29), we can determine the following:

\tilde{y}_{sawtooth}(k) = \tilde{y}_{step}(k) \cdot X_{MAXmag}[k] \cdot y_{triangular}(k)

(49)

z_{in}(k) = \tilde{y}_{sawtooth}(k)

(50)

\tilde{V}_{min\,j}(k) = \min(z_{in\,j}(k), w_{in\,j}(k))

(51)

\tilde{V}_{max}(k) = \operatorname{MAX}_{j=1}^{N} \tilde{V}_{min\,j}(k), \quad N = 1

(52)

\tilde{V}_{out}(k) = \max(\tilde{V}_{max}(k), V_{threshold}(k)), \quad V_{threshold}(k) = 0

(53)

\tilde{V}_{out\,γ}(k) = \min(γ, \tilde{V}_{out}(k)), \quad γ = 1

(54)

\tilde{y}_{ASPKAF}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{out\,γ}(k)^{3}} + c_{3}

(55)

i = 1

(56)

while

i \leq Q

MSE = \frac{1}{M} \cdot \sum_{i=1}^{M} \left(\{\tilde{y}_{ASPKAFSI}(k_{1}) \dots \tilde{y}_{ASPKAFSI}(k_{M})\} - \{\tilde{y}_{ASPKAF}(k_{i}) \dots \tilde{y}_{ASPKAF}(k_{i+M-1})\}\right)^{2}

(57)

IF

MSE > r \text{ THEN}

\{\tilde{y}_{ASPKAFIDENTIFIED}(k_{i}) \dots \tilde{y}_{ASPKAFIDENTIFIED}(k_{i+M-1})\} = \{\tilde{y}_{ASPKAF}(k_{i}) \dots \tilde{y}_{ASPKAF}(k_{i+M-1})\}

(58)

i = i + M - 1

(59)

ELSE

\tilde{y}_{ASPKAFIDENTIFIED}(k_{i}) = 0

(60)

i = i + 1

(61)

End

4. Model Analysis

4.1. Nonlinear Systems Fuzzy Modeling

According to the study presented in [31] for the stability analysis of adaptive neurons, it is considered that a nonlinear system in discrete time can be represented as

y (k) = Φ [u (k - 1), u (k - 2), \dots, u (k - n)]

(62)

where

Φ (\cdot)

is a nonlinear function representing the system dynamics,

u (k)

and

y (k)

are the input and output of the system, and

n

is the number of delayed inputs considered. In multivariable NARMA (nonlinear autoregressive-moving average) form,

Y (k) = Φ [H (k)]

(63)

where

H(k) = [U(k - d_{t}), \dots]^{T}

U(k) = [u(k)]^{T}

(64)

Y (k) = {[y (k)]}^{T}

(65)

where

d_{t}

is the time delay. To analyze the system models presented in Section 3 and avoid problems such as slow convergence and difficulty in designing parameters, we consider, from (7), (64), and Figure 1,

u(k) = \tilde{y}_{SPKAF}(k)

(66)

x (k) = u (k)

(67)

The subsections below present the analysis of fuzzy systems from Section 3.1.1, Section 3.2.1, and Section 3.3.

4.1.1. Analysis of ANFIS Model

If we normalize the input

U (k), U (k - d_{t})

and reference output

y_{spikeref}(k)

of the unknown system (3.1.1) into [0, 1], then,

R^{i}: \text{IF } h_{1} < 0 \text{ or } h_{2} < 0 \text{ or } \dots h_{n} < 0

then

\hat{h}_{1} = \max(h_{1}, r_{e}), \hat{h}_{2} = \max(h_{2}, r_{e}), \dots, \hat{h}_{n} = \max(h_{n}, r_{e})

, where

r_{e} \in R

.

We use the somatic Gupta-type aggregation (MAX operator) and the following fuzzy IF–THEN rules for the unknown system (3.1.1.1):

Rule^{i}: \text{If } x_{j}(k - Δk_{jl\,axon\,delay}) \text{ is } A_{m} \text{ and } x_{j+1}(k - Δk_{j+1\,l\,axon\,delay}) \text{ is } A_{m+1} \dots \text{ and } x_{j+n}(k - Δk_{j+n\,l\,axon\,delay}) \text{ is } A_{m+n} \text{ then } y_{i}(k) = \operatorname{MAX}(p_{i} \cdot x_{j}(k - Δk_{jl\,axon\,delay}), q_{i} \cdot x_{j+1}(k - Δk_{j+1\,l\,axon\,delay}), \dots, r_{i} \cdot x_{j+n}(k - Δk_{j+n\,l\,axon\,delay}), s_{i})

(68)

μ_{A_{m}}(x_{j}) = μ_{SAF\,A_{m}}(x_{j})

(69)

O_{j}^{1} = μ_{A_{m}} (x_{j})

(70)

O_{j}^{2} = ω_{j} = μ_{A_{m}} (x_{j}) \cdot μ_{A_{m + 1}} (x_{j}) \cdot \dots \cdot μ_{A_{m + n}} (x_{j})

(71)

where

j = 1, 2, \dots, 5; m = 1, 2, \dots, 9; n = 1, 2, \dots, N

.

O_{j}^{3} = {\bar{ω}}_{j} = \frac{ω_{j}}{ω_{1} + ω_{2} + \dots + ω_{5}}

(72)

O_{j}^{4} = \bar{ω}_{j} \cdot y_{j}(k) = \bar{ω}_{j} \cdot \operatorname{MAX}(p_{j} \cdot x_{j}, q_{j} \cdot x_{j}, r_{j} \cdot x_{j}, s_{j})

(73)

Because the fuzzy system is unipolar, with values

\in

[0,1], the following synaptic, aggregation, and nonlinear operations are also somatic operations:

\tilde{V}_{min\,i}(k) = \min(x_{j}(k - Δk_{jl\,axon\,delay}), w_{j}(k))

(74)

\tilde{V}_{max\,j}(k) = \operatorname{MAX}_{i=1}^{N} \tilde{V}_{min\,i}(k)

(75)

\tilde{V}_{out\,j}(k) = \max(\tilde{V}_{max\,j}(k), V_{threshold\,j}(k))

(76)

\tilde{y}_{SAF\,j}(k) = \frac{1}{1 + e^{(-\min(γ, \tilde{V}_{out\,j}(k)) \cdot a + b)}}

(77)

O_{spike}^{5} = \text{overall output} = \sum \bar{ω}_{j} \cdot \tilde{y}_{j}(k) = \sum \bar{ω}_{j} \cdot \operatorname{MAX}(p_{j} \cdot x_{j}, q_{j} \cdot x_{j}, r_{j} \cdot x_{j}, s_{j}) = \frac{\sum ω_{j} \cdot \tilde{y}_{j}(k)}{\sum ω_{j}}

(78)

\tilde{y}_{spikeANFIS}(k) = O_{spike}^{5}

(79)

\tilde{y}_{spikeANFIS}(k) = O_{spike}^{5} \cdot g_{1}

(80)

e_{spikeANFIS}(k) = y_{spikeANFISref}(k) - \tilde{y}_{spikeANFIS}(k)

(81)

where

j = 1, 2, \dots, 5

.
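The layer operations (70)–(73), (78), and (80) amount to a product of membership degrees, a normalization, a MAX-type consequent, and a weighted sum. The sketch below illustrates that flow under stated assumptions: the function name and argument layout are hypothetical, the membership degrees are passed in already evaluated, and a zero output is returned when no rule fires.

```python
from math import prod

def anfis_forward(memberships, x, params, gain=1.0):
    """Forward pass of the MAX-consequent ANFIS described above.

    memberships[j] -- membership degrees entering rule j (layer 1, (70))
    x[j]           -- input value used by rule j
    params[j]      -- adaptive parameters (p_j, q_j, r_j, s_j)
    gain           -- output gain g_1 used in (80)
    """
    omega = [prod(mu) for mu in memberships]          # firing strengths, (71)
    total = sum(omega)
    if total == 0:
        return 0.0                                    # assumption: no rule fires
    omega_bar = [w / total for w in omega]            # normalization, (72)
    y = [max(p * xj, q * xj, r * xj, s)               # MAX consequent, (73)
         for xj, (p, q, r, s) in zip(x, params)]
    return gain * sum(wb * yj for wb, yj in zip(omega_bar, y))  # (78), (80)
```

Passing the membership degrees as plain numbers keeps the sketch independent of the particular membership function; in the model above they would come from the sigmoid activation function in (77).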

Based on the above operations, the unknown nonlinear system can be expressed by

\tilde{Y}_{SAF1\,j}(k) = Φ[\hat{H}(k), W_{1}(k)]

(82)

\tilde{Y}_{SAF2\,j}(k) = Φ[\hat{H}(k), W_{2}(k)]

(83)

\hat{Y}_{ANFIS}(k) = Φ[\hat{H}(k), W_{1}(k)] \cdot \hat{H}(k) \cdot W_{2}(k)

(84)

where

W_{1}(k) = [1, \dots, 1]

are the proposed fixed weights,

W_{2}(k) = [p_{j}, q_{j}, r_{j}, s_{j}]

are the weights to be obtained, and

Φ (\cdot)

is the nonlinear function corresponding to the membership functions of the fuzzy system.

4.1.2. Analysis of FAN-SPKAF Model

For the nonlinear system of Section 3.2.1, with input

U (k), U (k - d_{t})

, and reference output or pattern

y_{spikeref}(k)

,

z(k) = \operatorname{MIN}_{i=1}^{N} x_{i}(k - Δk_{i\,axon\,delay})

(85)

where

i = 1, 2, \dots, 5

.

If z(k) > 0 then

\tilde{V}_{min}(k) = \min(z(k), w(k))

(86)

\tilde{V}_{max}(k) = \operatorname{MAX}_{j=1}^{N} \tilde{V}_{min\,j}(k)

(87)

\tilde{V}_{out}(k) = \max(\tilde{V}_{max}(k), V_{threshold}(k))

(88)

\tilde{V}_{out\,γ}(k) = \min(γ, \tilde{V}_{out}(k))

(89)

\tilde{y}_{spikeFAN-SPKAFIDENTIFIED}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{out\,γ}(k)^{3}}

(90)

Adjusting the amplitude with a gain,

\tilde{y}_{spikeFAN-SPKAFIDENTIFIED}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{out\,γ}(k)^{3}} \cdot g_{2}

(91)

else \tilde{y}_{spikeFAN-SPKAFIDENTIFIED}(k) = 0

e_{spikeFAN-SPKAFIDENTIFIED}(k) = y_{spikeFAN-SPKAFref}(k) - \tilde{y}_{spikeFAN-SPKAFIDENTIFIED}(k)

(92)
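The chain (85)–(91) can be illustrated with a short Python sketch. It is a minimal reading of the equations, not the authors' code: the function names are hypothetical, a single dendrite (N = 1) is assumed in (87), and the default parameter values are the ones listed for the FAN-SPKAF method in Section 5.1.

```python
import math

def spkaf(v, a=2.0, b=2.0, c1=0.09, c2=0.5):
    """Spike activation function (90) evaluated at v = V_out_gamma(k)."""
    eb = math.exp(b)
    den = 1.0 + eb - a * eb * v + a**2 * eb * v**2 - a**3 * eb * c2 * v**3
    return c1 / den  # no guard against a vanishing denominator

def fan_spkaf(x_delayed, w=1.0, v_threshold=0.0, gamma=1.0, g2=0.4):
    """FAN-SPKAF response to one sample of the delayed input channels."""
    z = min(x_delayed)               # MIN aggregation over channels, (85)
    if z <= 0:
        return 0.0                   # else branch: no spike
    v_min = min(z, w)                # synaptic operation, (86)
    v_max = v_min                    # (87) with a single dendrite (assumption)
    v_out = max(v_max, v_threshold)  # threshold operation, (88)
    v_gamma = min(gamma, v_out)      # saturation at gamma, (89)
    return g2 * spkaf(v_gamma)       # spike with gain g_2, (90)-(91)
```

With these parameter values the denominator of (90) changes sign between 0.8 and 0.82, so arguments in that range produce very large outputs; the sketch makes no attempt to handle that case.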

Remark 1.

FAN-SPKAF uses the same learning algorithm for unipolar fuzzy systems,

\in

[0,1], developed for FAN-SAF in [16,31,33]. Because the spikes are bipolar, to normalize them to the unipolar interval, a max operator and a zero reference or a comparator with square output pulses

\in

[0,1] could be used. The spike period and the absolute refractory period [3] are considered constant.

Based on the above operations, the nonlinear system can be expressed by

\hat{Y}_{FAN-SPKAF}(k) = [Φ[\hat{H}(k), W(k)], 0]

(93)

where

W(k) = [1, \dots, 1]

are the proposed fixed weights and

Φ (\cdot)

is the nonlinear function corresponding to the membership functions of the fuzzy system.

4.1.3. Analysis of FAN-STEPAF-SPKAF Model

From Section 3.3, consider the nonlinear system with input

\tilde{y}_{SPKAF}(k) = [U(k), U(k - d_{t})]

and reference output or pattern

\tilde{y}_{SPKAFSI}(k) = y_{spikeref}(k)

, where

\tilde{y}_{SPKAFSI}(k)

is the spike voice pattern of the syllable or musical note SI of solfeggio in Spanish.

From a speech sample

x (k)

, the short-time Fourier transform (STFT) is obtained with magnitude and phase:

\mathrm{STFT}\{x(k)\} = X[n, k] = \sum_{m=k}^{k+N-1} x[m] \cdot w[m - k] \cdot e^{-i \cdot 2 \cdot π \cdot n \cdot (m - k)/N}

(94)

X_{mag}[n, k] = |X[n, k]|

(95)

X_{MAXmag}[k] = \operatorname{MAX}(X_{mag}[n, k])

(96)

z(k) = X_{MAXmag}[k]

(97)

\tilde{V}_{max}(k) = \operatorname{MAX}_{j=1}^{N} \min(z_{j}(k), w_{j}(k))

(98)

\tilde{y}_{step}(k) = \begin{cases} 1, & \text{if } \tilde{V}_{max}(k) \geq V_{threshold}(k) \\ 0, & \text{if } \tilde{V}_{max}(k) < V_{threshold}(k) \end{cases}

(99)

where

N = 1, \quad w(k) = 1, \quad V_{threshold}(k)

is a triangular-shaped threshold signal.
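Equations (94)–(99) can be followed step by step in plain Python. The sketch below is illustrative only: the function names are hypothetical, a rectangular window is assumed when none is supplied, the DFT is evaluated directly rather than with an FFT, and the threshold is passed in as a number per sample (generating the triangular threshold signal is left out).

```python
import cmath
import math

def stft_max_magnitude(x, n_fft, window=None):
    """Maximum STFT magnitude per frame, following (94)-(96).

    Frame k covers samples x[k] ... x[k + n_fft - 1].
    """
    if window is None:
        window = [1.0] * n_fft  # assumption: rectangular window
    result = []
    for k in range(len(x) - n_fft + 1):
        frame = [x[k + m] * window[m] for m in range(n_fft)]
        magnitudes = [
            abs(sum(frame[m] * cmath.exp(-2j * math.pi * n * m / n_fft)
                    for m in range(n_fft)))
            for n in range(n_fft)
        ]
        result.append(max(magnitudes))
    return result

def step_activation(v_max, v_threshold):
    """Step activation (99): 1 when the input reaches the threshold, else 0."""
    return 1 if v_max >= v_threshold else 0
```

The direct DFT costs on the order of `n_fft` squared operations per frame, which is acceptable for a small illustration; a practical implementation would normally rely on an FFT routine.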

IF

\tilde{y}_{step}(k) > 0 \text{ THEN}

z_{in}(k) = \tilde{y}_{step}(k)

(100)

\tilde{V}_{min\,j}(k) = \min(z_{in\,j}(k), w_{in\,j}(k))

(101)

\tilde{V}_{max}(k) = \operatorname{MAX}_{j=1}^{N} \tilde{V}_{min\,j}(k), \quad N = 1

(102)

\tilde{V}_{out}(k) = \max(\tilde{V}_{max}(k), V_{threshold}(k)), \quad V_{threshold}(k) = 0

(103)

\tilde{V}_{out\,γ}(k) = \min(γ, \tilde{V}_{out}(k)), \quad γ = 1

(104)

\tilde{y}_{SPKAF}(k) = \frac{c_{1}}{1 + e^{b} - a \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k) + a^{2} \cdot e^{b} \cdot \tilde{V}_{out\,γ}(k)^{2} - a^{3} \cdot e^{b} \cdot c_{2} \cdot \tilde{V}_{out\,γ}(k)^{3}}

(105)

ELSE

\tilde{y}_{SPKAF}(k) = 0

Note: all spikes have the same shape and period.

IF

\tilde{y}_{SPKAF}(k) = \tilde{y}_{SPKAFSI}(k) \text{ THEN}

\tilde{y}_{SPKAFIDENTIFIED}(k) = \tilde{y}_{SPKAFSI}(k)

(106)

ELSE

\tilde{y}_{SPKAFIDENTIFIED}(k) = 0

e_{SPKAFIDENTIFIED}(k) = \tilde{y}_{SPKAFSI}(k) - \tilde{y}_{SPKAFIDENTIFIED}(k)

(107)

Remark 2.

FAN-SPKAF uses the same learning algorithm for unipolar fuzzy systems,

\in

[0,1], developed for FAN-SAF in [16,31,33]. Because the spikes are bipolar, to normalize them to the unipolar interval, a max operator and a zero reference or a comparator with square output pulses

\in

[0, 1] could be used. The spike period and the absolute refractory period [3] are considered constant.

Therefore, the nonlinear system can be expressed by

\hat{Y}_{step}(k) = Φ[\hat{X}(k), W_{1}(k)]

(108)

\hat{H}(k) = Φ[\hat{Y}_{step}(k), W_{2}(k)]

(109)

\hat{Y}_{SPKAF}(k) = [\hat{H}(k), 0]

(110)

where

W_{1}(k) = W_{2}(k) = [1, \dots, 1]

are the proposed fixed weights and

Φ (\cdot)

is the nonlinear function corresponding to the membership functions of the fuzzy system.

4.2. Fuzzy System Training

Model (84) is based on input–output data of the system to be identified, and (83) applies the learning algorithm of the FAN method. The main idea of fuzzy neural modeling is to find the values of

W(k)

, such that the output

\hat{Y} (k)

of the proposed model (84) can follow the

Y (k)

output of the nonlinear plant. The identification error is defined as follows:

e (k) = Y (k) - \hat{Y} (k)

(111)

FANs with spike, sigmoid, and step-type activation functions share the same learning method, developed in [16,31,33] and presented below for unipolar fuzzy systems:

W (k + 1) = W (k) + Γ (k) \cdot E (k) \cdot H (k)

(112)

Γ (k + 1) = Γ (k) + Γ (k) \cdot E (k) \cdot H (k)

(113)

0 < Γ (k) \leq 1, 0 \leq W (k) \leq 1

(114)

w(k + 1) = w(k) + γ(k) \cdot e(k) \cdot h_{in}(k)

(115)

γ(k + 1) = γ(k) + γ(k) \cdot e(k) \cdot h_{in}(k)

(116)
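One step of the scalar update (115)–(116) can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the bounds in (114) are enforced by clipping, and `gamma_floor` is a hypothetical small positive value that keeps the learning factor strictly above zero.

```python
def clip(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))

def fan_update(w, gamma, e, h_in, gamma_floor=1e-3):
    """One learning step for a scalar weight and its learning factor.

    Both quantities receive the same increment gamma * e * h_in,
    as in (115) and (116), and are then kept inside the bounds of (114).
    """
    delta = gamma * e * h_in
    w_next = clip(w + delta, 0.0, 1.0)                   # 0 <= w <= 1
    gamma_next = clip(gamma + delta, gamma_floor, 1.0)   # 0 < gamma <= 1
    return w_next, gamma_next
```

For instance, `fan_update(0.5, 1.0, 0.2, 0.5)` moves the weight from 0.5 to about 0.6 while the learning factor stays saturated at 1.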

4.3. Stability Analysis

The stability of the models proposed in Section 4.1 depends on the convergence of the learning algorithm of the FAN; therefore, the stability analysis is performed using (83) of the model (84), and it proceeds similarly to the analysis presented in [31]:

Plant : y = Φ [H (k), W^{*}]

(117)

Model : \hat{y} = Φ [H (k), W (k)]

(118)

Error : y - \hat{y} = Φ [H (k) \cdot (W^{*} - W (k))]

(119)

Y (k) = Φ [H (k), W^{*}] + Φ [H_{d} (k), W_{d}^{*}] + d (t)

(120)

\hat{Y} (k) = Φ [H (k), W (k)]

(121)

e (k) = Φ [H (k), W^{*}] + Φ [H_{d} (k), W_{d}^{*}] - Φ [H (k), W (k)]

(122)

e (k) = Φ [H (k), \tilde{W} (k)] + Φ [H_{d} (k), W_{d}^{*}]

(123)

e (k) = Φ [H (k), \tilde{W} (k)] + μ_{d} (k)

(124)

where

\tilde{W} (k) = W^{*} - W (k)

, and

μ_{d} (k) = Φ [H_{d} (k), W_{d}^{*}]

.

We assume plants (83) and (84) are bounded-input–bounded-output (BIBO)-stable, i.e.,

y (k)

and

h (k)

in (83) are bounded. The membership function

Φ (\cdot)

is bounded. The following theorem provides the stability analysis for nonlinear system modeling with the fuzzy system.

Theorem 1.

If the unknown nonlinear system (62) is modeled by the fuzzy system (84) and the membership functions are updated by (112)–(116), then the modeling error

e (k)

is bounded. The normalized identification error,

E_{N} (k) = \frac{W_{N} (k + 1) - W_{N} (k)}{Γ_{N} (k) \cdot H (k)}

(125)

satisfies the following average performance:

\limsup_{T \to \infty} \frac{1}{T} \sum_{k=1}^{T} ‖E_{N}(k)‖^{2} \leq \max_{k} [‖Φ[H_{d}(k), W_{d}^{*}]‖^{2}]

\limsup_{T \to \infty} \frac{1}{T} \sum_{k=1}^{T} ‖E_{N}(k)‖^{2} \leq \bar{μ}_{d}

(126)

Proof:

For unipolar systems with values in [0,1], the conditions for

W (k + 1)

and

Γ (k + 1)

are

IF W(k + 1) > 1 THEN W_{N}(k + 1) = 1.

IF W(k + 1) < 0 THEN W_{N}(k + 1) = 0.

ELSE W_{N}(k + 1) = W_{N}(k) + ∆W_{N}(k).

IF Γ(k + 1) > 1 THEN Γ_{N}(k + 1) = 1.

IF Γ(k + 1) < 0 THEN Γ_{N}(k + 1) = n.

ELSE Γ_{N}(k + 1) = Γ_{N}(k) + ∆Γ_{N}(k).

where

0 < n \leq 1

.

Therefore,

Γ_{N} (k + 1) = Γ_{N} (k) + Γ_{N} (k) \cdot E (k) \cdot H (k)

(127)

W_{N} (k + 1) = W_{N} (k) + Γ_{N} (k) \cdot E (k) \cdot H (k)

(128)

We select a positive definite scalar

L_{k}

, defined as

L_{k} = {‖\tilde{W} (k)‖}^{2}

(129)

where

‖\cdot‖

denotes the Euclidean norm.

By the updating law (126), we have

\tilde{W} (k + 1) = \tilde{W} (k) + Γ (k) \cdot E (k) \cdot {H (k)}^{T}

(130)

Using the inequalities,

‖q + r‖ \leq ‖q‖ + ‖r‖, ‖q \cdot r‖ = ‖q‖ \cdot ‖r‖

For any

q

and

r

, by using (130) and

0 < Γ_{N} (k) \leq Γ (k) \leq 1

,

\begin{aligned}
∆L_{k} &= L_{k+1} - L_{k} \\
&= ‖\tilde{W}(k) + Γ(k) \cdot E(k) \cdot H(k)^{T}‖^{2} - ‖\tilde{W}(k)‖^{2} \\
&= 2 ‖Γ(k) \cdot E(k) \cdot H(k)^{T} \cdot \tilde{W}(k)‖ + ‖Γ(k) \cdot E(k) \cdot H(k)^{T}‖^{2} \\
&= ‖Γ(k)‖^{2} \cdot ‖E(k)‖^{2} \cdot ‖H(k)^{T}‖^{2} + 2 ‖Γ(k) \cdot E(k) \cdot H(k)^{T} \cdot Φ^{-1}[(E(k) - Φ[H_{d}(k), W_{d}^{*}]), H(k)^{T}]‖ \\
&= ‖Γ(k)‖^{2} \cdot ‖E(k)‖^{2} \cdot ‖H(k)^{T}‖^{2} + 2 ‖Γ(k)‖ \cdot ‖E(k)‖ \cdot ‖H(k)^{T}‖ \cdot ‖Φ^{-1}[(E(k) - Φ[H_{d}(k), W_{d}^{*}]), H(k)^{T}]‖ \\
∆L_{k} &\leq ζ(k) \cdot ‖E(k)‖^{2} + δ(k) \cdot ‖E(k)‖ \cdot ‖Φ^{-1}[(E(k) - Φ[H_{d}(k), W_{d}^{*}]), H(k)^{T}]‖ \\
∆L_{k} &\leq ζ(k) \cdot ‖E(k)‖^{2} + δ(k) \cdot ‖E(k)‖ \cdot ‖Φ^{-1}[(E(k) - μ_{d}), H(k)^{T}]‖
\end{aligned}

(131)

where

ζ (k)

and

δ (k)

are defined as,

ζ(k) = ‖Γ(k)‖^{2} \cdot ‖H(k)^{T}‖^{2}

δ(k) = 2 ‖Γ(k)‖ \cdot ‖H(k)^{T}‖

Because

n \min ({\tilde{w}}_{i}^{2}) \leq L_{k} \leq n \max ({\tilde{w}}_{i}^{2})

where

n \min ({\tilde{w}}_{i}^{2})

and

n \max ({\tilde{w}}_{i}^{2})

are

\mathcal{K}_{\infty}\text{-functions}

;

ζ(k) \cdot ‖E(k)‖^{2}

is a

\mathcal{K}_{\infty}\text{-function}

; and

δ(k) \cdot ‖E(k)‖ \cdot ‖Φ^{-1}[(E(k) - μ_{d}), H(k)^{T}]‖

is a

\mathcal{K}\text{-function}

. Thus,

L_{k}

admits an ISS-Lyapunov function [31], and the dynamic of the identification error is input-to-state stable.

From (120) and (129) we know

L_{k}

is the function of

E (k)

and

Φ [H_{d} (k), W_{d}^{*}]

. The INPUT and the STATE correspond to both terms of (131). However, usually,

Φ [H_{d} (k), W_{d}^{*}] ≪ Φ [H (k) {, W}^{*}]

.

Because the INPUT is bounded and the dynamics are ISS, the STATE

E (k)

is bounded.

Applying the bounded conditions for

W_{N} (k + 1)

and

Γ_{N} (k + 1)

, summing Equation (131) from 1 up to T, and using

0 < L_{T}

and the fact that

L_{1}

is a constant,

ζ_{N}(k) \cdot \left(\sum_{k=1}^{T} ‖E_{N}(k)‖^{2}\right) + δ_{N}(k) \cdot \left(\sum_{k=1}^{T} ‖E_{N}(k)‖\right) \cdot ‖Φ^{-1}[(E(k) - \bar{μ}_{d}), H(k)^{T}]‖ \leq L_{T} - L_{1}

ζ_{N}(k) = ‖Γ_{N}(k)‖^{2} \cdot ‖H(k)^{T}‖^{2}

δ_{N}(k) = 2 ‖Γ_{N}(k)‖ \cdot ‖H(k)^{T}‖

ζ_{N}(k) \cdot \left(\sum_{k=1}^{T} ‖E_{N}(k)‖^{2}\right) \leq L_{T} - L_{1} - δ_{N}(k) \cdot \left(\sum_{k=1}^{T} ‖E_{N}(k)‖\right) \cdot ‖Φ^{-1}[(E(k) - \bar{μ}_{d}), H(k)^{T}]‖

(126) is established. □

Remark 3.

It is difficult to obtain high modeling accuracy with classical fuzzy neural networks because their hyperparameters are hard to choose. However, our fuzzy system with adaptive neurons has fewer hyperparameters to be chosen, and we prove that the modeling error converges to the zone

{\bar{μ}}_{d}

.

Remark 4.

If the fuzzy system (2) could match the nonlinear plant (1) exactly (

μ_{d} (k) = 0)

, i.e., we could find the best membership function

Φ_{H}

and

W^{*}

such that the nonlinear system could be written as

Y (k) = Φ_{H} [H (k), W^{*}]

, the same learning algorithm makes the identified error

‖E (k)‖

asymptotically stable:

\lim_{k \to \infty} ‖E(k)‖ = 0

(132)

Remark 5.

The normalized learning rates in (120) and (124) are time-varying to ensure the stability of the identification error. Because the initial condition does not need any previous information, the time-varying learning rates are usually robust.

5. Results of the Simulation of the Fuzzy Systems

5.1. Recognition of Spatiotemporal Spike Patterns

For each reference spike input pattern or sequence that could be presented to the system, the axonal delays of each spike and channel are identified. Subsequently, pattern recognition of the reference spike sequences is performed with the ANFIS and FAN-SPKAF methods for different inputs to the system consisting of noisy spike sequences. The reference spike pattern sequences, with five input channels, are shown in Figure 5.

The initial conditions for ANFIS method are as follows:

○: Weights $\in$ [0,1],

p_{iANFIS}(k) = 0, q_{iANFIS}(k) = 0, r_{iANFIS}(k) = 0, s_{iANFIS}(k) = 0.

○: Learning factor, fixed value, $γ_{FAN}(k) = 1$.

○: $a = 11, b = 6, g_{1} = 9.$

The initial conditions for FAN-SPKAF method are as follows:

○: Weights $\in$ [0,1],

w_{in\,FAN-SPKAF}(k) = 1

○: Learning factor, fixed value, $γ_{FAN}(k) = 1$.

○: Threshold, fixed value, $V_{threshold\,FAN}(k) = 0$.

○: $a = 2, b = 2, c_{1} = 0.09, c_{2} = 0.5, g_{2} = 0.4.$

The output of the MIN operator, together with the spike response of the ANFIS and FAN-SPKAF methods, is shown in Figure 6.

Figure 7a–e describes the five noisy input spike sequences, with target spike patterns indicated by dotted lines. Figure 7f presents the result of the MIN operator, the target pattern of the fuzzy system, and the output of the FAN-SPKAF. The error is practically null.

Compared with the spike response of ANFIS and other methods [15], the results obtained in Figure 7 with the FAN-SPKAF method are quite precise: the ANFIS response presents additional spikes, whereas with the FAN-SPKAF method the pattern recognition is successful and the spikes coincide with the reference sequences for the recognition of spatiotemporal spike patterns, thus reducing the error.

5.2. Syllable-Based Spike Pattern Speech Recognition

Using one speech sample of each syllable corresponding to the musical notes of solfeggio in Spanish, the first notes of “Ode to Happiness” are composed. Subsequently, the musical notes SI are identified. Environmental noise is assumed to be null.

5.2.1. Results of the FAN-STEPAF-SPKAF Method

In Figure 8, the results obtained from the simulation of the fuzzy system for syllable-based spike pattern speech recognition are shown. The spike pattern of the musical note SI

\tilde{y}_{SPKAFSI}(k)

is the syllable to be identified from the speech sample in Figure 8a. The FAN-STEPAF is applied to the speech sample of Figure 8a, yielding a pulse train proportional in duration and frequency to the amplitude of the voice sample. The SPKAF then generates the spike trains shown in Figure 8b.

Figure 8d shows the spike patterns of the SI syllable

\tilde{y}_{SPKAFSI}(k)

. With the fuzzy system of Figure 4, the results show that the recognition was performed successfully. Although the model recognized three consecutive SI syllables between seconds 14 and 15, there is only one Spanish speaker; three overlapping, consecutive SI syllables would require three Spanish speakers simultaneously. In Figure 8,

\tilde{y}_{SPKAFIDENTIFIED}

shows two additional spikes because the musical notes SI and LA coincide in time and magnitude around second 14.7.

5.2.2. Results of the Augmented Spiking Neuron Model Method

The magnitude of the melody with musical notes of solfeggio, normalized to [0,1], is shown in Figure 9a; the train of square pulses (37) is shown in Figure 9b; the train of spikes based on the Augmented Spiking Neuron Model, (41) and (42), is shown in Figure 9c; and the recognized pattern of the syllable SI obtained with (43)–(48) is shown in Figure 9d.

The results obtained with the ASNM method (Figure 9), after processing the amplitude of the speech sample with the STFT and (24)–(26), are satisfactory: a more accurate pattern recognition is obtained and, therefore, the error is reduced.

5.2.3. Results of the Augmented FAN-STEPAF-SPKAF Method

Figure 10a shows the normalized magnitude of the melody with musical notes of solfeggio; Figure 10b shows the square pulse train proportional in frequency to the amplitude of the normalized magnitude of the melody; Figure 10c shows the train of

\tilde{y}_{sawtooth}

(49) after signal processing with STFT of the normalized magnitude of the melody; Figure 10d shows the train of spikes based on the Augmented FAN-STEPAF-SPKAF model (55); and Figure 10e shows the recognized pattern of the syllable SI with (56)–(61).

The FAN-STEPAF-SPKAF model was augmented by including an input

\tilde{y}_{sawtooth}

and spike coefficients to obtain augmented spikes, successful pattern recognition, and much more precise results in Figure 10. Errors were further reduced.

5.3. MSE

Table 1 presents the MSE, and Figure 11 shows the spike pattern recognition error, for the FAN-STEPAF-SPKAF, Augmented Spiking Neuron Model, and Augmented FAN-STEPAF-SPKAF methods.

The comparison of the errors and the percentage of precision of coincidence of syllable pattern recognition shows that the Augmented Spiking Neuron Model method has the lowest percentage of precision of coincidence, and the FAN-STEPAF-SPKAF and Augmented FAN-STEPAF-SPKAF methods have the lowest MSE; therefore, in Figure 11 and Table 1, the superiority of the proposed Augmented FAN-STEPAF-SPKAF method is observed.

6. Conclusions and Discussion

The proposed fuzzy systems for the recognition of spatiotemporal spike patterns ANFIS and FAN-SPKAF were compared. The ANFIS method achieved good results but with noise. Stability analysis of the system model was performed with great success. The results obtained with the FAN-SPKAF method exceeded expectations because the noise decreased.

FAN-SPKAF produces a spike response, recognizing the spike pattern of the SI syllable of the reference sequence of the melody with solfeggio musical notes. To generate a spike response, we could also opt for the spike response model developed in [5,10,11]; however, the new SPKAF is a simpler model.

The fuzzy system with the FAN-STEPAF-SPKAF model for syllable-based speech recognition developed in this article produced very good results, considering optimal conditions, null ambient noise, and constant speech characteristics such as articulation, timbre, and tone. Comparing the Augmented Spiking Neuron Model and Augmented FAN-STEPAF-SPKAF methods, the latter model has the advantage that pattern recognition is performed with a minimum of MSE, although all methods require a reference pattern.

For future work, it is proposed to identify the corresponding axonal delays with the FAN and its application in control [29,30,35], unmanned aerial vehicles (UAV) [34,42], and modeling of seismic events [31], among other areas of knowledge [5,43,44,45,46,47,48,49,50,51,52,53,54].

Author Contributions

Conceptualization, A.M.E.R.-M., W.Y. and X.L.; Methodology, A.M.E.R.-M.; Software, A.M.E.R.-M.; Validation, A.M.E.R.-M.; Formal analysis, A.M.E.R.-M., W.Y. and X.L.; Investigation, A.M.E.R.-M., W.Y. and X.L.; Resources, A.M.E.R.-M., W.Y. and X.L.; Data curation, A.M.E.R.-M.; Writing—original draft, A.M.E.R.-M.; Writing—review & editing, A.M.E.R.-M.; Visualization, A.M.E.R.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data Availability Statement

Data sharing not applicable.

Acknowledgments

The Department of Automatic Control, the Computer Department, and the Department of Autonomous Air and Underwater Navigation Systems, French Mexican Laboratory of Informatics and Automatic Control, Mixed International Unit (LAFMIA UMI for its acronym in Spanish) of the Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-IPN for its acronym in Spanish), and the Postdoctoral Program of the National Council of Science and Technology (CONACYT for its acronym in Spanish).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

ANFIS	adaptive network-based fuzzy inference system
FAN	fuzzy adaptive neuron
FAN-SPKAF	fuzzy adaptive neuron with spike activation function
FAN-STEPAF-SPKAF	fuzzy adaptive neuron–step activation function–spike activation function
LIF	leaky integrate-and-fire
MSE	mean squared error
NARMA	nonlinear autoregressive-moving average
SAF	sigmoid activation function
SNN	spiking neural networks
SPKAF	spike activation function
STEPAF	step activation function
STFT	short-time Fourier transform
UAV	unmanned aerial vehicles

Nomenclature

$a, b, c$	real numbers
$c_{1} = 0.09, c_{2} = 0.5$	real numbers
$c_{i}^{j}$	spike coefficient
$c_{3}$	augmented constant
$d(t)$	unmodeled dynamic
$e(k) \in R^{m \times 1}$	error of identification or modeling error
$e_{spikeANFIS}(k)$	modeling error
$e_{SPKAFIDENTIFIED}(k)$	modeling error
$e_{spikeFAN-SPKAFIDENTIFIED}(k)$	modeling error
$Φ[\cdot]$	nonlinear function
$γ(k)$	learning factor, $0 < γ \leq 1$
$g_{1}$	gain
$γ(k)$	learning factor, $0 < γ \leq 1$
$η$	learning rate, $η = 1$
$H(k)$	inputs of the nonlinear plant
$H_{d}(k)$	inputs of the unmodeled dynamic
$h_{in}(k)$	input
$k$	time
$k_{i}^{j}$	$j$th input spike time from the $i$th afferent
$k_{s}^{j}$	time of the $j$th output spike
$κ_{m}$	time constant
$κ_{s}$	time constant of the synaptic currents
$M$	number of $\tilde{y}_{ASNMSI}$ or $\tilde{y}_{ASPKAFSI}$ samples
$MSE$	mean squared error
$N$	number of synaptic afferents
$n = 0, 1, \dots, N - 1$	frequency index
$p_{j}, q_{j}, r_{j}, s_{j}$	adaptive parameters
$P$	target pattern
$q_{0}$	constant parameter that normalizes the peak of the kernel to unity
$Q$	number of $\tilde{y}_{ASNM}$ or $\tilde{y}_{ASPKAF}$ samples
$r$	number, $r \in R$
$ϑ$	firing threshold, $ϑ = 0$
$\tilde{V}_{out\,γ}(k)$	operator $\min$ with $γ(k)$
$V_{threshold\,j}(k)$	threshold
$w_{i}$	synaptic weight
$w_{in\,j}(k)$	synaptic weights, $p_{j}, q_{j}, r_{j}, s_{j}$
$\bar{ω}_{j}$	normalized firing strength
$w(k)$	weights, proposed fixed weights or $w(k + 1) = w(k) + Δw(k)$, $γ(k + 1) = γ(k) + Δγ(k)$
$W(k)$	weights of the proposed model
$W_{d}^{*}$	weights of the unmodeled dynamic
$W^{*}$	unknown weights to minimize unmodeled dynamic $Φ[H_{d}(k), W_{d}^{*}]$
$w[m - k]$	window sequence to select a finite-length (local) segment from and possibly to reduce the spectral leakage
$x_{j}(k - Δk_{jl\,axon\,delay})$	dendrite inputs
$x[m]$	sliding sequence
$Y(k)$	output of the nonlinear plant
$\hat{Y}(k)$	output of the proposed model
$\tilde{y}_{SAF\,j}(k)$	SAF
$\tilde{y}_{SPKAF}(k)$	SPKAF
$y_{spikeref}(k)$	reference
$\tilde{y}_{SPKAFSI}(k)$	$y_{spikeref}(k)$
$\tilde{y}_{SPKAFIDENTIFIED}(k)$	identified spikes voice pattern
$y_{spikeANFISref}(k)$	$y_{spikeref}(k)$
$\tilde{y}_{spikeANFIS}(k)$	identified spikes voice pattern
$y_{spikeFAN-SPKAFref}(k)$	$y_{spikeref}(k)$
$\tilde{y}_{spikeFAN-SPKAFIDENTIFIED}(k)$	SPKAF, identified spikes voice pattern
$\{\tilde{y}_{ASNMSI}(k_{1}) \dots \tilde{y}_{ASNMSI}(k_{M})\}$	spikes voice pattern of the syllable or musical note SI of solfeggio in Spanish
$\{\tilde{y}_{ASPKAFSI}(k_{1}) \dots \tilde{y}_{ASPKAFSI}(k_{M})\}$	spikes voice pattern of the syllable or musical note SI of solfeggio in Spanish
$\tilde{y}_{ASNMIDENTIFIED}(k_{i})$	identified spikes voice pattern
$\tilde{y}_{ASPKAFIDENTIFIED}(k_{i})$	identified spikes voice pattern
$y_{triangular}(k)$	triangular signal with a peak amplitude of 0.1 with an added constant of 0.8 and period $K_{triangular} = \frac{K_{step}}{2}$
$z_{in\,j}(k)$	dendrite inputs

References

McCulloch, W.S.; Pitts, W.H. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
Hodgkin, A.L.; Huxley, A.F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 1952, 117, 500–544.
Gerstner, W.; Kistler, W.M. Spiking Neuron Models; Cambridge University Press: Cambridge, UK, 2002; 480p, ISBN 978-0521890793.
Izhikevich, E.M. Simple model of spiking neurons. IEEE Trans. Neural Netw. 2003, 14, 1569–1572.
Ramírez-Mendoza, A.M.E. Modeling the Spike Response for Adaptive Fuzzy Spiking Neurons with Application to a Fuzzy XOR. Comput. Model. Eng. Sci. 2018, 115, 295–311.
Zadeh, L.A. Theory of Fuzzy Sets. In Encyclopedia of Computer Science and Technology; Marcel Dekker: New York City, NY, USA, 1977.
Gupta, M.M.; Qi, J. On Fuzzy Neuron Models. In Fuzzy Logic for the Management of Uncertainty; Zadeh, L.A., Kacprzyk, J., Eds.; Wiley-Interscience: Hoboken, NJ, USA, 1992; pp. 479–491. ISBN 0-471-54799-9.
Gupta, M.M. Fuzzy logic, neural networks and virtual cognitive systems. In Second International Symposium on Uncertainty Modeling and Analysis; IEEE: Piscataway, NJ, USA, 1993; pp. 90–97.
Gupta, M.M.; Rao, D.H. On the principles of fuzzy neural networks. In Fuzzy Set Syst; Elsevier Science B. V.: Amsterdam, The Netherlands, 1994; Volume 61, pp. 1–18.
Ramírez, A.; Pérez, J.L. A fuzzy Gupta integrator neuron model with spikes response and axonal delay. Adv. Artif. Intell. Eng. Cybern. 2002, 9, 12–16.
Ramírez-Mendoza, A.; Pérez-Silva, J.L.; Lara-Rosano, F. Electronic Implementation of a Fuzzy Neuron Model with a Gupta Integrator. JART 2011, 9, 380–393.
Zhang, L. Building logistic spiking neuron models using analytical approach. IEEE Access 2019, 7, 80443–80452.
Kubota, N.; Nishida, K. The Role of Spiking Neurons for Visual Perception of a Partner Robot. In Proceedings of the International Conference on Fuzzy Systems Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada, 16–21 July 2006; pp. 122–129, IEEE 0-7803-9489-5/06.
Yu, Q.; Tang, H.; Tan, K.C.; Li, H. Rapid feedforward computation by temporal encoding and learning with spiking neurons. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 1539–1552.
Tapson, J.C.; Cohen, G.K.; Afshar, S.; Stiefel, K.M.; Buskila, Y.; Wang, R.M.; Hamilton, T.J.; Schaik, A.V. Synthesis of neural networks for spatio-temporal spike pattern recognition and processing. Front. Neurosci. Neuromorphic Eng. 2013, 7, 153.
Susi, G.; Toro, L.A.; Canuet, L.; López, M.E.; Maestú, F.; Mirasso, C.R.; Pereda, E.A. Neuro-Inspired System for Online Learning and Recognition of Parallel Spike Trains, Based on Spike Latency, and Heterosynaptic STDP. Front. Neurosci. 2018, 12, 780.
Cheng, L.; Liu, Y.; Hou, Z.-G.; Tan, M.; Du, D.; Fei, M. A rapid spiking neural network approach with an application on hand gesture recognition. IEEE Trans. Cogn. Dev. Syst. 2019, 13, 151–161.
Shi, M.; Zhang, T.; Zeng, Y. A Curiosity-Based Learning Method for Spiking Neural Networks. Front. Comput. Neurosci. 2020, 14, 7.
Lee, C.; Sarwar, S.S.; Panda, P.; Srinivasan, G.; Roy, K. Enabling Spike-Based Backpropagation for Training Deep Neural Network Architectures. Front. Neurosci. 2020, 14, 119.
Shimaila, N.K.V. Generation of Future Image Frames using Adaptive Network Based Fuzzy Inference System on Spatiotemporal Framework. In Proceedings of the IEEE Applied Imagery Pattern Recognition Workshop (AIPR), Washington, DC, USA, 9–11 October 2012; pp. 1–8.
Kasabov, N. Evolving Spiking Neural Networks and Neurogenetic Systems for Spatio- and Spectro- Temporal Data Modelling and Pattern Recognition. In Proceedings of the IEEE World Congress on Computational Intelligence, Advances in Computational Intelligence, LNCS 7311, Brisbane, QLD, Australia, 10–15 June 2012; pp. 234–260.
Yu, Q.; Goh, S.K.; Tang, H.; Tan, K.C. Application of Precise-Spike-Driven Rule in Spiking Neural Networks for Optical Character Recognition. In Proceedings of the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems—Volume 2; Proceedings in Adaptation, Learning and Optimization; Springer: Berlin/Heidelberg, Germany, 2015; Volume 2, pp. 65–75. [Google Scholar] [CrossRef]
Dhilipan, A.; Preethi, J. Pattern Recognition using Spiking Neural Networks with Temporal Encoding and Learning. In Proceedings of the IEEE Sponsored 9th International Conference on Intelligent Systems and Control (ISCO), Coimbatore, India, 9–10 January 2015; pp. 1–5. [Google Scholar]
Zhang, Z.; Liu, Q. Spike-Event-Driven Deep Spiking Neural Network with Temporal Encoding. IEEE Signal Process. Lett. 2021, 28, 484–488. [Google Scholar] [CrossRef]
Yu, Q.; Song, S.; Ma, C.; Pan, L.; Tan, K.C. Synaptic Learning with Augmented Spikes. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 1134–1146. [Google Scholar] [CrossRef]
Zhao, J.; Bose, B.K. Evaluation of membership functions for fuzzy logic controlled induction motor drive. In Proceedings of the IEEE 2002 28th Annual Conference of the Industrial Electronics Society, IECON 02, Sevilla, Spain, 5–8 November 2002; pp. 229–234. [Google Scholar] [CrossRef]
Basterretxea, K.; Tarela, J.M.; Del Campo, I. Digital Gaussian membership function circuit for neuro-fuzzy hardware. Electron. Lett. 2006, 42, 44–46. [Google Scholar] [CrossRef]
Xie, W.; Sang, S.; Lam, H.-K.; Zhang, J. A polynomial-membership-function approach for stability analysis of fuzzy systems. IEEE Trans. Fuzzy Syst. 2020, 29, 2077–2087. [Google Scholar] [CrossRef]
Ramírez-Mendoza, A.M.E.; Yu, W. A novel learning algorithm for Fuzzy Adaptive Neural Networks: Application to the neuro-fuzzy design of control law for a PID controller. submitted.
Mendoza, A.M.E.R.; Yu, W. Fuzzy Adaptive Control Law for Trajectory Tracking Based on a Fuzzy Adaptive Neural PID Controller of a Multi-rotor Unmanned Aerial Vehicle. Int. J. Control Autom. Syst. 2023, 21, 658–670. [Google Scholar] [CrossRef]
Ramírez-Mendoza, A.M.E.; Yu, W.; Li, X. A novel fuzzy system with adaptive neurons for earthquake modeling. IEEE Access 2020, 8, 101369–101376. [Google Scholar] [CrossRef]
Ramírez-Mendoza, A.M.E.; Yu, W.; Li, X. Fuzzy Identification of Systems based on Adaptive Neurons. J. Intell. Fuzzy Syst. 2021, 40, 10767–10779. [Google Scholar] [CrossRef]
Ramírez-Mendoza, A. Study of the response of the connection of Fuzzy Adaptive Spiking Neurons with self-synapse in each single neuron. In Proceedings of the 11th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE), Mexico City, Mexico, 9–11 November 2014; pp. 1–6, ISBN 978-1-4799-6228-0. [Google Scholar]
Ramírez-Mendoza, A.; Covarrubias-Fabela, J.; Amezquita-Brooks, L.; Hernández-Alcantara, D. Parameter Identification using Fuzzy Neurons: Application to Drones and Induction Motors. DYNA 2018, 93, 75–81. [Google Scholar] [CrossRef]
Ramirez-Mendoza, A.M.E.; Covarrubias-Fabela, J.R.; Amezquita-Brooks, L.A.; Garcia-Salazar, O.; Yu, W. Fuzzy Adaptive Neurons applied to the identification of parameters and trajectory tracking control of a multi-rotor Unmanned Aerial Vehicle based on experimental aerodynamic data. J. Intell. Robot. Syst. 2020, 100, 647–665. [Google Scholar] [CrossRef]
Fujimoto, J.-I.; Nakatani, T.; Yoneyama, M. Speaker-Independent Word Recognition Using Fuzzy Pattern Matching. Fuzzy Sets Syst. 1989, 32, 181–191. [Google Scholar] [CrossRef]
Ray, K.S.; Ghoshal, J. Approximate reasoning approach to pattern recognition. Fuzzy Sets Syst. 1996, 77, 125–150. [Google Scholar] [CrossRef]
Klasmeyer, G. The perceptual importance of selected voice quality parameters. In Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany, 21–24 April 1997; Volume 3, pp. 1615–1618. [Google Scholar] [CrossRef]
Mahar, J.A.; Qadir, G.; Abbass, H. Perception of syllables pitch contour in Sindhi language. In Proceedings of the 2009 International Conference on Natural Language Processing and Knowledge Engineering, Dalian, China, 24–27 September 2009; pp. 1–6. [Google Scholar] [CrossRef]
Aihara, R.; Takashima, R.; Takiguchi, T.; Ariki, Y. Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 8037–8040. [Google Scholar] [CrossRef]
Chleboun, J. A new membership function approach to uncertain functions. Fuzzy Sets Syst. 2020, 387, 68–80. [Google Scholar] [CrossRef]
Amezquita-Brooks, L.A.; Hernández-Alcántara, D.; Santana-Delgado, C.; Covarrubias-Fabela, J.R.; García-Salazar, O.; Ramírez-Mendoza, A.M.E. Improved model for micro-UAV propulsion systems: Characterization and applications. IEEE Trans. Aerosp. Electron. Syst. 2019, 56, 2174–2197. [Google Scholar] [CrossRef]
Li, X.; Yu, W.; Lara-Rosano, F. Dynamic Knowledge Inference and Learning under Adaptive Fuzzy Petri Net Framework. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2000, 30, 442–450. [Google Scholar]
Yu, W.; Li, X. Fuzzy identification using fuzzy neural networks with stable learning algorithms. IEEE Trans. Fuzzy Syst. 2004, 12, 411–420. [Google Scholar] [CrossRef]
Medina-Santiago, A.; Azucena, A.D.P.; Gómez-Zea, J.M.; Jesús-Magaña, J.A.; Valdez-Ramos, M.D.L.L.; Sosa-Silva, E.; Falcón-Pérez, F. Adaptive Model IoT for Monitoring in Data Centers. IEEE Access 2020, 8, 5622–5634. [Google Scholar] [CrossRef]
Wang, J.; Xu, C.; Yang, Z.; Zhang, J.; Li, X. Deformable Convolutional Networks for Efficient Mixed-Type Wafer Defect Pattern Recognition. IEEE Trans. Semicond. Manuf. 2020, 33, 587–596. [Google Scholar] [CrossRef]
Lele, A.; Fang, Y.; Ting, J.; Raychowdhury, A. An End-to-End Spiking Neural Network Platform for Edge Robotics: From Event-Cameras to Central Pattern Generation. IEEE Trans. Cogn. Dev. Syst. 2022, 14, 1092–1103. [Google Scholar] [CrossRef]
Ramírez-Mendoza, A.M.E.; Yu, W. Simplified model of the propulsion system for a PVTOL with a disturbance and estimate of power efficiency. DYNA 2022, 97, 470–474. [Google Scholar] [CrossRef] [PubMed]
Muñoz, F.; Cervantes-Rojas, J.S.; Valdovinos, J.M.; Sandre-Hernández, O.; Salazar, S.; Romero, H. Dynamic Neural Network-Based Adaptive Tracking Control for an Autonomous Underwater Vehicle Subject to Modeling and Parametric Uncertainties. Appl. Sci. 2021, 11, 2797. [Google Scholar] [CrossRef]
Steccanella, L.; Bloisi, D.D.; Castellini, A.; Farinelli, A. Waterline and obstacle detection in images from low-cost autonomous boats for environmental monitoring. Robot. Auton. Syst. 2020, 124, 103346. [Google Scholar] [CrossRef]
Bergies, S.; Su, S.-F.; Elsisi, M. Model Predictive Paradigm with Low Computational Burden Based on Dandelion Optimizer for Autonomous Vehicle Considering Vision System Uncertainty. Mathematics 2022, 10, 4539. [Google Scholar] [CrossRef]
Quah, T.K.N.; Tay, Y.W.D.; Lim, J.H.; Tan, M.J.; Wong, T.N.; Li, K.H.H. Concrete 3D Printing: Process Parameters for Process Control, Monitoring and Diagnosis in Automation and Construction. Mathematics 2023, 11, 1499. [Google Scholar] [CrossRef]
Helander, E.; Virtanen, T.; Nurminen, J.; Gabbouj, M. Voice Conversion Using Partial Least Squares Regression. IEEE Trans. Audio Speech Lang. Process. 2010, 18, 912–921. [Google Scholar] [CrossRef]
Helander, E.; Silén, H.; Virtanen, T.; Gabbouj, M. Voice Conversion Using Dynamic Kernel Partial Least Squares Regression. IEEE Trans. Audio Speech Lang. Process. 2012, 20, 806–817. [Google Scholar] [CrossRef]

Figure 1. Spike membership function $Y_{spike} = \tilde{y}_{SPKAF}(k)$.

Figure 2. ANFIS fuzzy system for the recognition of spatiotemporal spike patterns.

Figure 3. Fuzzy system for the recognition of spatiotemporal spike patterns.

Figure 4. Fuzzy system with FAN-SPKAF for syllable-based speech recognition.

Figure 5. Reference sequences for the recognition of spatiotemporal spike patterns.

Figure 6. Output of the MIN operator, the spike response of the ANFIS and FAN-SPKAF methods.

Figure 7. (a–e) Noisy input vs. input spike sequences for the recognition of spatiotemporal spike patterns. (f) Output or spike response from the ANFIS and FAN-SPKAF fuzzy systems.

Figure 8. (a) $X_{MAXmag}[k]$ normalized magnitude. (b) $\tilde{y}_{step}(k)$. (c) $\tilde{y}_{SPKAF}(k)$. (d) $\tilde{y}_{SPKAF\,IDENTIFIED}(k)$.

Figure 9. (a) $X_{MAXmag}[k]$ normalized magnitude. (b) $\tilde{y}_{in}(k)$. (c) $\tilde{y}_{ASNM}(k)$. (d) $\tilde{y}_{ASNM\,IDENTIFIED}(k)$.

Figure 10. (a) $X_{MAXmag}[k]$ normalized magnitude. (b) $\tilde{y}_{step}(k)$. (c) $\tilde{y}_{sawtooth}(k)$. (d) $\tilde{y}_{ASPKAF}(k)$. (e) $\tilde{y}_{ASPKAF\,IDENTIFIED}(k)$.

Figure 11. (a) $Error_{SPKAF\,IDENTIFIED}(k)$. (b) $Error_{ASNM\,IDENTIFIED}(k)$. (c) $Error_{ASPKAF\,IDENTIFIED}(k)$.

Table 1. MSE and precision (%) of the three models compared.

MODEL	MSE	Precision (%)
FAN-STEPAF-SPKAF	1.0823 × 10⁻⁴	99.99
Augmented Spiking Neuron Model	3.6915 × 10⁻⁴	99.96
Augmented FAN-STEPAF-SPKAF	1.6543 × 10⁻⁷	99.999983457
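
The precision values in Table 1 are numerically consistent with the relation precision (%) = 100 × (1 − MSE). This relation is inferred from the tabulated numbers and is not stated in this part of the article, so the following minimal sketch treats it as an assumption. The function names `mse` and `precision_percent` are illustrative and do not come from the original work.

```python
def mse(reference, estimate):
    """Mean squared error between two equal-length sequences."""
    if len(reference) != len(estimate):
        raise ValueError("sequences must have the same length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def precision_percent(mse_value):
    """Assumed relation (inferred from Table 1): 100 * (1 - MSE)."""
    return 100.0 * (1.0 - mse_value)


# MSE values copied from Table 1.
table_mse = {
    "FAN-STEPAF-SPKAF": 1.0823e-4,
    "Augmented Spiking Neuron Model": 3.6915e-4,
    "Augmented FAN-STEPAF-SPKAF": 1.6543e-7,
}

# Precision recomputed under the assumed relation.
precision = {model: precision_percent(value) for model, value in table_mse.items()}

for model, value in precision.items():
    print(f"{model}: {value:.9f} %")
```

Under this assumption, the first two rows match the table once rounded to two decimals, and the third row matches the nine decimals reported.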

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ramírez-Mendoza, A.M.E.; Yu, W.; Li, X. A New Spike Membership Function for the Recognition and Processing of Spatiotemporal Spike Patterns: Syllable-Based Speech Recognition Application. Mathematics 2023, 11, 2525. https://doi.org/10.3390/math11112525
