3.1. Possibilistic Fuzzy C-Means Method
Let us consider clustering of a given dataset $X=\{x_1, x_2, \ldots, x_N\}$ (where $x_i \in \mathbb{R}^p$ is the $i$th data instance and $p$ is the number of features describing instances) as a partitioning of $X$ into $1<c<N$ subgroups such that each subgroup represents a "natural" substructure in $X$. In [8], this partition of $X$, denoted as $M$, is a set of $2\times c\times N$ values that can be conveniently arrayed as two matrices $U=(u_{ij})_{c\times N}$ and $T=(t_{ij})_{c\times N}$:
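The set of possibilistic fuzzy partitions (Equation (1)) can be reconstructed from the standard PFCM formulation in [8] as follows; the exact form of the constraints is a reconstruction, not a quotation:

```latex
M = \left\{ (U, T) : 0 \le u_{ij}, t_{ij} \le 1 \;\; \forall i,j;\;\; \sum_{i=1}^{c} u_{ij} = 1 \;\; \forall j;\;\; 0 < \sum_{j=1}^{N} u_{ij} < N \;\; \forall i \right\} \tag{1}
```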
Equation (1) defines the set of possibilistic fuzzy partitions of $X$. Here, $U$ is the fuzzy matrix, while $T$ is the possibilistic one. Both matrices are mathematically identical, having entries between 0 and 1; they differ only in how their entries are interpreted. For the fuzzy matrix $U$, each entry $u_{ij}$ is taken as the membership of $x_j$ in the $i$th cluster of $M$, while for a probabilistic matrix each entry $t_{ij}$ would usually be the (posterior) probability $p(i \mid x_j)$ that, given $x_j$, it came from cluster $i$. It should be noted that here $t_{ij}$ is a measure that shows how typical a given instance is of the $i$th cluster.
In [71], the authors explained why both possibilistic and fuzzy matrices should be used for clustering, and in [8] they also listed the disadvantages of both the Fuzzy C-Means (FCM) [72] and Possibilistic C-Means (PCM) [73] methods. Originally, there was a constraint on the typicality values (the sum of the typicality values over all data points in a particular cluster was set to 1). In [8], the authors relaxed this constraint: in their algorithm, the row sum of typicality values is no longer required to equal 1, while the column constraint on the membership values is retained. This leads to the following optimization problem:
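Following the standard PFCM objective from [8] (a reconstruction in the notation defined below, not a quotation of the original display), the problem reads:

```latex
\min_{U,\,T,\,V}\;\; \sum_{i=1}^{c}\sum_{j=1}^{N}\left(a\,u_{ij}^{m} + b\,t_{ij}^{\eta}\right)\left\| x_j - v_i \right\|^{2} \;+\; \sum_{i=1}^{c}\gamma_i \sum_{j=1}^{N}\left(1 - t_{ij}\right)^{\eta}
\quad \text{s.t.}\;\; \sum_{i=1}^{c} u_{ij} = 1 \;\; \forall j \tag{2}
```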
where $V=\{v_i,\; i=\overline{1,c}\}$ is the set of cluster centres, $a>0$, $b>0$, $m>1$ and $\eta>1$ are the algorithm's parameters chosen by the end-user, $\|\cdot\|$ is the standard Euclidean norm, and $\gamma_i$ ($i=\overline{1,c}$) are also user-defined constants. In (2),
${u}_{ij}$,
$i=\overline{1,c}$,
$j=\overline{1,N}$, have the same meaning as membership values in FCM, and
${t}_{ij}$,
$i=\overline{1,c}$,
$j=\overline{1,N}$, have the same interpretation as typicality values in PCM. The PFCM approach is iterative, and
U,
T and
V are updated over each iteration according to the rules presented in [
8].
3.2. Possibilistic Fuzzy Multi-Class Novelty Detector
The PFuzzND approach [5] is used for the dynamic novelty detection learning scenario. It is an improved version of the Fuzzy Multi-class Novelty Detector for data streams [74], which is a fuzzy generalization of the offline–online framework presented in [11] for the MINAS approach. All the mentioned algorithms can be divided into two stages. During the first stage, called offline, the model is built using a labelled training dataset. The second stage, during which new classes may emerge and disappear and the old classes may also drift, is referred to as the online or prediction stage.
In the offline step of the PFuzzND algorithm, a decision model is learned from a labelled set of examples using the PFCM clustering method. To be more specific, for each known class a given number of clusters, ${k}_{class}$, is constructed; this number is a user-defined parameter that can only increase (differently for each class) during the online stage until it reaches the maximum possible value ${c}_{max}$. It should also be noted that ${k}_{class}$ is the same for all known classes during the offline phase, and each class cannot be divided into more than ${c}_{max}\ge {k}_{class}$ clusters. Furthermore, the maximum number of clusters, ${c}_{max}$, that can be created for each class (normal or novel) is another user-defined parameter, which does not change during operation of the algorithm. Finally, it is assumed that, generally, only a portion ${p}_{offline}$ of the instances from the dataset is labelled, and, thus, only these instances are used during the offline stage.
Thus, let us denote the number of classes known at the offline stage as ${L}_{kn}$; then, at the beginning of the online stage, the decision model is defined as the set of ${L}_{kn}\times {k}_{class}$ clusters found for all ${L}_{kn}$ different known classes. Additionally, an empty external set, called the short memory, is created at the end of the offline phase. Examples labelled during the online stage as unknown are stored in the short memory for a time period ${t}_{s}$; after this time limit, these instances are removed from the mentioned set. The latter can occur if it is established that they do not belong to the existing classes and do not form novel classes either. Thus, these instances are considered anomalies.
Moreover, there are four additional user-defined parameters for the online phase of the PFuzzND approach:
The minimum number of instances (denoted as ${N}_{sm}$) in the short memory to start the novelty detection procedure.
The initial threshold, ${\theta}_{init}$, to launch the classification process of the unlabelled instances.
Two adaptation thresholds, ${\theta}_{class}$ and ${\theta}_{adapt}$, used during the classification step.
During the online step, first, for each new instance,
${x}_{j}$, its membership and typicality values related to all clusters, known at the moment, are calculated. Typicality values have more influence here as they are used to determine whether the instance
${x}_{j}$ will be labelled by one of the existing classes or will be marked as unknown. For this purpose, the highest typicality value of the
jth instance and the corresponding existing class,
${y}_{c}$ (
$c=\overline{1,{L}_{kn}}$), are determined. Subsequently, the maximum typicality values are found for all instances considered before the new one arrives (and these instances should belong to the class
${y}_{c}$), and the mean value of these typicalities is calculated. Let us denote this mean value as
${m}_{ct}$, where
t is a timestamp, which shows how many instances were processed after the offline stage, and, therefore, is equal to 1 at the beginning of the online stage. If the highest typicality of the new instance,
${x}_{j}$, is greater than the difference between the obtained mean value
${m}_{ct}$ and the adaptation threshold
${\theta}_{adapt}$, then the instance
${x}_{j}$ belongs to the class
${y}_{c}$ [
5]. It should be noted that the highest typicality of the first instance processed after the offline phase is compared to the initial threshold
${\theta}_{init}$.
If the instance
${x}_{j}$ is labelled, then its typicality value is used to update
${m}_{ct}$, while the PFCM approach is used to update the clusters of the class
${y}_{c}$. Otherwise, the highest typicality of the new instance
${x}_{j}$ is compared to the difference between
${m}_{ct}$ and the adaptation threshold
${\theta}_{class}$. If the highest typicality is greater than the mentioned difference, then the instance
${x}_{j}$ belongs to the class
${y}_{c}$, but a new cluster with
${x}_{j}$ as its centroid should be created [
5]. If instances belonging to the class
${y}_{c}$ are already divided into
${c}_{max}$ clusters, then the oldest cluster is removed and the new cluster with
${x}_{j}$ as its centre is generated in its place.
If neither condition is met, then the instance ${x}_{j}$ is marked as unknown and stored in the short memory until the novelty detection step is executed. The latter happens when the number of instances marked as unknown reaches ${N}_{sm}$. In this case, firstly, the PFCM approach is applied to all the instances in the short memory and the predetermined number ${k}_{short}$ of clusters is constructed. After that, for each generated cluster its fuzzy silhouette [75] is calculated, and if the obtained value is greater than 0 and the considered cluster is not empty, then this cluster is considered valid. All validated clusters of the short memory represent new patterns, or novelties.
The next step of the novelty detection procedure consists of calculating the similarity between these validated clusters and the already existing ones (clusters belonging to the known classes), which is achieved by using the fuzzy clustering similarity metric introduced in [9]. Thus, the known cluster that is most similar to the examined cluster from the short memory is determined. If the value of the mentioned metric for these two clusters is greater than $\varphi$ (which is another parameter of the PFuzzND approach), then all instances from the examined cluster are labelled in the same way as the instances of the considered known cluster. Consequently, the clusters of the corresponding class are updated by the PFCM algorithm. In this case, the known class has evolved, and we refer to this as concept drift. Otherwise, a new class is created; therefore, ${L}_{kn}$ is incremented by one, and the instances from the examined cluster are labelled as belonging to the new class.
If one of the short memory clusters is not validated, then it is discarded and its instances remain in the short memory until the model executes the novelty detection procedure again or decides to remove them all, which can happen if these instances stay in the short memory for ${t}_{s}$ iterations.
3.3. Evolutionary Algorithms
Differential Evolution (DE) is a well-known population-based algorithm introduced in [76] for real-valued optimization problems. DE maintains a population of solutions (also called individuals or target vectors) during the search process, and the key idea of this algorithm is the use of difference vectors calculated between the individuals in the current population. These difference vectors are applied to the members of the population to generate mutant solutions. DE contains only three parameters, namely the population size $NP$, the scaling factor for mutation $F$ and the crossover probability $Cr$.
Generally, the initialization step is performed by randomly generating $NP$ points ${v}_{ij}$, $i=\overline{1,NP}$, $j=\overline{1,P}$, in the search space with a uniform distribution within the given bounds. Here, $P$ is the dimensionality of the search space. Mutation, crossover and selection operators are then iteratively applied to the generated population. Originally, DE used the rand/1 mutation strategy [76]; however, most recent DE approaches, including LSHADE [7], often use the current-to-pbest/1 strategy, which was initially proposed for the JADE algorithm [77]. The current-to-pbest strategy works as follows:
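The strategy, reconstructed here from its standard form in [77] using the notation of this section, generates the mutant vector $\tilde{v}_i$ as:

```latex
\tilde{v}_i = v_i + F\left(v_{pbest} - v_i\right) + F\left(v_k - v_l\right)
```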
where $pbest$ is an index of one of the best individuals (the quality of individuals is estimated by their fitness values), $k$ and $l$ are randomly chosen indexes from the population, and the scaling factor $F$ is usually in the range $(0,1]$. The indexes $pbest$, $k$ and $l$ are generated in such a way that they are mutually different and not equal to $i$.
The crossover operation is performed after mutation to generate a trial vector by combining the information contained in the target and mutant vectors. Specifically, during the crossover, the trial vector ${\widehat{v}}_{i}$, $i=\overline{1,NP}$, receives randomly chosen components from the mutant vector ${\tilde{v}}_{i}$ with probability $Cr\in [0,1]$ as follows:
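One standard way to write the binomial crossover rule, reconstructed so as to be consistent with the description in the text, is:

```latex
\widehat{v}_{ij} =
\begin{cases}
\tilde{v}_{ij}, & \text{if } rand_j(0,1) < Cr \;\text{ or }\; j = jrand,\\[2pt]
v_{ij}, & \text{otherwise,}
\end{cases}
\qquad j=\overline{1,P}
```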
where $jrand$ is a randomly chosen index from $\{1,\ldots,P\}$, which is required to ensure that the trial vector is different from the target vector, thus avoiding unnecessary fitness calculations.
After generating the trial vector ${\widehat{v}}_{i}$, a bound constraint handling method is applied. Finally, the selection step is performed after calculating the fitness function value $f({\widehat{v}}_{i})$ in the following way: if the trial vector ${\widehat{v}}_{i}$ outperforms or is equal to the parent ${v}_{i}$ in terms of fitness, then the target vector ${v}_{i}$ in the population is replaced by the trial vector ${\widehat{v}}_{i}$.
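The DE loop described above (current-to-pbest/1 mutation, binomial crossover and greedy selection) can be sketched as follows. This is a minimal illustration on a sphere function, not the authors' implementation; the parameter values and the test function are assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def sphere(x):
    # Simple test fitness: sum of squared components (minimum 0 at the origin).
    return float(np.sum(x ** 2))

def de_iteration(pop, fit, F=0.5, Cr=0.9, p=0.2):
    """One DE generation: current-to-pbest/1 mutation, binomial crossover, greedy selection."""
    NP, P = pop.shape
    n_best = max(1, int(p * NP))
    best_idx = np.argsort(fit)[:n_best]          # indexes of the p*NP best individuals
    new_pop, new_fit = pop.copy(), fit.copy()
    for i in range(NP):
        # pbest, k, l are mutually different and different from i
        pbest = int(rng.choice(best_idx))
        candidates = [j for j in range(NP) if j not in (i, pbest)]
        k, l = rng.choice(candidates, size=2, replace=False)
        mutant = pop[i] + F * (pop[pbest] - pop[i]) + F * (pop[k] - pop[l])
        # binomial crossover: take mutant components with probability Cr,
        # forcing at least one component (jrand) to come from the mutant
        jrand = rng.integers(P)
        mask = rng.random(P) < Cr
        mask[jrand] = True
        trial = np.where(mask, mutant, pop[i])
        # greedy selection: keep the trial if it is not worse than the parent
        f_trial = sphere(trial)
        if f_trial <= fit[i]:
            new_pop[i], new_fit[i] = trial, f_trial
    return new_pop, new_fit

NP, P = 20, 5
pop = rng.uniform(-5, 5, size=(NP, P))
fit = np.array([sphere(x) for x in pop])
for _ in range(50):
    pop, fit = de_iteration(pop, fit)
print(round(float(fit.min()), 3))
```

Because selection is greedy, the best fitness in the population is non-increasing over generations.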
3.3.1. LSHADE Algorithm
The LSHADE approach is a modification of the DE algorithm, first introduced in [7]. As previously mentioned, DE has three main parameters, and choosing appropriate values for them is a difficult task, as they have an impact on the algorithm's speed and efficiency. The LSHADE algorithm uses a set of $H=5$ historical memory cells containing values $({M}_{F,q}, {M}_{Cr,q})$ to generate new parameter values $F$ and $Cr$ for every mutation and crossover procedure. These parameter values are sampled using a randomly chosen memory index $q\in [1,H]$ as follows:
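In its standard SHADE form (a reconstruction consistent with the description that follows), the sampling is:

```latex
F = randc\left(M_{F,q},\; 0.1\right), \qquad Cr = randn\left(M_{Cr,q},\; 0.1\right)
```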
where $randc({M}_{F,q},0.1)$ and $randn({M}_{Cr,q},0.1)$ are random numbers generated by the Cauchy and normal distributions, respectively. The $Cr$ value is set to 0 or 1 if it falls outside the range $[0,1]$, and the $F$ value is set to 1 if $F>1$, or is generated again if $F<0$. The values of $F$ and $Cr$ which caused an improvement to an individual are saved into two arrays, ${S}_{F}$ and ${S}_{Cr}$, together with the difference of the fitness values, $\Delta f$.
The memory cell with index
h, incrementing from 1 to
H every generation, is updated as follows:
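The update is based on the weighted Lehmer mean; the following form is reconstructed from the standard LSHADE formulation in [7], with weights $w_s$ proportional to the fitness improvements $\Delta f_s$:

```latex
\operatorname{mean}_{wL}(S) = \frac{\sum_{s=1}^{|S|} w_s S_s^2}{\sum_{s=1}^{|S|} w_s S_s}, \qquad w_s = \frac{\Delta f_s}{\sum_{r=1}^{|S|} \Delta f_r}
```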
where
S is either
${S}_{Cr}$ or
${S}_{F}$. Then, the previous parameter values are used to set the new ones in the following way:
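Using the update parameter $c$ and generation number $g$ described in the next sentence, a reconstruction of the memory update consistent with the text is:

```latex
M_{F,h}^{g+1} = c\, M_{F,h}^{g} + (1-c)\operatorname{mean}_{wL}(S_F), \qquad
M_{Cr,h}^{g+1} = c\, M_{Cr,h}^{g} + (1-c)\operatorname{mean}_{wL}(S_{Cr})
```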
In the last two formulas, $c$ is an update parameter set to 0.5, and $g$ is the current iteration number.
The LSHADE algorithm uses the Linear Population Size Reduction (LPSR) approach to adjust the population size
$NP$. To be more specific, it is recalculated at the end of each iteration, and the worst individuals are removed from the population. The new number of individuals depends on the available computational resources:
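The LPSR rule from [7] can be reconstructed, in the notation defined below, as:

```latex
NP_{g+1} = \operatorname{round}\left( NP_{max} - \left(NP_{max} - NP_{min}\right)\cdot \frac{NFE}{NFE_{max}} \right)
```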
where $N{P}_{min}=4$ and $N{P}_{max}$ are the minimal and initial population sizes, respectively, $NFE$ is the current number of function evaluations, and $NF{E}_{max}$ is the maximal number of function evaluations.
Finally, LSHADE uses an external archive $A$ of inferior solutions. The archive $A$ contains parent solutions rejected during the selection operation and is filled until its size reaches the predefined value $NA$. Once the archive is full, new solutions replace randomly selected ones in $A$. The current-to-pbest mutation (5) is changed to use the individuals from the archive, so that the $l$ index is taken either from the population or from the archive $A$ with a probability of 0.5. Additionally, the archive size $NA$ decreases together with the population size in the same manner.
3.3.2. NL-SHADE-RSP Algorithm
The NL-SHADE-RSP algorithm is a modification of the LSHADE approach, first introduced in [6]. It is a further development of the LSHADE-RSP algorithm, presented in [78], which uses selective pressure for the mutation procedure. The effect of selective pressure was studied in detail in [79]. The mutation strategy proposed for the LSHADE-RSP approach is called current-to-pbest/r, and it differs from the original current-to-pbest strategy only with regard to the choice of the indexes $k$ and $l$. To be more specific, for each individual $i$ the probability $p{r}_{i}$ is calculated in the following way:
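With the ranks $R_i$ defined as described next, a reconstruction of the rank-based selection probability, following the scheme of [78], is:

```latex
pr_i = \frac{R_i}{\sum_{j=1}^{NP} R_j}
```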
where ${R}_{i}$ is the rank of the $i$th individual. The ranks ${R}_{i}$, $i=\overline{1,NP}$, are set as the indexes of individuals in an array sorted by fitness values, with the largest ranks assigned to the best individuals. Finally, the indexes $k$ and $l$ are chosen with the corresponding probabilities. In NL-SHADE-RSP, the same current-to-pbest/r strategy is used; however, the rank-based selective pressure is applied only to the index $l$, and only if it is chosen from the population, while the index $k$ is chosen uniformly.
In contrast to the LSHADE and LSHADE-RSP approaches, in NL-SHADE-RSP the population size is reduced in a nonlinear manner:
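A reconstruction of the nonlinear population size reduction rule from [6], written in terms of the evaluation ratio defined just below, is:

```latex
NP_{g+1} = \operatorname{round}\left( \left(NP_{min} - NP_{max}\right)\cdot NFE_r^{\,1-NFE_r} + NP_{max} \right)
```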
where $NF{E}_{r}=\frac{NFE}{NF{E}_{max}}$ is the ratio of the current number of fitness evaluations to the maximal one.
The external archive $A$ is used for the index $l$ in the current-to-pbest/r mutation procedure with probability ${p}_{A}$, and in the NL-SHADE-RSP algorithm this probability is automatically adjusted. The latter is achieved by implementing the adaptation strategy originally proposed for the IMODE algorithm [80]. Firstly, the probability ${p}_{A}$ should lie within a fixed range, and initially it is set to 0.5, unless the archive is empty. Then, the probability ${p}_{A}$ is calculated in the following way:
where ${n}_{A}$ is the number of archive usages, which is incremented every time an offspring is generated using the archive $A$, and $\Delta {f}_{A}$ and $\Delta {f}_{wA}$ are the fitness improvements achieved with and without the archive, respectively. It should be noted that the new value of ${p}_{A}$ is checked to be within the allowed range by applying the following rule:
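A clipping rule consistent with this description can be written as follows; the specific limits 0.1 and 0.9 are an assumption based on [6], not stated explicitly in the text:

```latex
p_A \leftarrow \min\left(\max\left(p_A,\; 0.1\right),\; 0.9\right)
```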
In NL-SHADE-RSP, both binomial and exponential crossover operators are used, each with a probability of 0.5. The description of the exponential crossover is given in [6]. For the exponential crossover, Success-History Adaptation (SHA) is applied, but at the beginning of the generation the crossover rates $C{r}_{i}$ generated for each individual $i$ are sorted according to fitness values, so that smaller crossover rate values are assigned to better individuals. For the binomial crossover, the value $C{r}_{b}$ is calculated in the following way:
Finally, the $pbest$ value for the current-to-pbest/r mutation in NL-SHADE-RSP is controlled in the same way as in the jSO algorithm [81]:
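A linear schedule consistent with the jSO-style control and the parameter values stated next (this explicit form is a reconstruction) is:

```latex
pbest = pbest_{max} - \left(pbest_{max} - pbest_{min}\right)\cdot NFE_r
```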
Note that the following user-defined parameter values were used: $pbes{t}_{max}=0.4$ and $pbes{t}_{min}=0.2$. Furthermore, the $pbest$ parameter linearly decreases, as is performed in jSO. Detailed pseudocode of the NL-SHADE-RSP algorithm is presented in [6].
3.4. The HFuzzNDA Approach
The HFuzzNDA algorithm, in contrast to the PFuzzND approach, has an additional parameter, namely the maximum possible number of clusters per class, denoted as ${c}_{max}$. This parameter changes over time and is initially set to a minimal value (to be more specific, it is equal to 2). To calculate the new ${c}_{max}$ value, the following notations are used:
First, we check how many instances are classified (this number is denoted as ${N}_{in}$); after the offline stage this number is equal to ${p}_{offline}$, and grows incrementally during the online stage (one new instance per iteration).
Two new parameters are introduced, namely ${p}_{in}$ and ${p}_{cmax}$; ${p}_{in}$ controls the speed at which the ${c}_{max}$ value grows, and ${p}_{cmax}$ shows how much ${c}_{max}$ grows over each iteration.
A given number of considered instances, denoted as ${N}_{old}$, is saved after some number of iterations, ${N}_{old}={p}_{offline}$ right after the offline stage.
Thus, ${c}_{max}$ is updated as shown in Algorithm 1 (note that in Algorithm 1 the temporary variable ${c}_{max}^{unr}$ only grows).
Algorithm 1 ${c}_{max}$ update procedure
1: After the offline stage, set ${p}_{in}=0.1$, ${p}_{cmax}=0.01$, ${c}_{max}^{unr}={c}_{max}$
2: while online stage do
3:  if $({N}_{in}-{N}_{old}) > {p}_{in}\cdot {N}_{old}$ then
4:   ${N}_{old}={N}_{in}$
5:   ${c}_{max}^{unr}=(1+{p}_{cmax})\cdot {c}_{max}^{unr}$
6:   ${c}_{max}=round({c}_{max}^{unr})$
7:  end if
8: end while
Evolutionary and biology-inspired optimization algorithms are popular among researchers and are frequently used for anomaly and/or novelty detection. For example, in [82] a hybrid algorithm based on the Fruit Fly Algorithm (FFA) and the Ant Lion Optimizer (ALO), and in [83] the Farmland Fertility Algorithm, were used for feature selection to reduce the dimensionality of the data. Data preprocessed in such a way were later classified as normal or anomalous by using well-known machine learning approaches (support vector machines, decision trees, $k$-nearest neighbours). In this study, we did not apply an evolutionary algorithm as a preprocessing technique; instead, it was used for clustering, namely, the cluster centres were determined by the NL-SHADE-RSP algorithm. To be more specific, NL-SHADE-RSP determined the centres of the clusters belonging to each class, each individual represented the whole set of cluster centres, and no feature selection was applied. Thus, during the offline stage, the NL-SHADE-RSP approach is used to divide each known class into two clusters. For this purpose, function (2) is optimized, and each individual represents the centres of the clusters. After that, the PFCM algorithm is used to obtain the membership $U$ and typicality $T$ matrices for these classes. The "age" of all clusters designed during the offline stage is recorded (they are considered the oldest clusters).
The online stage starts with Algorithm 1 and then proceeds to check the conditions described for the PFuzzND approach for each new instance iteratively. However, some changes were made to the online procedure. To be more specific, a new merging technique is applied to clusters to check if there is a need to decrease their number, and the NL-SHADE-RSP approach is used to refine the clusters belonging to each class. The pseudocode for the online stage is presented in Algorithm 2. Note that in Algorithm 2, $Ncu{r}_{sm}$ is the current number of instances stored in the short memory. Furthermore, the NL-SHADE-RSP approach can change the final number of clusters belonging to the considered class. To be more specific, it is executed for all possible values of ${k}_{class}$ for a given class, from 2 to the current value of ${k}_{class}$, and the best variant is chosen at the end of the optimization process.
Algorithm 2 One iteration of the online phase
1: Set ${\theta}_{init}$, ${\theta}_{adapt}$, ${\theta}_{class}$, ${c}_{max}=2$, ${N}_{in}={p}_{offline}$, ${N}_{old}={p}_{offline}$, $t=0$, $Ncu{r}_{sm}=0$
2: for each new $j$th instance do
3:  $t=t+1$
4:  ${N}_{in}={N}_{in}+1$
5:  Calculate the highest typicality value $max{t}_{cj}$ and determine the corresponding existing class ${y}_{c}$
6:  For instances previously labelled as ${y}_{c}$, calculate the mean of the highest typicality values ${m}_{ct}$
7:  if ${N}_{in}={p}_{offline}+1$ then
8:   ${m}_{ct}={\theta}_{init}$
9:  end if
10:  if $max{t}_{cj}>({m}_{ct}-{\theta}_{adapt})$ then
11:   The $j$th instance belongs to class ${y}_{c}$
12:   Update the clusters that belong to class ${y}_{c}$ by using PFCM
13:   if the number of clusters is greater than 2 then
14:    Execute the merging procedure
15:   end if
16:  else if $max{t}_{cj}>({m}_{ct}-{\theta}_{class})$ then
17:   The $j$th instance belongs to class ${y}_{c}$
18:   if the number of clusters that belong to class ${y}_{c}$ is less than ${c}_{max}$ then
19:    Increment the number of clusters that belong to class ${y}_{c}$
20:    Create a new cluster with the $j$th instance as its centre
21:    Execute NL-SHADE-RSP to refine the clusters
22:    Calculate the membership and typicality matrices for all clusters
23:   else
24:    Determine the oldest cluster that belongs to class ${y}_{c}$ and remove it
25:    Create a new cluster with the $j$th instance as its centre
26:    Execute NL-SHADE-RSP to refine the clusters
27:    Calculate the membership and typicality matrices for all clusters
28:   end if
29:   if the number of clusters is greater than 2 then
30:    Execute the merging procedure
31:   end if
32:  else
33:   if $Ncu{r}_{sm}+1<{N}_{sm}$ then
34:    $Ncu{r}_{sm}=Ncu{r}_{sm}+1$
35:    Store the $j$th instance in the short memory
36:   else
37:    Execute the novelty detection procedure
38:    Store the $j$th instance in the updated short memory
39:   end if
40:  end if
41: end for
The original PFuzzND algorithm allows an increase in the number of clusters belonging to each class, but not the other way around. Experiments showed that, in some cases, the instances belonging to the same class might be divided into an excessive number of clusters, which can lead to poor classification results. Thus, in this study, it is proposed to merge clusters belonging to the same class if they are similar to each other, which allows for a decrease in their number.
In [9], a fuzzy similarity metric was introduced. It can be described in the following way: firstly, the dispersions of the two considered clusters are calculated; then, the dissimilarity between these clusters is determined; finally, the sum of the dispersions is divided by the dissimilarity value [9]. Each cluster's dispersion is the weighted sum of the distances between the instances belonging to this cluster and its centre, averaged by the number of considered instances. Note that the membership values are used as the weights. The dissimilarity between two clusters is the Euclidean distance between their centres.
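The dispersion and dissimilarity computation described above can be sketched as follows. This is a minimal illustration of the idea on synthetic data with unit weights; the function names are ours, not from [9]:

```python
import numpy as np

def dispersion(points, weights, centre):
    # Weighted sum of distances between the cluster's instances and its centre,
    # averaged by the number of considered instances.
    d = np.linalg.norm(points - centre, axis=1)
    return float(np.sum(weights * d) / len(points))

def fuzzy_similarity(pts1, w1, c1, pts2, w2, c2):
    # Similarity in the spirit of [9]: the sum of the two cluster dispersions
    # divided by the dissimilarity (Euclidean distance between the centres).
    dissimilarity = float(np.linalg.norm(c1 - c2))
    return (dispersion(pts1, w1, c1) + dispersion(pts2, w2, c2)) / dissimilarity

rng = np.random.default_rng(0)
a = rng.normal(0.0, 0.5, size=(30, 2))       # cluster around the origin
b_near = rng.normal(1.0, 0.5, size=(30, 2))  # nearby cluster -> higher similarity
b_far = rng.normal(5.0, 0.5, size=(30, 2))   # distant cluster -> lower similarity
w = np.ones(30)                              # unit weights for simplicity

s_near = fuzzy_similarity(a, w, a.mean(axis=0), b_near, w, b_near.mean(axis=0))
s_far = fuzzy_similarity(a, w, a.mean(axis=0), b_far, w, b_far.mean(axis=0))
print(s_near > s_far)
```

In the proposed approach, the typicality values would take the place of the unit weights used here.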
In our study, the typicality values are used as the weight coefficients for calculating the similarity metric. We find the two most similar clusters belonging to a given class and determine whether they should be merged by using the generalized soft C-index metric [10] (here, we denote it as ${f}_{merge}$). To use the latter, we have to conduct calculations for all instances that belong to the considered class. However, this can significantly slow down the algorithm, so only a part, ${N}_{c,B}$ (here, $c$ is the class number and $B$ is the batch size), of the instances participates in calculating the ${f}_{merge}$ values. The pseudocode of the proposed merging procedure is presented in Algorithm 3.
Algorithm 3 The merging procedure
1: Denote the current set of clusters belonging to the considered class as $CurSet$
2: for each $p$th cluster belonging to the considered class, $p=\overline{1,{k}_{class}}$ do
3:  for each $q$th cluster belonging to the considered class, $q=\overline{1,{k}_{class}}$ do
4:   if $p\ne q$ then
5:    Calculate the fuzzy similarity $F{S}_{pq}$ between the $p$th and $q$th clusters
6:   end if
7:  end for
8: end for
9: Find $maxFS=max\{F{S}_{pq},\; p=\overline{1,{k}_{class}},\; q=\overline{1,{k}_{class}}\}$
10: Determine the centres of the most similar clusters, $cnt{r}_{1}$ and $cnt{r}_{2}$, corresponding to the $maxFS$ value
11: if $maxFS>0$ then
12:  $kne{w}_{class}={k}_{class}-1$
13:  Create a new (empty) set $TempSet$, which can contain $kne{w}_{class}$ clusters
14:  Create a new cluster with centre $cntr$ by merging the two most similar clusters:
15:  $cntr=0.5\times (cnt{r}_{1}+cnt{r}_{2})$
16:  Fill the set $TempSet$ with the newly created cluster and all clusters from $CurSet$ except the two most similar
17:  Execute the PFCM algorithm with the $kne{w}_{class}$ clusters from $TempSet$ to update them
18:  Choose ${N}_{c,B}$ instances belonging to the considered class
19:  For the chosen instances and both sets $TempSet$ and $CurSet$, calculate the ${f}_{merge}$ values
20:  Choose the set of clusters with the better ${f}_{merge}$ value
21: end if
Finally, during the novelty detection phase, the instances stored in the short memory are divided into clusters using the NL-SHADE-RSP algorithm. The maximum possible number of short-memory clusters is set to the current value of ${c}_{max}$. After that, the membership and typicality matrices are determined. Then, the standard steps of the novelty detection procedure introduced for the PFuzzND approach are conducted.
The general scheme of the proposed approach is demonstrated in
Figure 1.