An Upgraded Version of the Binary Search Space-Structured VQ Search Algorithm for AMR-WB Codec

Adaptive multi-rate wideband (AMR-WB) speech codecs are widely used to deliver high speech quality in modern mobile communication systems, e.g., on handheld mobile devices. A major drawback, however, is the heavy computational load required by the vector quantization (VQ) of immittance spectral frequency (ISF) coefficients in AMR-WB coding. In view of this, a two-stage search algorithm is presented in this paper as an efficient way to reduce the computational complexity of ISF quantization in AMR-WB coding. At stage 1, an input vector is efficiently assigned to a search subspace using the binary search space-structured VQ (BSS-VQ) algorithm, and at stage 2 a codebook search is performed over that subspace using the iterative triangular inequality elimination (ITIE) approach. Owing to the codeword rejection mechanisms built into both stages, the computational load is reduced remarkably. Compared with the original version of the BSS-VQ algorithm, the upgraded version provides a computational load reduction of up to 51%. Furthermore, this work is expected to help satisfy the energy saving requirement when implemented in an AMR-WB codec on mobile devices.


Introduction
The development of the adaptive multi-rate wideband (AMR-WB) speech codec [1][2][3][4] aims to considerably improve the speech quality on handheld mobile devices. It is an algebraic code-excited linear-prediction (ACELP)-based coding technique [4,5], and is equipped with nine coding modes with bitrates between 6.6 and 23.85 kbps. Despite its excellent speech quality, the price paid is a high computational complexity during coding. In other words, an AMR-WB codec improves the speech quality of a smartphone at the cost of high battery power consumption.
An AMR-WB encoder spends a considerable amount of time quantizing immittance spectral frequency (ISF) coefficients in the various coding modes [6][7][8][9]. The ISF coefficients in AMR-WB are quantized through the combined use of split vector quantization (SVQ) and a multistage VQ technique, designated as split-multistage VQ (S-MSVQ) [1]. Traditionally, a full search is employed to obtain the codeword best matched with an arbitrary input vector, and an enormous computational load is required as a consequence. Accordingly, efforts have been made to reduce the search complexity of the encoding process in [10][11][12][13][14][15][16][17][18], among which the binary search space-structured VQ (BSS-VQ) search algorithm in [10], one of our previous studies, was presented as a simple but efficient way to quantize the ISF coefficients in AMR-WB. A remarkable computational load reduction is achieved therein with a well-maintained speech quality. The BSS-VQ search algorithm was also experimentally validated to outperform the equal-average equal-variance equal-norm nearest neighbor search (EEENNS) algorithm [11][12][13][14] and triangular inequality elimination (TIE)-based algorithms [15][16][17], e.g., the DI-TIE algorithm [15], short for a TIE with a dynamic and an intersection mechanism, and the multiple TIE (MTIE) approach [17].
A deeper investigation reveals that the search performance of the BSS-VQ algorithm depends on the codeword distributions. As a way to boost the search performance, an upgraded version of the BSS-VQ algorithm is developed herein through a sequential use of the BSS-VQ algorithm and the iterative TIE (ITIE) approach [18]. To begin with, an input vector is efficiently assigned to a search subspace using the BSS-VQ algorithm, and subsequently, a codebook search is performed over the subspace using the ITIE approach. Owing to the codeword rejection mechanisms built into BSS-VQ and ITIE, the search load is reduced significantly as intended. The presented algorithm is ultimately validated to clearly outperform its counterparts, and is fit to meet the energy saving requirement when implemented in an AMR-WB codec on mobile devices. In addition, the Enhanced Voice Services (EVS) codec [19] is a new audio codec standard optimized for operation with voice and music/mixed content signals. The EVS codec also provides interoperation with AMR-WB over all nine coding modes. That is to say, the contribution of this study is also applicable to the speech coding in the EVS codec.
The rest of this paper is organized as follows. The ISF coefficient quantization in AMR-WB is described in Section 2. Section 3 presents the proposed search algorithm for ISF quantization. Experimental results are demonstrated and discussed in Section 4, and finally Section 5 concludes this work.

ISF Quantization in AMR-WB
The linear prediction analysis in AMR-WB proceeds as follows. To begin with, the linear predictive coefficients (LPCs) are evaluated for each speech frame and then converted into ISF coefficients. Subsequently, the ISF coefficients are quantized through the process described below.

Linear Prediction Analysis
The 16th-order LPCs, $a_i$, of the linear prediction filter are evaluated using the Levinson-Durbin algorithm. The filter is defined as

$$\frac{1}{A(z)} = \frac{1}{1 + \sum_{i=1}^{16} a_i z^{-i}}, \tag{1}$$

and the LPC parameters are then converted into the immittance spectral pair (ISP) coefficients for the purposes of parametric quantization and interpolation. The ISP coefficients are defined as the roots of the following two polynomials:

$$F'_1(z) = A(z) + z^{-16} A(z^{-1}), \tag{2}$$

$$F'_2(z) = A(z) - z^{-16} A(z^{-1}). \tag{3}$$

$F'_1(z)$ and $F'_2(z)$ are symmetric and antisymmetric polynomials, respectively. It can be proven that all the roots of $F'_1(z)$ and $F'_2(z)$ lie on the unit circle in the z-domain and alternate successively. Also, $F'_2(z)$ has two roots at $z = 1$ ($\omega = 0$) and $z = -1$ ($\omega = \pi$), which can be eliminated by introducing the following polynomials, with 8 and 7 conjugate roots on the unit circle respectively, represented as

$$F_1(z) = F'_1(z) = (1 + a_{16}) \prod_{i=0,2,\ldots,14} \left(1 - 2 q_i z^{-1} + z^{-2}\right), \tag{4}$$

$$F_2(z) = \frac{F'_2(z)}{1 - z^{-2}} = (1 - a_{16}) \prod_{i=1,3,\ldots,13} \left(1 - 2 q_i z^{-1} + z^{-2}\right), \tag{5}$$

where the coefficients $q_i$ are referred to as the ISPs in the cosine domain, and $a_{16}$ symbolizes the last predictor coefficient. Both Equations (4) and (5) can be solved using Chebyshev polynomials. Subsequently, the 16th-order ISF coefficients $\omega_i$, derived from the ISP coefficients, can be acquired via the transformation $\omega_i = \arccos(q_i)$.
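The symmetry properties stated above can be checked numerically. The following is a minimal sketch, assuming the standard AMR-WB sum and difference polynomials $A(z) \pm z^{-16} A(z^{-1})$; the function names and the LPC values are hypothetical and serve only as an illustration. It does not perform the Chebyshev root search itself.

```python
def isp_polynomials(lpc):
    """Return coefficient lists of the sum and difference polynomials
    A(z) + z^-p A(z^-1) and A(z) - z^-p A(z^-1), in powers of z^-1.

    `lpc` holds a_1..a_p (p = 16 in AMR-WB); a_0 = 1 is prepended here.
    """
    coeffs = [1.0] + list(lpc)      # A(z) = 1 + sum_i a_i z^-i
    reversed_coeffs = coeffs[::-1]  # z^-p A(z^-1) reverses the coefficient order
    f1 = [c + r for c, r in zip(coeffs, reversed_coeffs)]  # symmetric
    f2 = [c - r for c, r in zip(coeffs, reversed_coeffs)]  # antisymmetric
    return f1, f2


def evaluate(poly, z):
    """Evaluate sum_i poly[i] * z^-i at the point z."""
    return sum(c * z ** (-i) for i, c in enumerate(poly))


# Hypothetical 16th-order LPC values, chosen only for illustration.
lpc = [(-0.9) ** i / i for i in range(1, 17)]
f1, f2 = isp_polynomials(lpc)

symmetric = all(abs(f1[i] - f1[16 - i]) < 1e-12 for i in range(17))
antisymmetric = all(abs(f2[i] + f2[16 - i]) < 1e-12 for i in range(17))
print(symmetric, antisymmetric)
# The difference polynomial vanishes at z = 1 and z = -1, which is why
# these two roots are divided out before the ISPs are located.
print(abs(evaluate(f2, 1.0)) < 1e-12, abs(evaluate(f2, -1.0)) < 1e-12)
```

Reversing the coefficient list is all that is needed to form $z^{-16} A(z^{-1})$, so both polynomials cost one pass over the LPCs.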

Quantization of ISF Coefficients
Ahead of the quantization process, mean removal and first-order moving average (MA) prediction are applied to the ISF coefficients to obtain a residual ISF vector, expressed as

$$r(n) = z(n) - p(n), \tag{6}$$

where $z(n)$ and $p(n)$ respectively represent the mean-removed ISF vector and the predicted ISF vector at frame $n$ using a first-order MA prediction. The latter is defined as

$$p(n) = \frac{1}{3}\,\hat{r}(n-1), \tag{7}$$

where $\hat{r}(n-1)$ denotes the quantized residual vector at the previous frame. S-MSVQ is then performed on $r(n)$. Table 1 gives the structure of S-MSVQ for AMR-WB in the 8.85-23.85 kbps coding modes. In stage 1, $r(n)$ is split into two subvectors, that is, a nine-dimensional subvector $r_1(n)$ and a seven-dimensional subvector $r_2(n)$, associated with codebooks CB1 and CB2 respectively, for VQ encoding. At the beginning of stage 2, the quantization error vectors, symbolized as $r^{(2)}_i = r_i - \hat{r}_i$, $i = 1, 2$, are split into five subvectors in total. For example, $r^{(2)}_{1,1\text{-}3}$ in Table 1 denotes the subvector formed by the first to the third components of $r^{(2)}_1$, on which VQ encoding is then performed over codebook CB11. Similarly, $r^{(2)}_{2,4\text{-}7}$ represents the subvector formed by the 4th to the 7th components of $r^{(2)}_2$, on which VQ encoding is then performed over codebook CB22. Lastly, the Euclidean distance is used to measure the squared-error ISF distortion in all the quantization processes.
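The residual computation and the stage-1 split can be sketched as follows. This is an illustrative sketch only: it assumes the first-order MA predictor of the AMR-WB standard, p(n) = r̂(n−1)/3, and all function names and numeric values are hypothetical.

```python
MA_FACTOR = 1.0 / 3.0  # first-order MA prediction factor (assumed from the AMR-WB standard)


def isf_residual(isf, isf_mean, prev_quantized_residual):
    """Residual r(n) = z(n) - p(n) for one frame of 16 ISF coefficients."""
    z = [w - m for w, m in zip(isf, isf_mean)]               # mean removal
    p = [MA_FACTOR * r for r in prev_quantized_residual]     # MA prediction
    return [zi - pi for zi, pi in zip(z, p)]


def split_stage1(residual):
    """Split r(n) into the 9-dim subvector r1 (CB1) and the 7-dim subvector r2 (CB2)."""
    return residual[:9], residual[9:]


# Hypothetical values, for illustration only.
isf = [10.0] * 16
isf_mean = [1.0] * 16
prev_r_hat = [3.0] * 16

r = isf_residual(isf, isf_mean, prev_r_hat)
r1, r2 = split_stage1(r)
print(len(r1), len(r2))
```

Each of `r1` and `r2` is then quantized independently, which is what makes the two stage-1 codebook searches separable.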

Proposed Search Algorithm
In this paper, an upgraded version of the BSS-VQ search algorithm is presented as an efficient way to reduce the computational complexity when quantizing ISF coefficients in AMR-WB. This is done through a sequential use of BSS-VQ and ITIE, which are described in turn below.

The BSS-VQ Search Algorithm
As a prerequisite of a VQ codebook search in the BSS-VQ algorithm, an input vector is efficiently assigned to a subspace, over which a small number of codeword searches is conducted through the combined use of lookup tables and a fast locating technique. As a result, the computational load can be reduced significantly.
To begin with, each dimension is dichotomized into two subspaces, and an input vector is then assigned to a subspace according to its entries. For instance, given a 9-dimensional subvector $r_1(n)$ associated with codebook CB1, there are up to $2^9 = 512$ subspaces, to one of which the input vector is assigned by means of a dichotomy on each of its entries.
As defined in [10], the dichotomy position of a dimension refers to the mean of the corresponding components of all the codewords contained in a codebook, expressed as

$$dp(j) = \frac{1}{CSize} \sum_{i=1}^{CSize} c_i(j), \quad 0 \le j < Dim, \tag{8}$$

where $c_i(j)$ denotes the $j$th component of the $i$th codeword $c_i$, and $dp(j)$ the mean value of all the $j$th components. For example, with $CSize = 256$ and $Dim = 9$ in the codebook CB1, all the $dp(j)$ values are saved and presented in Table 2. That is, the values in Table 2 are obtained by using (8) to calculate the statistical mean of all the codewords in CB1. A quantity $\nu_n(j)$ is then defined for vector quantization of the $n$th input vector $x_n$, expressed as

$$\nu_n(j) = \begin{cases} 2^j, & x_n(j) \ge dp(j) \\ 0, & x_n(j) < dp(j) \end{cases}, \quad 0 \le j < Dim, \tag{9}$$

where $x_n(j)$ symbolizes the $j$th component of $x_n$. Subsequently, $x_n$ is assigned to subspace $k$ ($bss_k$), where $k$ is the sum of $\nu_n(j)$ over all of the dimensions, expressed as

$$k = \sum_{j=0}^{Dim-1} \nu_n(j). \tag{10}$$

Since $0 \le k < BSize$ and $BSize = 2^9 = 512$ in this article, there are a total of 512 subspaces. For example, given an input vector $x_n = \{16.0, 17.1, 18.2, 19.3, 20.4, 21.5, 22.6, 23.7, 24.8\}$ and the dichotomy positions in Table 2, $\nu_n(j) = \{2^0, 0, 2^2, 0, 0, 0, 0, 2^7, 2^8\}$ for $0 \le j \le 8$ and $k = 1 + 4 + 128 + 256 = 389$ are given by (9) and (10), respectively. Thus, the input vector $x_n$ is assigned to the subspace $bss_k$ with $k = 389$. In this manner, merely a small number of basic operations, i.e., comparison, shift, and addition, are required, meaning that an input vector is efficiently assigned to a subspace as requested.
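The subspace assignment amounts to building a bit pattern with one bit per dimension. The sketch below illustrates this under the assumption that bit $j$ is set when the $j$th entry is not smaller than its dichotomy position. Since the Table 2 values are not reproduced here, the `dp` list is hypothetical and was chosen only so that the example vector from the text lands in subspace 389.

```python
def dichotomy_positions(codebook):
    """Per-dimension mean of all codewords: one dichotomy position per dimension."""
    csize = len(codebook)
    dim = len(codebook[0])
    return [sum(cw[j] for cw in codebook) / csize for j in range(dim)]


def subspace_index(x, dp):
    """Subspace index k: bit j is set when x[j] >= dp[j]."""
    k = 0
    for j, (xj, dpj) in enumerate(zip(x, dp)):
        if xj >= dpj:        # one comparison per dimension
            k += 1 << j      # one shift and one addition
    return k


# Input vector from the text; dp is a hypothetical stand-in for Table 2.
x_n = [16.0, 17.1, 18.2, 19.3, 20.4, 21.5, 22.6, 23.7, 24.8]
dp = [15.0, 18.0, 17.0, 20.0, 21.0, 22.0, 23.0, 23.0, 24.0]

print(subspace_index(x_n, dp))  # 1 + 4 + 128 + 256 = 389
```

Only `Dim` comparisons are needed per input vector, independent of the codebook size.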
As stated in Reference [10], a lookup table is prebuilt for each subspace by performing a training mechanism after the dichotomy position for each dimension is determined. The lookup tables give the probability that each codeword is the best-matched codeword in each subspace, referred to as the hit probability of a codeword in a subspace for short, and symbolized as $P_{hit}(c_i \mid bss_k)$, $1 \le i \le CSize$, $0 \le k < BSize$. Moreover, for sorting purposes, a quantity $P_{hit}(m \mid bss_k)$, $1 \le m \le CSize$, is defined as the $m$th-ranked hit probability in subspace $bss_k$. For example, $P_{hit}(m \mid bss_k)|_{m=1} = \max_{c_i} \{P_{hit}(c_i \mid bss_k)\}$ denotes the highest hit probability in $bss_k$. Thus, the lookup table in each subspace gives the ranked hit probabilities in descending order together with the corresponding codewords.
In the encoding procedure of BSS-VQ, the cumulative probability $P_{cum}(M \mid bss_k)$ is first defined as the sum of the top $M$ values of $P_{hit}(m \mid bss_k)$ in $bss_k$, namely

$$P_{cum}(M \mid bss_k) = \sum_{m=1}^{M} P_{hit}(m \mid bss_k), \quad 1 \le M \le CSize. \tag{11}$$

Subsequently, given a threshold of quantization accuracy (TQA), a quantity $M_k(TQA)$ refers to the minimum value of $M$ that meets the condition $P_{cum}(M \mid bss_k) \ge TQA$ in $bss_k$, namely

$$M_k(TQA) = \min \{ M : P_{cum}(M \mid bss_k) \ge TQA,\ 1 \le M \le CSize \}, \quad 0 \le k < BSize. \tag{12}$$

Finally, the BSS-VQ encoding procedure is described below as Algorithm 1:

Algorithm 1 Encoding procedure of BSS-VQ
Step 1. Given a TQA, $M_k(TQA)$ satisfying (12) is found directly in the lookup table in $bss_k$.
Step 2. Referencing Table 2 and by means of (9) and (10), an input vector is efficiently assigned to a subspace $bss_k$.
Step 3. A full search for the best-matched codeword is performed among the top $M_k(TQA)$ sorted codewords in $bss_k$, and then the index of the found codeword is output.
Step 4. Repeat Steps 2 and 3 until all the input vectors are encoded.

In short, Table 2, the first lookup table, is prebuilt by performing (8). Subsequently, the second lookup table, listing $P_{hit}(m \mid bss_k)$ and the corresponding codewords, is built for each subspace by following the training mechanism. Accordingly, the VQ encoding can be completed using Algorithm 1.
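The training and encoding steps described above can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the helper names are hypothetical, only codewords with a non-zero hit count are stored per subspace, and the fallback to a full search for subspaces never seen in training is an assumption made here.

```python
import math
from collections import Counter, defaultdict


def subspace_index(x, dp):
    """Bit j of the subspace index is set when x[j] >= dp[j]."""
    return sum(1 << j for j, (xj, dpj) in enumerate(zip(x, dp)) if xj >= dpj)


def full_search(x, codebook, candidates):
    """Index of the candidate codeword closest to x (Euclidean distance)."""
    return min(candidates, key=lambda i: math.dist(x, codebook[i]))


def train_lookup(training_vectors, codebook, dp):
    """Per subspace: list of (codeword index, hit probability), highest first."""
    hits = defaultdict(Counter)
    for x in training_vectors:
        best = full_search(x, codebook, range(len(codebook)))
        hits[subspace_index(x, dp)][best] += 1
    table = {}
    for k, counter in hits.items():
        total = sum(counter.values())
        table[k] = [(i, n / total) for i, n in counter.most_common()]
    return table


def m_k(ranked, tqa):
    """Smallest M whose cumulative hit probability reaches the TQA."""
    cumulative = 0.0
    for m, (_, p_hit) in enumerate(ranked, start=1):
        cumulative += p_hit
        if cumulative >= tqa:
            return m
    return len(ranked)


def encode(x, codebook, dp, table, tqa):
    """Assign x to a subspace, then full-search its top M_k(TQA) codewords."""
    ranked = table.get(subspace_index(x, dp))
    if ranked is None:  # assumption: unseen subspace, fall back to a full search
        return full_search(x, codebook, range(len(codebook)))
    top = [i for i, _ in ranked[:m_k(ranked, tqa)]]
    return full_search(x, codebook, top)


# Toy data, for illustration only.
codebook = [[0.0, 0.0], [10.0, 10.0]]
dp = [5.0, 5.0]
table = train_lookup([[1.0, 1.0], [2.0, 0.0], [9.0, 9.0]], codebook, dp)
print(encode([0.5, 0.5], codebook, dp, table, tqa=0.90))
```

A larger TQA enlarges the candidate list in every subspace, trading search load for quantization accuracy.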

The ITIE Search Algorithm
As a preliminary to the ITIE algorithm, the TIE algorithm is briefly described below. First of all, a codeword captured from the preceding frame is viewed as a reference codeword $c_r$ in the current frame, and the Euclidean distance between $c_r$ and an input vector $x$, symbolized as $d(c_r, x)$, is calculated [16]. The set of all the codewords $c_i$ meeting the condition $d(c_r, c_i) < 2d(c_r, x)$ is referred to as a candidate search group (CSG), represented as

$$CSG(c_r) = \{ c_i \mid d(c_r, c_i) < 2 d(c_r, x),\ 1 \le i \le CSize \}, \tag{13}$$

where $CSize$ denotes the total number of codewords. By the triangular inequality, any codeword outside $CSG(c_r)$ is no closer to $x$ than $c_r$ itself, so $CSG(c_r)$ is the search space over which the current codebook search is conducted. Additionally, a lookup table listing, for every codeword, all the other codewords sorted by their Euclidean distances to it is constructed in advance for the TIE search algorithm.
Subsequently, the ITIE algorithm works as illustrated in Figure 1 and stated in Algorithm 2. Just as in the TIE case, a codeword captured from the preceding frame is viewed as a reference codeword $c_r$ in the current frame, and then a codebook search is performed over $CSG(c_r)$ using (13). For each codeword $c_k$, the condition $d(c_r, c_k) < 2d(c_r, x)$ is checked. If the condition is true, then $d(c_k, x)$ is computed, and another condition $d(c_k, x) < d(c_r, x)$ is checked. If this condition is also true, then $c_r$ is replaced with $c_k$, and the search scope $CSG(c_r)$ is updated. In this way, $CSG(c_r)$ is updated and shrinks each time the reference codeword is replaced in the above-stated loop.

Algorithm 2 Search procedure of ITIE
Step 1. Build a TIE lookup table.
Step 2. Given a $c_r$, compute $d(c_r, x)$, and then $CSG(c_r)$ is found directly in the TIE lookup table, that is,

$$CSG(c_r) = \{ c_k \mid d(c_r, c_k) < 2 d(c_r, x),\ 1 \le k \le N(c_r) \}, \tag{14}$$

where $c_k$ and $N(c_r)$ denote the codewords and the number thereof in $CSG(c_r)$, respectively.
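The ITIE loop described in the text can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the codebook is random toy data, and restarting the scan from the nearest neighbor of the new reference codeword after each replacement is an assumption about how the search scope is updated.

```python
import math
import random


def build_tie_table(codebook):
    """For each codeword r: (distance, index) of all other codewords, nearest first."""
    return [
        sorted((math.dist(cr, ck), k) for k, ck in enumerate(codebook) if k != r)
        for r, cr in enumerate(codebook)
    ]


def itie_search(x, codebook, tie_table, ref):
    """Return (index of the nearest codeword, number of distance evaluations)."""
    d_ref = math.dist(codebook[ref], x)
    evaluations = 1
    improved = True
    while improved:
        improved = False
        for d_rk, k in tie_table[ref]:
            if d_rk >= 2.0 * d_ref:
                break  # sorted table: every remaining codeword is outside CSG(c_r)
            d_kx = math.dist(codebook[k], x)
            evaluations += 1
            if d_kx < d_ref:
                ref, d_ref = k, d_kx  # replace c_r; CSG(c_r) shrinks
                improved = True
                break
    return ref, evaluations


# Toy data, for illustration only.
random.seed(7)
codebook = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(64)]
tie_table = build_tie_table(codebook)
x = [random.uniform(-1.0, 1.0) for _ in range(4)]
best, evaluations = itie_search(x, codebook, tie_table, ref=0)
print(best, evaluations)
```

Because the rejection test follows from the triangular inequality, the result equals that of a full search. The loop terminates because the reference distance strictly decreases with every replacement.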

Upgraded Version of the BSS-VQ Search Algorithm
The search performance of the BSS-VQ algorithm depends on the codeword distributions. As a way to ease this distribution effect, an input vector is efficiently assigned to a search subspace using BSS-VQ at stage 1 of this work, and a codebook search is conducted over the subspace using ITIE as an alternative to the full search algorithm. This is simply because the codeword of interest therein can be located equally well by ITIE or by its full search counterpart. This paper thus presents a two-stage codebook search algorithm as an effective way to remarkably reduce the number of codebook searches and the time spent evaluating Euclidean distances. This is done via a sequential use of the codeword rejection mechanisms provided by BSS-VQ and ITIE, respectively. This codebook search mechanism is presented in Algorithm 3, stated as follows:

Algorithm 3 Search mechanism of the upgraded version
Step 1. Initial setting: Given a TQA, $M_k(TQA)$ satisfying (12) is found directly in the lookup table in $bss_k$. A TIE lookup table is also built.
Step 2. Referencing Table 2 and through (9) and (10), an input vector is efficiently assigned to a subspace $bss_k$. Then a set composed of the top $M_k(TQA)$-sorted codewords in $bss_k$ is denoted by $CSG(bss_k)$ and formulated as

$$CSG(bss_k) = \{ c_m \mid 1 \le m \le M_k(TQA) \}, \tag{15}$$

where $c_m$ denotes the codeword with the $m$th-ranked hit probability in $bss_k$.

On the issue of speech quality, it was concluded in Reference [10] that a nearly lossless speech quality is provided by BSS-VQ at a TQA no less than 0.90. This conclusion applies to the presented search algorithm as well, since the upgraded version of BSS-VQ offers exactly the same speech quality as the original version.
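The two-stage idea, an ITIE search confined to the candidates selected by BSS-VQ, can be sketched as follows. The remaining steps of Algorithm 3 are not available in the text here, so this is a sketch under stated assumptions rather than the authors' procedure: the reference codeword is taken to be the top-ranked candidate of the subspace, codewords outside the candidate set are simply skipped, and all names and data are hypothetical.

```python
import math
import random


def build_tie_table(codebook):
    """For each codeword r: (distance, index) of all other codewords, nearest first."""
    return [
        sorted((math.dist(cr, ck), k) for k, ck in enumerate(codebook) if k != r)
        for r, cr in enumerate(codebook)
    ]


def upgraded_search(x, codebook, tie_table, candidates):
    """ITIE search restricted to `candidates`, the top M_k(TQA) codeword
    indices of the subspace, ranked by hit probability."""
    allowed = set(candidates)
    ref = candidates[0]  # assumption: start from the most probable codeword
    d_ref = math.dist(codebook[ref], x)
    improved = True
    while improved:
        improved = False
        for d_rk, k in tie_table[ref]:
            if d_rk >= 2.0 * d_ref:
                break            # rejected by the triangular inequality
            if k not in allowed:
                continue         # rejected by the BSS-VQ candidate list
            d_kx = math.dist(codebook[k], x)
            if d_kx < d_ref:
                ref, d_ref = k, d_kx
                improved = True
                break
    return ref


# Toy data, for illustration only.
random.seed(3)
codebook = [[random.gauss(0.0, 1.0) for _ in range(9)] for _ in range(64)]
tie_table = build_tie_table(codebook)
candidates = random.sample(range(64), 10)  # stand-in for the subspace's ranked list
x = [random.gauss(0.0, 1.0) for _ in range(9)]
print(upgraded_search(x, codebook, tie_table, candidates))
```

Under these assumptions the returned codeword is the same one a full search over the candidate list would find, which is consistent with the statement that both versions deliver the same speech quality.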

Experimental Results
In this work, a performance comparison is conducted among EEENNS, DI-TIE, ITIE, and the original and upgraded versions of BSS-VQ. Since EEENNS, DI-TIE, and ITIE provide a 100% search accuracy in comparison with the full search approach, there is no degradation in the speech quality. Furthermore, as mentioned in the previous section, a nearly lossless speech quality is provided by both the original and the upgraded versions of BSS-VQ. For this reason, performance is compared in terms of the search load, i.e., the average number of searches. For testing purposes, a speech database covering one male and one female speaker was used; it occupies more than 221 MB of memory, lasts more than 120 min, and contains 363,281 speech frames.
Table 3 compares the average number of searches among the full search, EEENNS, DI-TIE, and ITIE, while Tables 4 and 5 list the search load versus the TQA values for the original and the upgraded versions of BSS-VQ, respectively. Furthermore, with the search load required in the full search approach as a benchmark, Table 6 compares the load reduction (LR) among the various search algorithms listed in Tables 3-5. A higher value of LR indicates a greater reduction in search load. Moreover, columns 2 and 3 give the LRs in codebooks CB1 and CB2, respectively. The overall LR is based on the total search load, defined as the sum over all codebooks of the average number of searches multiplied by the vector dimension, and is employed to compare the total search load required during an entire VQ encoding procedure for an input vector, as presented in the rightmost two columns therein.
As can be found in Tables 3-5, the upgraded version of BSS-VQ outperforms its counterparts to a great extent with respect to the search load required in each codebook. In particular, Table 6 gives a clear view of the LR comparisons. As tabulated therein, the codebook search required in CB1 and CB2 at stage 1 occupies a high percentage of the search load in an entire VQ encoding procedure. For instance, the full search algorithm requires a search load of 4096 (= 256 × 9 + 256 × 7) at stage 1, accounting for about 77% of the overall search load of 5280, as tabulated in Tables 3 and 6. A further observation in Table 6 reveals that the corresponding values of LR in CB1 and CB2 are close to each other for all the search algorithms except the original version of BSS-VQ, and this fact applies to the values of overall LR as well. This finding validates the argument that the search performance of BSS-VQ is affected by codeword distributions, and that the adverse distribution effect can be eased using the presented search algorithm. Furthermore, the upgraded version of BSS-VQ provides an overall LR of up to 94.20% at TQA = 0.90, and is experimentally validated to outperform its counterparts.
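The arithmetic behind these figures can be reproduced with a few lines. The sketch below assumes that LR is defined as the percentage reduction relative to the full search load, which the text implies but does not state as a formula. The function name is hypothetical, and the implied load is derived from the quoted 94.20% rather than read from the tables.

```python
def load_reduction(load, full_load):
    """Assumed LR definition: percentage of the full search load that is saved."""
    return (1.0 - load / full_load) * 100.0


# Full search load at stage 1: 256 codewords in each of CB1 (9-dim) and CB2 (7-dim).
stage1_full = 256 * 9 + 256 * 7
overall_full = 5280  # overall full search load quoted in the text

stage1_share = stage1_full / overall_full * 100.0
print(stage1_full, round(stage1_share, 1))

# Overall search load implied by an overall LR of 94.20%.
implied_load = overall_full * (1.0 - 94.20 / 100.0)
print(round(implied_load, 2), round(load_reduction(implied_load, overall_full), 2))
```

Since stage 1 dominates the total load, reductions achieved in CB1 and CB2 carry over almost directly to the overall LR.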
With the original version of BSS-VQ as a benchmark, Table 7 gives the overall search load reduction achieved by the presented algorithm versus TQA. As tabulated therein, the values of LR range between 42.05% and 51.80%, a significant improvement over the benchmark. In addition, Table 8 presents the memory required by the lookup tables of the proposed algorithm. Table 8 is divided into three portions: storage for BSS spaces, dichotomy positions, and TIE lookup tables. The memory size of each BSS space is equal to BSize × CSize × 4 bytes (FLOAT data type), that of each set of dichotomy positions is Dim × 4 bytes (FLOAT data type), and that of each TIE table is CSize × (CSize − 1) × 5 bytes (UINT8 + FLOAT data types). Accordingly, a total of 1.45 MB of memory is required for all the lookup tables in Table 8, which is a small memory requirement. Finally, in our previous experiments, the linear prediction analysis and quantization occupied 8-13% of the total operations in an AMR-WB encoder, depending on the coding mode. In other words, this work could eliminate 7.5-12.2% of the total operations in the AMR-WB encoder.
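The memory formulas and the encoder-level saving can be checked with a short calculation. The sketch below evaluates the three size formulas for CB1 only (BSize = 512, CSize = 256, Dim = 9); it does not attempt to reproduce the 1.45 MB total, because the per-codebook entries of Table 8 are not available here. Applying the 94.20% overall LR to the 8-13% share is an assumption that appears consistent with the quoted 7.5-12.2%.

```python
def lookup_table_sizes(bsize, csize, dim):
    """Sizes in bytes of the three lookup-table types for one codebook."""
    return {
        "bss_space": bsize * csize * 4,        # one FLOAT per (subspace, codeword)
        "dichotomy": dim * 4,                  # one FLOAT per dimension
        "tie_table": csize * (csize - 1) * 5,  # UINT8 index + FLOAT distance per entry
    }


cb1_sizes = lookup_table_sizes(bsize=512, csize=256, dim=9)
print(cb1_sizes)

# Share of total encoder operations removed, assuming the 94.20% overall LR
# applies to the 8-13% spent on linear prediction analysis and quantization.
overall_lr = 0.9420
savings = [share * overall_lr for share in (8.0, 13.0)]
print([round(s, 1) for s in savings])
```

The BSS space dominates the footprint for CB1, since its size grows with the number of subspaces, i.e., exponentially with the vector dimension.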

Conclusions
A two-stage search algorithm is presented herein to improve the search performance in the ISF vector quantization of an AMR-WB speech codec. At stage 1, an input vector is efficiently assigned to a search subspace using the BSS-VQ algorithm, and at stage 2 a codebook search is performed over the subspace using the ITIE approach. Owing to the codeword rejection mechanisms built into both stages, the search load is reduced remarkably as intended. Experimental results show that the upgraded version of BSS-VQ provides an overall LR of up to 94.20% at TQA = 0.90, and it is validated to outperform its counterparts. Compared with the original version of the BSS-VQ algorithm, the upgraded version provides a computational load reduction of up to 51%, a significant improvement over the benchmark. This work would help meet the energy saving requirement when implemented in an AMR-WB codec on mobile devices. In addition, the contribution of this work is also applicable to the speech coding in the EVS codec, which is a new audio codec standard and provides interoperation with AMR-WB. In future work, the authors will aim to further reduce the computational complexity of the EVS codec.

Step 3 (Algorithm 2). Starting at k = 1, obtain $d(c_r, c_k)$ from the lookup table.

Table 2. Dichotomy position for each dimension in the codebook CB1.

Table 3. Average number of searches among various approaches in the 8.85-23.85 kbps modes.

Table 4. Search load versus threshold of quantization accuracy (TQA) values in the 8.85-23.85 kbps modes for the original version of the binary search space-structured vector quantization (BSS-VQ) algorithm.

Table 5. Search load versus TQA values in the 8.85-23.85 kbps modes for the upgraded version of BSS-VQ.

Table 6. Load reduction comparison among various algorithms.

Table 7. Overall search load comparison between the BSS-VQ and the proposed algorithms.

Table 8. Memory required by the lookup tables for the presented algorithm.