On Channel Estimation in LTE-Based Downlink Narrowband Internet of Things Systems

Narrowband Internet of Things (NB-IoT) systems were specified by 3GPP in Release 13 as a low power wide area network (LPWAN) technology that operates within a very narrow bandwidth of only 180 kHz. Because NB-IoT must work under fragile radio conditions (where the signal can be weaker than the noise), its channel conditions become highly complex. An effective, low-complexity channel estimator therefore plays a significant role in receiver operation. The linear minimum mean square error (LMMSE) scheme estimates the channel very effectively but introduces high complexity because it requires a complex matrix inversion. In this paper, we first derive the analytical signal model for long-term evolution (LTE)-based NB-IoT downlink systems and propose a reduced-complexity LMMSE channel estimator for the downlink NB-IoT systems by applying the singular value decomposition (SVD) technique together with partitioning the whole channel matrix into small submatrices. Furthermore, we apply the overlap banded technique to improve the performance of the proposed channel estimator. Because several small submatrices are used instead of one large channel matrix, the operational complexity is significantly reduced. Lastly, we propose a polyphase filter structure for implementing the interpolation procedure, instead of the conventional interpolation method, to further improve the performance and complexity of the proposed channel estimator. The performance of the proposed technique is evaluated in terms of the mean square error (MSE), bit error rate (BER), and instantaneous throughput versus signal-to-noise ratio (SNR). The system complexity is measured by the number of complex multiplications used. Simulation results indicate that, with a negligible sacrifice in performance, the proposed modified LMMSE technique together with the proposed interpolation achieves a good balance between performance and system complexity, which could help the proposed techniques to be applied successfully in low-complexity NB-IoT systems.


Introduction
With the in-depth research on the emerging 5th generation cellular network (5G), Internet of Things (IoT) technologies are gaining ground as a promising paradigm. Conceptually and practically, almost every class of smart device, such as consumer electronics, actuators, sensors, and mobile phones, can be connected and integrated with one another through unique addressing modes with IoT technologies [1,2]. The significant advantages of IoT technology in performance [3], heterogeneity [4], and big data processing [5] have made it a key technology of the revolutionary 5G network. In terms of transmission rate, IoT communication is classified into high data rate (e.g., video signal streaming) and low data rate (e.g., meter reading) services [6]. In terms of transmission range, IoT technology is divided into short-range and long-range communication [7,8]. Wi-Fi, Zigbee, and Bluetooth represent short-range IoT communications and are usually applied been investigated by S. Ali et al. in [22] for the downlink NB-IoT systems by applying additional operations. S. Ali et al. also investigated narrowband demodulation reference signal (NDMRS)-based least square (LS) and minimum mean square error (MMSE) channel estimation for the NB-IoT uplink systems in [23]. In that work, the authors also investigated peak to average power ratio (PAPR) reduction based on pulse shaping as an uplink transmit filter. However, these works do not study the performance under different channel conditions in the NB-IoT platform. Channel estimation and equalization with LMMSE and zero forcing are investigated by V. Savaux et al. in [24] for the NB-IoT uplink. The authors found the LMMSE channel estimator combined with the MMSE equalizer to be a good fit. A reduced-complexity MMSE (sequential MMSE) channel estimation in the presence of random phase noise for the downlink NB-IoT systems has been studied by F. Rusek et al. in [25].
However, the effect of coherence time variation is not discussed in either of these two works. The movable LS (MLS) technique, with only a limited investigation of different interpolation methods for the NB-IoT downlink, is studied in [26,27]. Both of these methods perform poorly because they rely on the LS estimator. Recently, a pilot-based hybrid channel estimation method based on the time domain Wiener filter and the frequency domain maximum likelihood estimator (MLE) was studied in [28] for downlink NB-IoT systems. The hybrid application of the MLE and the Wiener filter notably reduces the complexity while offering good performance.
The MMSE technique is more accurate than the LS method but has very high computational complexity, and hence it is also very power-hungry. Since NB-IoT is a low-power and low-complexity system, a low-complexity channel estimator is a prime requirement for the receiver. In this regard, the linear minimum mean square error (LMMSE) technique, which has less complexity than MMSE, can be the right choice. However, LMMSE still has higher complexity than the LS method. Dividing the channel matrix according to the coherence bandwidth was studied for the complexity reduction of the LMMSE technique by M. Noh et al. in [29] for traditional OFDM systems. Furthermore, the well-known singular value decomposition (SVD) scheme was studied to ease the complexity of the LMMSE technique for a traditional OFDM system by O. Edfors et al. in [30] and for the LTE downlink system by S. Wang et al. in [31]. SVD has been shown to be one of the most efficient methods for reducing the complexity of matrix-based algorithms. However, a complete and efficient channel estimation scheme for the downlink NB-IoT systems is still absent from the current literature.
In this work, we propose a reduced-complexity LMMSE channel estimation technique based on the SVD method, along with a new interpolation method, that can be practically applied to the LTE-based NB-IoT downlink system. This is an extension of our previous work published in [32]. We have further improved its performance and complexity with the application of a new interpolation method and investigated the operation of the proposed channel estimation method under different communication channel models (TU 1 Hz and EPA 1 Hz). To the best of the authors' knowledge, this is the first time that the LMMSE technique with complexity reduction has been studied specifically for downlink NB-IoT systems. The key contributions of this work are summarized below:

• We briefly review NB-IoT technology, covering its deployment, signals and physical channels, the structure of the downlink frame, and resource allocation according to 3GPP Release 13. In addition, we derive the downlink NB-IoT received signal model from the transmitted signal, taking the channel impairments into account. The narrowband reference signal (NRS) generation procedure for channel estimation and its mapping to the frequency-time grid are also presented.

• We propose a computationally simple LMMSE channel estimation method, named overlap banded SVD LMMSE, for the downlink NB-IoT systems. LMMSE is the optimal linear channel estimator in the MSE sense. We reduce the complexity of the LMMSE estimator by applying SVD, then banded SVD, and then the overlap banded technique, while keeping the performance close to that of the LMMSE. Computer simulation and complexity analysis show that the proposed overlap banded SVD LMMSE channel estimation technique strikes a good balance between performance and complexity, which makes it a better candidate for low-complexity and low-power NB-IoT systems.

• Lastly, we propose a new interpolation method that applies polyphase decomposition to the finite impulse response (FIR) filter of the linear time dimensional interpolation method, which is well known as the conventional interpolation approach [20–25,28], to improve the performance of the channel estimation and reduce the complexity of the FIR filter design. In this case, the long FIR filter is split into several short sections to increase the efficiency of the interpolator filter, and hence the inner complexity of the interpolator is reduced once more. Link level simulations against state-of-the-art methods show that the proposed overlap banded SVD channel estimation scheme with the proposed polyphase decomposed FIR filter interpolation method outperforms the LS, banded SVD LMMSE, and overlap banded SVD LMMSE channel estimation methods with the conventional (linear time dimensional) interpolation scheme in every aspect.
The rest of this paper is organized as follows: In Section 2, we present a brief overview of NB-IoT downlink technology. We present the NRS generation and mapping, along with the NB-IoT downlink signal model, in Section 3. The theoretical analysis of the proposed channel estimation and the proposed interpolation scheme is presented in Sections 4 and 5. In Section 6, the computational complexity is analyzed. Finally, we provide the performance analysis and conclusion in Sections 7 and 8, respectively.

Overview of NB-IoT Downlink Technology
3GPP has focused on developing the NB-IoT specification since the beginning of 2014. Initially, NB-IoT was concerned with low-power and low-end devices, known as Cat NB1, with the option of sending a small amount of data in parallel [33]. It can be deployed in three distinct working modes: in-band, guard-band, and stand-alone. All three modes are illustrated in Figure 1. At least one physical resource block (PRB) of 180 kHz from the LTE network is needed to deploy the in-band mode, and the total transmission power is shared between the LTE and NB-IoT systems at the base station (eNB). An unused guard-band is enough to deploy the guard-band mode, which is permitted only for LTE system bandwidths of 5 MHz or higher. The in-band and guard-band modes reuse the spectrum and other resources of the LTE base station to work within the narrow bandwidth of only 180 kHz [34]. The coexistence of NB-IoT and LTE has been verified through in-depth study and simulation [35]. The stand-alone mode can be operated inside the GSM bandwidth by refarming one or more of its 200 kHz carriers. As stated above, because it exploits existing LTE and GSM networks, NB-IoT does not incur extra deployment cost and time to become operational. As NB-IoT is designed on the basis of the existing LTE network, its signals and channels are also inherited from LTE with the specified simplifications and required modifications. All the signals and channels of NB-IoT for both downlink and uplink are shown in Table 1.
Only the frequency division duplex (FDD) mode of communication is supported by NB-IoT, in a half-duplex manner. With only QPSK modulation supported in the downlink, NB-IoT inherits the baseband numerology of the LTE downlink with certain restrictions and modifications. Orthogonal frequency division multiplexing (OFDM) is used as in LTE with 15 kHz subcarrier spacing. The NB-IoT basic time unit is defined as $T_s = 1/(15{,}000 \times 2048)$ seconds. A pair of slots constitutes one subframe, and ten subframes constitute one frame, where the durations of the slot, subframe, and frame are $T_{slot} = 15{,}360 \times T_s = 0.5$ ms, $T_{subframe} = 0.5 \times 2 = 1$ ms, and $T_f = 307{,}200 \times T_s = 10$ ms, respectively. In a radio frame, the slot number is denoted by $n_s$, where $n_s \in \{0, \ldots, 19\}$. The frequency-time grid structure of the NB-IoT downlink within a frame for a subcarrier spacing of 15 kHz is shown in Figure 2. Every component of the resource grid is a unique resource element, which is the basic unit of the grid. It is represented by the index pair (k,l) in each slot, where $k = 0, \ldots, N_{RB}^{DL} N_{SC}^{RB} - 1$ and $l = 0, \ldots, N_{symb}^{DL} - 1$ are the indices in the frequency and time domains, respectively. Transport block sizes (TBSs) are represented as a function of the subframe number and the level of the modulation and coding scheme (MCS), as shown in Table 2. A maximum TBS of 680 bits is defined for the NB-IoT downlink. Repetition of the associated control signal and user data is used to achieve wide-area coverage. Up to 128 and 2048 repetitions are allowed in uplink and downlink NB-IoT transmission, respectively, to enable successful signal decoding at the receiver even when the noise is more powerful than the signal.
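The timing relations quoted above can be checked with a few lines of exact arithmetic. The following is only an illustrative sketch; the constant names and the figure of 12 subcarriers per PRB are assumptions introduced here, not notation from the paper.

```python
from fractions import Fraction

# Basic time unit T_s = 1 / (15000 * 2048) seconds, kept exact with Fraction.
T_S = Fraction(1, 15_000 * 2048)

T_SLOT = 15_360 * T_S       # duration of one slot
T_SUBFRAME = 2 * T_SLOT     # a pair of slots forms one subframe
T_FRAME = 307_200 * T_S     # duration of one radio frame

# Assumption for illustration: one PRB spans 12 subcarriers of 15 kHz each.
SUBCARRIERS_PER_PRB = 12
SUBCARRIER_SPACING_HZ = 15_000
BANDWIDTH_HZ = SUBCARRIERS_PER_PRB * SUBCARRIER_SPACING_HZ

print("slot     :", float(T_SLOT) * 1e3, "ms")
print("subframe :", float(T_SUBFRAME) * 1e3, "ms")
print("frame    :", float(T_FRAME) * 1e3, "ms")
print("bandwidth:", BANDWIDTH_HZ / 1e3, "kHz")
```

Using `Fraction` avoids floating-point rounding, so the slot, subframe, and frame durations can be compared for exact equality.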

NB-IoT Downlink System Model
A complete NB-IoT downlink system block diagram with the NRS sequence, the related channel estimation and equalization, and the wireless channel with its associated noise is shown in Figure 3, according to the downlink signal processing chain described in [36–38]. Here, the narrowband physical downlink shared channel (NPDSCH) is used as the main data-bearing channel by the downlink transmitter (i.e., the LTE eNB). The input binary information reaches the coding unit as a single transport block for a number of resource blocks (RBs) in every downlink cell. The scheduling of RBs is arranged according to [36]. According to 3GPP [37], the NPDSCH data processing sequence contains transport block cyclic redundancy check (CRC) attachment (e.g., 24 bits with generator polynomials $g_{CRC24A}(D)$ and $g_{CRC24B}(D)$, and 16 bits and 8 bits with generator polynomials $g_{CRC16}(D)$ and $g_{CRC8}(D)$, respectively) followed by rate-1/3 tail-biting convolutional coding. Finally, the NPDSCH receives its input after the rate matching operation.

NRS Generation and Mapping
The narrowband reference signal (NRS) is used for channel estimation and signal recovery. The NRS is inserted in a subframe that carries one shared or control channel. At the receiver side, the transmitted NRSs are regenerated from the information provided by the synchronization signal. The NRS sequence $r_{l,n_s}(m)$ is generated according to the sequence below [38]:

$$r_{l,n_s}(m) = \frac{1}{\sqrt{2}}\bigl(1 - 2c(2m)\bigr) + j\,\frac{1}{\sqrt{2}}\bigl(1 - 2c(2m+1)\bigr), \quad m = 0, 1, \ldots, 2N_{RB}^{max,DL} - 1,$$

where $n_s$ denotes the slot number within the radio frame, $l$ is the OFDM symbol number within the slot, and $N_{RB}^{max,DL}$ is the largest downlink bandwidth configuration, expressed as a multiple of the number of subcarriers per resource block, which is set to 110 for legacy LTE. $c(i)$ is the $i$-th element of the pseudo-random sequence defined by a length-31 Gold sequence, which is initialized at the start of each OFDM symbol with

$$c_{init} = 2^{10} \cdot \bigl(7 \cdot (n_s' + 1) + l + 1\bigr) \cdot \bigl(2 \cdot N_{ID}^{Ncell} + 1\bigr) + 2 \cdot N_{ID}^{Ncell} + N_{CP},$$

where $N_{ID}^{Ncell}$ is the narrowband physical cell identity, $N_{CP} = 1$ because only the normal CP is used in NB-IoT, and

$$n_s' = \begin{cases} 10\lfloor n_s/10 \rfloor + n_s \bmod 2 & \text{for frame structure type 3} \\ n_s & \text{otherwise.} \end{cases}$$

Different resource elements are occupied by the NRS in a subframe. The elements of a subframe are indexed by $k$ in the frequency domain, which denotes the subcarrier number, and by $l$ in the time domain, which denotes the symbol number [36]. The resource mapping formula for the NRS is

$$a_{k,l}^{(p)} = r_{l,n_s}(m'),$$

where $a_{k,l}^{(p)}$ is the complex-valued modulation symbol, $p$ is the antenna port, and $n_s$ represents the slot number. $k$ and $l$ are defined by the following formulae:

$$k = 6m + (v + v_{shift}) \bmod 6, \qquad l = N_{symb}^{DL} - 2,\; N_{symb}^{DL} - 1, \qquad m = 0, 1, \qquad m' = m + N_{RB}^{max,DL} - 1,$$

where $v_{shift}$ and $v$ represent different frequency domain positions for different reference signals.
Here, the two different values of $l$ and of $m$ imply that four resource elements are assigned to the NRS in each slot, and eight resource elements in each subframe. The resource element mapping for a single antenna port is shown in Figure 4. The NRS is transmitted together with the data symbols at the transmitter and is used at the receiver for estimating the channel in the downlink NB-IoT system. NRS symbols are inserted into the specifically assigned subcarriers in every NB-IoT downlink slot. On each part of the resource grid that contains the NRS, an inverse discrete Fourier transform (IDFT) is executed to convert it into a time-domain reference sequence, followed by the addition of the CP.
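To make the generation and mapping steps concrete, the following sketch generates the two NRS symbols of one OFDM symbol and their subcarrier positions. It assumes the length-31 Gold sequence of 3GPP TS 36.211 (offset $N_C = 1600$), a $c_{init}$ of the form $2^{10}(7(n_s+1)+l+1)(2N_{ID}^{Ncell}+1) + 2N_{ID}^{Ncell} + N_{CP}$, and $v_{shift} = N_{ID}^{Ncell} \bmod 6$. The function names, the example cell identity, and the choice $l = 5$ are hypothetical and serve only as an illustration.

```python
import numpy as np

def gold_sequence(c_init: int, length: int, n_c: int = 1600) -> np.ndarray:
    """Length-31 Gold sequence (TS 36.211 style); returns `length` bits."""
    total = n_c + length
    x1 = [0] * (total + 31)
    x2 = [0] * (total + 31)
    x1[0] = 1                                  # fixed initial state of x1
    for i in range(31):                        # x2 is initialised from c_init
        x2[i] = (c_init >> i) & 1
    for n in range(total):
        x1[n + 31] = (x1[n + 3] + x1[n]) % 2
        x2[n + 31] = (x2[n + 3] + x2[n + 2] + x2[n + 1] + x2[n]) % 2
    return np.array([(x1[n] + x2[n]) % 2 for n in range(n_c, total)])

def nrs_symbols(n_s: int, l: int, n_cell_id: int,
                n_cp: int = 1, n_rb_max: int = 110) -> np.ndarray:
    """Two QPSK reference symbols r(m'), m' = m + n_rb_max - 1, m = 0, 1."""
    c_init = (2**10 * (7 * (n_s + 1) + l + 1) * (2 * n_cell_id + 1)
              + 2 * n_cell_id + n_cp)
    c = gold_sequence(c_init, 4 * n_rb_max).astype(float)
    r = ((1 - 2 * c[0::2]) + 1j * (1 - 2 * c[1::2])) / np.sqrt(2)
    m = np.arange(2)
    return r[m + n_rb_max - 1]

def nrs_subcarriers(n_cell_id: int, v: int) -> np.ndarray:
    """Subcarrier indices k = 6m + (v + v_shift) mod 6 for m = 0, 1."""
    v_shift = n_cell_id % 6                    # assumed cell-specific shift
    return 6 * np.arange(2) + (v + v_shift) % 6

symbols = nrs_symbols(n_s=0, l=5, n_cell_id=10)   # hypothetical parameters
positions = nrs_subcarriers(n_cell_id=10, v=0)
print(positions, symbols)
```

Because the receiver can run the same generator once the cell identity is known, the pilot values at these positions are available without any extra signalling.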

Analytical Downlink Signal Model
We assume that the LTE eNB transmits a bitstream (a block of bits) $b^{(q)}(0), \ldots, b^{(q)}(N_{bit} - 1)$, where $N_{bit}$ represents the number of bits transmitted on the NPDSCH in one subframe for a codeword. The codeword bits are scrambled before modulation with a scrambling sequence specific to the NB-IoT eNB, in order to randomize the interference in neighboring cells and to ensure that the transmissions from different cells are distinguishable by the decoder at the NB-IoT UE receiver. Hence, we get the resulting scrambled block of bits

$$\tilde{b}^{(q)}(i) = \bigl(b^{(q)}(i) + c^{(q)}(i)\bigr) \bmod 2,$$

where $c^{(q)}(i)$ is the scrambling sequence defined by a length-31 Gold sequence and $i = 0, 1, 2, \ldots, N_{bit} - 1$. The scrambling sequence is initialized at the start of each subframe, and the initialization value $c_{init}$ is selected depending on the type of NPDSCH: one initialization is used if the NPDSCH carries the BCCH, and another is used otherwise. The parameters entering these initializations include $n_{RNTI}$, the radio network temporary identifier (RNTI); $n_s$, the initial slot of the transmitted codeword; and $N_{ID}^{Ncell}$, the narrowband physical cell identity. After scrambling, the bits are modulated, and the resulting complex-valued modulation symbols are mapped onto the layers $x(i) = [x^{(0)}(i) \ldots x^{(v-1)}(i)]^T$, for $i = 0, 1, \ldots, M_{symb}^{layer} - 1$. $v$ and $M_{symb}^{layer}$ represent the number of layers and the number of modulation symbols per layer, respectively. For NB-IoT single antenna transmission, the number of layers is one (i.e., $v = 1$). Therefore, the layer mapping is finally defined, according to [36], by $x^{(0)}(i) = d^{(0)}(i)$, where $d^{(0)}(i)$ are the modulation symbols of the codeword and $M_{symb}^{layer} = M_{symb}^{(0)}$. This layer-mapped signal enters the precoder as the input block of vectors, and the precoder creates the block of vectors $y(i) = [\ldots\, y^{(p)}(i)\, \ldots]^T$, where $y^{(p)}(i)$ denotes the corresponding signal for antenna port $p$. For transmission on a single antenna port, the precoding is defined by

$$y^{(p)}(i) = x^{(0)}(i),$$

where $p \in \{0, 4, 5, 7, 8, 11, 13, 107, 108, 109, 110\}$ is the single antenna port number used for NPDSCH transmission and $i = 0, 1, \ldots, M_{symb}^{ap} - 1$, with $M_{symb}^{ap} = M_{symb}^{layer}$. The block of complex-valued symbols $y^{(p)}(0), \ldots, y^{(p)}(M_{symb}^{ap} - 1)$ is mapped onto the frequency-time (k,l) resource grid for every antenna port used for NPDSCH transmission.
The mapping starts with $y^{(p)}(0)$ in resource element (k,l) on the corresponding antenna port $p$ and proceeds in increasing order of first $k$ and then $l$. The mapping starts in the initial slot and stops at the second slot of the corresponding subframe. When the NPDSCH does not carry the BCCH, the subframe is repeated for $\min(M_{rep}^{NPDSCH}, 4) - 1$ additional subframes before the mapping of $y^{(p)}(\cdot)$ proceeds to the next subframe.
An inverse DFT (IDFT) operation is applied after the physical resource mapping to convert the frequency domain data into a time-domain signal. The time-domain continuous signal $x_l(t)$ in OFDM symbol $l$ on antenna port $p$ in an NB-IoT downlink slot is defined in [38] (Equation (10)), where $a_{k,l}^{(p)}$ is the content of resource element (k,l) on antenna port $p$. According to Release 13, only the normal cyclic prefix (CP) length $N_{CP,l}$ is allowed, which is inherited from existing LTE.
After the time-domain baseband signal is generated according to Equation (10), an RF front end is applied to upconvert the signal. This upconverted signal is transmitted through the multipath wireless fading channel, and the delay spread of this fading channel is assumed to be smaller than the added cyclic prefix length to protect against intersymbol interference. At the receiver, the transmitted signal is collected from several wireless multipath links, each with its own additive noise, and these components are combined into the received signal. The received signal is the convolution of the transmitted signal with the channel impulse response (CIR) plus additive noise, as shown below:

$$y(t) = x(t) \otimes h(t) + n(t),$$

where $y(t)$ and $n(t)$ represent the received signal and zero-mean additive white Gaussian noise (AWGN) with variance $\sigma_n^2$, respectively. The CIR of the multipath wireless fading channel with $L$ distinct complex taps is denoted by $h(t)$ and can be expressed by the following equation:

$$h(t) = \sum_{i=0}^{L-1} \beta_i\, \delta(t - \tau_i),$$

where $\tau_i$ and $\beta_i$ indicate the delay and attenuation of the fading channel on the $i$-th path, respectively. Hence, the noisy received signal can be written as a sum of delayed versions of the transmitted signal:

$$y(t) = \sum_{i=0}^{L-1} \beta_i\, x(t - \tau_i) + n(t).$$

At the receiver end, the CP is removed first, and the reverse of the NPDSCH processing is performed along with channel estimation and equalization with the help of the known NRS.
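The received-signal model described above (convolution with a tapped-delay-line CIR plus AWGN) can be sketched in discrete time as follows. The sketch assumes integer sample delays and complex tap gains; the function name `multipath_channel` and the example taps are hypothetical and are not one of the channel models used in the paper.

```python
import numpy as np

def multipath_channel(x, delays, gains, noise_var, rng):
    """Pass x through a tapped-delay-line channel and add complex AWGN.

    delays : integer sample delays tau_i (assumption: sample-spaced taps)
    gains  : complex attenuations beta_i, one per path
    """
    x = np.asarray(x, dtype=complex)
    h = np.zeros(max(delays) + 1, dtype=complex)
    for tau, beta in zip(delays, gains):
        h[tau] += beta                       # h = sum_i beta_i * delta(t - tau_i)
    y_clean = np.convolve(x, h)[: len(x)]    # x convolved with h, truncated to len(x)
    noise = np.sqrt(noise_var / 2) * (
        rng.standard_normal(len(x)) + 1j * rng.standard_normal(len(x))
    )
    return y_clean + noise, h

rng = np.random.default_rng(0)
tx = np.exp(1j * np.pi / 4) * np.ones(16)    # toy transmit samples
rx, cir = multipath_channel(tx, delays=[0, 2, 5],
                            gains=[1.0, 0.5j, 0.2], noise_var=0.01, rng=rng)
print(np.round(cir, 2))
```

With `noise_var=0` the output reduces to the sum of delayed, scaled copies of the input, which is a convenient sanity check of the model.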

NB-IoT Downlink Channel Estimation
First, we calculate the channel estimates on the subcarriers allocated in a PRB for the symbols (l = 3, 8) of a subframe that has 15 kHz subcarrier spacing and contains NRS sequences. Thereafter, we obtain the channel estimates at the other symbols in the PRB by applying linear time dimensional interpolation, which gives the complete channel estimate. NPDSCH and NRS hopping are ignored in the current work so that the technique remains applicable to all multicarrier communication schemes. In this work, the NRS (pilot) assisted channel estimation is performed by the well-known and efficient LMMSE channel estimation algorithm. Let the transmitted and received NRSs be A and B, respectively, and let the frequency response of the channel for the LMMSE estimate be Ψ. We divide the entire channel estimation process into four steps for ease of explanation. In the first step, a simplified LMMSE estimate is obtained by exploiting the properties of the transmitted NRS. In the second step, optimal rank reduction is applied to reduce the number of complex multiplications. In the third step, the autocorrelation matrix of the channel is split into a few submatrices to simplify the computation. Lastly, in the fourth step, we overlap the submatrices to improve the performance of the channel estimation technique.
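As an illustration of the conventional interpolation step, the sketch below linearly interpolates per-subcarrier channel estimates from the two NRS-bearing symbols to all symbols of a subframe. It assumes 14 OFDM symbols per subframe and holds the nearest pilot estimate constant outside the pilot span (no extrapolation); both choices, and the function name, are assumptions made for this example.

```python
import numpy as np

def interpolate_time(h_pilot, pilot_symbols=(3, 8), n_symbols=14):
    """Linear interpolation over OFDM symbols, one subcarrier at a time.

    h_pilot : array of shape (len(pilot_symbols), n_subcarriers) holding the
              complex channel estimates at the pilot-bearing symbols.
    Returns an array of shape (n_symbols, n_subcarriers).
    """
    h_pilot = np.asarray(h_pilot, dtype=complex)
    l_all = np.arange(n_symbols)
    out = np.empty((n_symbols, h_pilot.shape[1]), dtype=complex)
    for k in range(h_pilot.shape[1]):
        # Interpolate real and imaginary parts separately; np.interp holds
        # the edge values outside the pilot span.
        re = np.interp(l_all, pilot_symbols, h_pilot[:, k].real)
        im = np.interp(l_all, pilot_symbols, h_pilot[:, k].imag)
        out[:, k] = re + 1j * im
    return out

h_grid = interpolate_time([[1 + 1j, 2.0], [6 + 6j, 2.0]])
print(h_grid[:, 0])
```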

LMMSE Estimation
The channel response at the NRS locations that minimizes the mean square error (MSE) between the true and the estimated values is obtained by

$$\Psi_{LMMSE} = \Phi_{\Psi\Psi}\bigl(\Phi_{\Psi\Psi} + \sigma^2 (AA^H)^{-1}\bigr)^{-1}\Psi_{LS}, \tag{14}$$

where $\Phi_{\Psi\Psi} = E[\Psi\Psi^H]$ is the autocorrelation matrix of size $N \times N$ of the channel at the NRS positions, and $\sigma^2$ is the added noise variance of the channel, which is defined by $\beta/\mathrm{SNR}$ (where $\beta$ is a constant that depends on the signal constellation). For the NB-IoT downlink, QPSK modulation is used, and the value of $\beta$ is 1. Furthermore, $\Psi_{LS} = A^{-1}B$ is the LS estimate of the channel. When the OFDM symbol changes in A, the LMMSE estimation in Equation (14) needs to invert the matrix every time, and hence it has large computational complexity. This can be reduced by averaging the estimator over all transmitted data [30]. Therefore, we can replace $(AA^H)^{-1}$ with its expectation $E[(AA^H)^{-1}]$, which turns into an identity matrix $I_L$. The LMMSE estimation from Equation (14) then becomes

$$\Psi_{LMMSE} = \Phi_{\Psi\Psi}\bigl(\Phi_{\Psi\Psi} + \sigma^2 I_L\bigr)^{-1}\Psi_{LS}. \tag{16}$$

Now, A is no longer part of the matrix calculation, and hence the term $\sigma^2 I_L$ does not need to be re-evaluated each time the value of A changes. If $\Phi_{\Psi\Psi}$ and the SNR are set to fixed values, the matrix $\Phi_{\Psi\Psi}(\Phi_{\Psi\Psi} + \sigma^2 I_L)^{-1}$ is calculated only once; if the values of $\Phi_{\Psi\Psi}$ or the SNR change, it needs to be calculated again.
In practice, it is still very costly to evaluate Equation (16) because of the large size of the channel correlation matrix $\Phi_{\Psi\Psi}$. Hence, to overcome this high complexity, we apply the singular value decomposition (SVD) technique to obtain a low-rank channel autocorrelation matrix $\Phi_{\Psi\Psi}$ according to [30], as given below.
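A minimal numerical sketch of the simplified estimator, $\Phi_{\Psi\Psi}(\Phi_{\Psi\Psi} + \sigma^2 I)^{-1}\Psi_{LS}$ with $\sigma^2 = \beta/\mathrm{SNR}$, is given here for orientation. The exponential correlation model `exp_correlation`, the helper `simulate_pilots`, and the chosen sizes are hypothetical stand-ins for illustration; they are not the channel statistics used in the paper.

```python
import numpy as np

def exp_correlation(n, rho=0.95):
    """Hypothetical correlation model: Phi[i, j] = rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def ls_estimate(a, b):
    """LS estimate at the pilots: Psi_LS = A^{-1} B for diagonal A."""
    return b / a

def lmmse_estimate(psi_ls, phi, snr_linear, beta=1.0):
    """Simplified LMMSE: Phi (Phi + (beta / SNR) I)^{-1} Psi_LS."""
    sigma2 = beta / snr_linear
    n = len(psi_ls)
    return phi @ np.linalg.solve(phi + sigma2 * np.eye(n), psi_ls)

def simulate_pilots(rng, chol, snr_linear):
    """Draw a correlated channel, unit-energy QPSK pilots and noisy observations."""
    n = chol.shape[0]
    crandn = lambda: (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    psi = chol @ crandn()
    a = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n)))
    b = a * psi + np.sqrt(1 / snr_linear) * crandn()
    return psi, a, b

rng = np.random.default_rng(1)
snr = 10 ** (5 / 10)                       # 5 dB
phi = exp_correlation(12)
chol = np.linalg.cholesky(phi)
psi, a, b = simulate_pilots(rng, chol, snr)
psi_ls = ls_estimate(a, b)
psi_lmmse = lmmse_estimate(psi_ls, phi, snr)
print(np.mean(np.abs(psi_ls - psi) ** 2), np.mean(np.abs(psi_lmmse - psi) ** 2))
```

The weight matrix depends only on the correlation matrix and the SNR, so it can be computed once and reused for every OFDM symbol, which is the point of the averaging step.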

SVD with LMMSE Estimation
After the application of SVD, the channel autocorrelation matrix $\Phi_{\Psi\Psi}$ of Equation (16) can be written as

$$\Phi_{\Psi\Psi} = U \Sigma U^H, \tag{17}$$

where the $N$ singular values $\alpha(1) \geq \alpha(2) \geq \ldots \geq \alpha(N) \geq 0$ lie on the diagonal of the diagonal matrix $\Sigma$, and $U$ is a unitary matrix. Thereafter, by exploiting the unitary matrix property and substituting Equation (17) into Equation (16), we get

$$\Psi_{LMMSE} = U \Delta U^H \Psi_{LS}. \tag{18}$$

The optimal rank-$q$ estimator can be obtained as in [30]:

$$\Psi_q = U \begin{bmatrix} \Delta_q & 0 \\ 0 & 0 \end{bmatrix} U^H \Psi_{LS},$$

where $\Delta_q$ is the $q \times q$ upper left corner of $\Delta$. The values of the diagonal matrix $\Delta$ are defined as [30,31]

$$\Delta = \mathrm{diag}\!\left(\frac{\alpha(1)}{\alpha(1) + \beta/\mathrm{SNR}}, \ldots, \frac{\alpha(N)}{\alpha(N) + \beta/\mathrm{SNR}}\right),$$

where $\alpha(k)$, $k = 1, \ldots, N$, is the channel energy (variance) contained in the received information after $\Psi_{LS}$ is transformed by $U^H$. Since $U$ is a unitary matrix, we can regard this transformation as a rotation of the vector $\Psi_{LS}$, after which all the components are uncorrelated [39]. The desired rank of the low-rank estimator is determined by the dimension of the space of signals that are essentially limited in time and frequency. According to [40], this dimension is $2BT + 1$, where $T$ is the time interval and $B$ is the one-sided bandwidth. Accordingly, the singular values of $\Phi_{\Psi\Psi}$ become small after about $L + 1$ values, where $L$ indicates the length of the cyclic prefix ($2B = 1/T_s$, $T = LT_s$, and $2BT + 1 = L + 1$).
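The rank reduction can be sketched as follows. Because the correlation matrix is Hermitian and positive semidefinite, its SVD coincides with its eigendecomposition, so the sketch uses `numpy.linalg.eigh` and sorts the values in descending order. The function name, the exponential correlation model, and the chosen rank are assumptions for illustration only.

```python
import numpy as np

def svd_lmmse_weights(phi, snr_linear, rank, beta=1.0):
    """Rank-q weight matrix U diag(delta_1..delta_q, 0, ..., 0) U^H."""
    # For a Hermitian PSD matrix the eigendecomposition equals the SVD.
    alpha, u = np.linalg.eigh(phi)
    order = np.argsort(alpha)[::-1]            # descending singular values
    alpha, u = np.clip(alpha[order], 0.0, None), u[:, order]
    delta = alpha / (alpha + beta / snr_linear)
    delta[rank:] = 0.0                         # keep only the q strongest components
    return (u * delta) @ u.conj().T

idx = np.arange(8)
phi_demo = 0.9 ** np.abs(idx[:, None] - idx[None, :])   # hypothetical correlation
snr_demo = 10.0
w_full = svd_lmmse_weights(phi_demo, snr_demo, rank=8)
w_low = svd_lmmse_weights(phi_demo, snr_demo, rank=3)
print(np.linalg.matrix_rank(w_low, tol=1e-10))
```

With the full rank the weights reproduce $\Phi_{\Psi\Psi}(\Phi_{\Psi\Psi} + \sigma^2 I)^{-1}$ exactly; lowering the rank trades a small approximation error for fewer multiplications when the transform is applied as $U \Delta_q U^H$.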

LMMSE Estimation with SVD and Banded Technique
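Because the LMMSE technique still relies on the full channel correlation and noise weighting, the complexity reduction obtained from SVD alone is not sufficient for the low-complexity NB-IoT system. Therefore, depending on the available bandwidth, the channel vector is further subdivided to reduce the complexity, as in [29,31]. Hence, we can represent the channel vector by

$$\Psi = \bigl[\Psi_1^T, \Psi_2^T, \ldots, \Psi_R^T\bigr]^T. \tag{20}$$

If every OFDM symbol contains $L$ NRSs and the desired subvector size is $S$, then for Equation (20) we get $R = L/S$, and $\Psi_r = [\Psi_{S(r-1)+1}, \ldots, \Psi_{S(r-1)+S}]^T$ is the $r$-th subvector of $\Psi$ for $r = 1, \ldots, R$. Therefore, the correlation matrix $\Phi_{\Psi\Psi}$ from Equation (16) becomes

$$\Phi_{\Psi\Psi} = \begin{bmatrix} \Phi_{\Psi_1\Psi_1} & \cdots & \Phi_{\Psi_1\Psi_R} \\ \vdots & \ddots & \vdots \\ \Phi_{\Psi_R\Psi_1} & \cdots & \Phi_{\Psi_R\Psi_R} \end{bmatrix}.$$

The strongly correlated components of this matrix have the greatest influence on the channel estimate. The weakly correlated components, which lie farther apart than the coherence bandwidth, have only a minor impact. Hence, we can set the value of these components to zero, as shown below:

$$\Phi_{\Psi\Psi} \approx \begin{bmatrix} \Phi_{\Psi_1\Psi_1} & 0 & \cdots & 0 \\ 0 & \Phi_{\Psi_2\Psi_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Phi_{\Psi_R\Psi_R} \end{bmatrix}.$$

Electronics 2021, 10, 1246

Therefore, by keeping only the diagonal blocks, we get the approximate banded matrix as given below:

$$\tilde{\Phi}_{\Psi\Psi} = \mathrm{diag}\bigl(\Phi_{\Psi_1\Psi_1}, \Phi_{\Psi_2\Psi_2}, \ldots, \Phi_{\Psi_R\Psi_R}\bigr). \tag{23}$$

Now, from Equations (16), (18), and (23), we obtain the approximation of the LMMSE estimation:

$$\Psi_{LMMSE} \approx \tilde{\Phi}_{\Psi\Psi}\bigl(\tilde{\Phi}_{\Psi\Psi} + \sigma^2 I_L\bigr)^{-1}\Psi_{LS}. \tag{24}$$

According to the above concept, we subdivide the channel matrix into several smaller matrices, as shown in Figure 5. The $r$-th subvector $\Psi_{r,LMMSE}$ of order $S \times 1$ can be found from Equation (24) as

$$\Psi_{r,LMMSE} = \Phi_{\Psi_r\Psi_r}\bigl(\Phi_{\Psi_r\Psi_r} + \sigma^2 I_S\bigr)^{-1}\Psi_{r,LS} = U_r \Delta_r U_r^H \Psi_{r,LS},$$

where $U_r$ and $\Delta_r$ follow from applying Equations (17) and (18) to the block $\Phi_{\Psi_r\Psi_r}$, the index $r$ is defined by a floor function, and $\Psi_{r,LS}$ is the LS estimate of the $r$-th subvector, which is

$$\Psi_{r,LS} = \bigl[\Psi_{S(r-1)+1,LS}, \ldots, \Psi_{S(r-1)+S,LS}\bigr]^T.$$

Therefore, the final LMMSE estimation vector obtained with the approximate banded approach becomes

$$\Psi_{LMMSE} = \bigl[\Psi_{1,LMMSE}^T, \Psi_{2,LMMSE}^T, \ldots, \Psi_{R,LMMSE}^T\bigr]^T.$$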
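The banded approximation amounts to running a small LMMSE estimator independently on each length-S block of the LS estimate. The sketch below shows this under the assumption that L is an integer multiple of S; the function name and the example correlation model are hypothetical.

```python
import numpy as np

def banded_lmmse(psi_ls, phi, snr_linear, s, beta=1.0):
    """Block-wise LMMSE using only the S x S diagonal blocks of phi."""
    psi_ls = np.asarray(psi_ls, dtype=complex)
    n = len(psi_ls)
    if n % s:
        raise ValueError("sketch assumes L is an integer multiple of S")
    sigma2 = beta / snr_linear
    out = np.empty(n, dtype=complex)
    for start in range(0, n, s):               # r-th block covers start .. start + S - 1
        blk = slice(start, start + s)
        phi_r = phi[blk, blk]
        out[blk] = phi_r @ np.linalg.solve(phi_r + sigma2 * np.eye(s), psi_ls[blk])
    return out

idx = np.arange(16)
phi_b = 0.9 ** np.abs(idx[:, None] - idx[None, :])      # hypothetical correlation
rng_b = np.random.default_rng(3)
ls_b = rng_b.standard_normal(16) + 1j * rng_b.standard_normal(16)
est_b = banded_lmmse(ls_b, phi_b, snr_linear=10.0, s=4)
print(est_b[:4])
```

Applying a precomputed L x L weight matrix costs L² complex multiplications per estimate, whereas R blocks of size S x S cost R·S² = L·S, which is where the saving comes from.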

LMMSE Estimation with SVD and Overlap Banded Technique
The banded submatrices are less effective at their edge subcarriers, because the channel estimate there relies on comparatively weakly correlated channel statistics. To overcome this drawback, overlapped subcarriers are introduced in each small matrix, as in [29] and as depicted in Figure 6a. We estimate the channel coefficients with the submatrices in this overlapped arrangement, and hence the edge subcarrier performance is expected to improve for all submatrices. We take the overlap distance to be S/2, so that the total number of submatrices is R = (2L/S) − 1 for the overlapped strategy. Therefore, the channel autocorrelation matrix can be split into sections (b) and (c), as shown in Figure 6.
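One possible realization of the overlapped strategy is sketched below. Windows of length S are shifted by S/2, giving R = (2L/S) − 1 windows. Which outputs are kept from each window is an assumption of this sketch (the central half of every window, with the first and last windows also covering the band edges), since that detail depends on the arrangement in Figure 6. The function names are hypothetical.

```python
import numpy as np

def num_overlap_windows(n_pilots, s):
    """R = (2L / S) - 1 windows for an overlap of S / 2."""
    return 2 * n_pilots // s - 1

def overlap_banded_lmmse(psi_ls, phi, snr_linear, s, beta=1.0):
    """LMMSE on overlapping S x S blocks; central outputs of each block are kept."""
    psi_ls = np.asarray(psi_ls, dtype=complex)
    n = len(psi_ls)
    if s % 4 or n % s:
        raise ValueError("sketch assumes S divisible by 4 and L divisible by S")
    sigma2 = beta / snr_linear
    r_total = num_overlap_windows(n, s)
    out = np.full(n, np.nan, dtype=complex)
    for r in range(r_total):
        start = r * s // 2                          # windows advance by S / 2
        win = slice(start, start + s)
        phi_r = phi[win, win]
        est = phi_r @ np.linalg.solve(phi_r + sigma2 * np.eye(s), psi_ls[win])
        # Assumption: keep the middle half; edge windows also keep the band edge.
        lo = 0 if r == 0 else start + s // 4
        hi = n if r == r_total - 1 else start + 3 * s // 4
        out[lo:hi] = est[lo - start:hi - start]
    return out

idx = np.arange(16)
phi_o = 0.9 ** np.abs(idx[:, None] - idx[None, :])  # hypothetical correlation
rng_o = np.random.default_rng(5)
ls_o = rng_o.standard_normal(16) + 1j * rng_o.standard_normal(16)
est_o = overlap_banded_lmmse(ls_o, phi_o, snr_linear=10.0, s=8)
print(num_overlap_windows(16, 8), est_o[:3])
```

Every subcarrier is then estimated from a window in which it is not at the edge (except at the band edges themselves), at the cost of roughly twice as many small blocks as in the non-overlapped case.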

Summary of the Proposed Channel Estimation
The entire channel estimation process has been discussed in the preceding sections. The main goal of this work is to propose a low-complexity and low-power channel estimation technique that meets the need to balance performance and complexity in NB-IoT systems. In this work, we derive the proposed channel estimation scheme from the optimum LMMSE channel estimator. We benefit from the high accuracy of the LMMSE and, at the same time, resolve its high complexity by applying the SVD, banded, and overlap banded methods while preserving the performance at an acceptable level. To sum up, the detailed flow diagram of the proposed overlap banded SVD-based LMMSE channel estimation method is depicted in Figure 7.

Architecture of Polyphase Interpolation Filter for the Downlink NB-IoT Systems
LMMSE channel estimation is based on a low-order FIR filter [41]. The prime advantage of the low-order FIR structure is that it remains unchanged across channel conditions. In this paper, the FIR filter of the conventional interpolation filter is decomposed into polyphase components to improve the channel estimation performance and to reduce the complexity of the FIR filter design. Polyphase decomposition also realizes a low-pass filter and reduces the computational complexity of the system, and hence the required power [42,43]. The long FIR filter is split into several short sections to increase the efficiency of the interpolation filter. Each short filter has length Q = F/P, where the FIR filter order F is an integer multiple of the number of polyphase branches P, and P also acts as the upsampling factor. The upsampler inserts P − 1 zeros between consecutive values of the incoming signal x(n) [43]. Consequently, only Q of the F values held in the filter are nonzero at any instant, and these are multiplied by the interpolator filter coefficients c(0), c(P), c(2P), ..., c(F − P).

For a P-branch polyphase decomposition of an FIR filter of order F, the filter prototype can be written according to [42] as

$$H(z) = \sum_{n=0}^{F-1} c(n)\, z^{-n},$$

where z⁻¹ is the unit delay in Z-transform notation. By the symmetry and polyphase characteristics, the LMMSE filter coefficients satisfy c(n) = c(F − n − 1) [44]. Hence, the polyphase decomposition into branches, E(z), can be expressed as

$$
\begin{aligned}
E(z) ={}& c(0) + c(P)\, z^{-P} + \cdots + c(F-P)\, z^{-(F-P)} \\
&+ c(1)\, z^{-1} + c(P+1)\, z^{-(P+1)} + \cdots + c(F-P+1)\, z^{-(F-P+1)} \\
&\;\;\vdots \\
&+ c(P-1)\, z^{-(P-1)} + c(2P-1)\, z^{-(2P-1)} + \cdots + c(F-1)\, z^{-(F-1)}. \qquad (31)
\end{aligned}
$$

By polyphase theory, row r of Equation (31) reduces to z⁻ʳ H_r(z^P) [42], where H_r is the transfer function of the r-th branch. Hence, Equation (31) can be written in compact form as

$$E(z) = \sum_{r=0}^{P-1} z^{-r} H_r\!\left(z^{P}\right), \qquad H_r(z) = \sum_{q=0}^{Q-1} c(qP + r)\, z^{-q}.$$

Figure 8 represents the proposed polyphase architecture of the interpolation filter according to the above equation.

The incoming digital input stream x(n) is split across P polyphase branches (subbands) through the transfer function of each branch. Because the upsampler is placed after the filter transfer function H_r, Q multiplications and Q − 1 additions are required for each incoming sample of x(n). In the traditional interpolation filter, by contrast, the upsampler is usually placed before the transfer function, and hence PQ multiplications and P(Q − 1) additions are required.
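The equivalence between the conventional and the polyphase interpolator can be checked numerically. The following is a minimal sketch in plain Python. The function names and the example coefficients are hypothetical, and the operation count is stated per output sample of this sketch rather than taken from Table 3.

```python
def interpolate_direct(x, c, P):
    """Conventional interpolator: insert P - 1 zeros, then run the full F-tap FIR."""
    up = []
    for sample in x:
        up.append(sample)
        up.extend([0.0] * (P - 1))
    y = []
    for n in range(len(up)):
        acc = 0.0
        for k, ck in enumerate(c):
            if n - k >= 0:
                acc += ck * up[n - k]      # most of these products are zero
        y.append(acc)
    return y


def interpolate_polyphase(x, c, P):
    """Polyphase interpolator: P branches of Q = F / P taps run at the input rate."""
    F = len(c)
    assert F % P == 0, "F must be an integer multiple of P"
    Q = F // P
    # Branch r holds the coefficients c(r), c(P + r), c(2P + r), ...
    branches = [[c[q * P + r] for q in range(Q)] for r in range(P)]
    y = []
    for m in range(len(x)):
        for r in range(P):                 # commutator: one output per branch
            acc = 0.0
            for q in range(Q):
                if m - q >= 0:
                    acc += branches[r][q] * x[m - q]
            y.append(acc)
    return y


def multiplications_per_output(F, P):
    """Return (direct, polyphase) multiplications per output sample in this sketch."""
    return F, F // P


if __name__ == "__main__":
    x = [1.0, 2.0, -1.0, 0.5]
    c = [0.1, 0.3, 0.5, 0.5, 0.3, 0.1]     # symmetric: c(n) = c(F - n - 1)
    a = interpolate_direct(x, c, P=3)
    b = interpolate_polyphase(x, c, P=3)
    print(max(abs(u - v) for u, v in zip(a, b)))
    print(multiplications_per_output(len(c), 3))
```

Both functions produce the same output sequence, but the polyphase form never multiplies by the inserted zeros, so each output sample needs Q products instead of F = PQ.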

Complexity Analysis
Applying SVD to the channel matrix reduces the rank of the channel autocorrelation matrix, and hence significantly reduces the algorithm complexity in terms of multiplications and matrix inversion. Partitioning the vectors into S equally spaced subvectors allows the desired rank estimator to be applied to each subvector individually, which also reduces the bandwidth by the factor S at the cost of a certain performance loss that is also observed in our simulation. Hence, according to [40], the number of base vectors is reduced from L + 1 to L/S + 1, which decreases the complexity of the estimator enormously. In the overlap banded technique, the number of multiplications for each subvector is reduced further by a factor of S/2. Moreover, when the channel is estimated independently in each subsystem, the correlation among different subvectors is neglected, but the same MSE performance is obtained, which greatly decreases the complexity of the estimator. The calculated complexity of the LMMSE, the LMMSE based on SVD, the LMMSE based on banded SVD, and the LMMSE with overlap banded SVD is shown in Table 3. The calculation shows that LMMSE has the highest complexity and the proposed LMMSE with the overlap banded technique has the lowest.

In NB-IoT systems, memory utilization plays a vital role because of the very narrow bandwidth: an algorithm that consumes a large amount of memory is impractical to operate within only 180 kHz. Table 3 also shows that LMMSE needs the largest memory space, L. With the low-rank SVD formulation, the memory space shrinks to less than L because of the simpler matrix operation. In the proposed overlap banded SVD method, the memory space needed is S, which is much smaller than L. Furthermore, as the polyphase approach in the interpolation filter uses resampling, each path is reduced to length F/P. Consequently, the numbers of multiplications and additions are reduced to Q and Q − 1, instead of PQ and P(Q − 1) for the conventional interpolation filter, respectively.

Table 3. Complexity comparison.

Simulation Results
To validate the proposed algorithm, the 3GPP-specified LTE-based downlink NB-IoT system of release 13 is considered. Link-level simulation in MATLAB R2019a is used to compare the proposed technique with the conventional LS, LMMSE, and other derived versions of LMMSE. The performance of the downlink NB-IoT system discussed in Section 3 is evaluated with the NRS (pilot-aided) channel estimation technique, using the mean square error (MSE) and bit error rate (BER) of the channel, along with the throughput, as functions of the operating SNR. In line with the NB-IoT communication scenario, the TU 1 Hz channel model as in [21,45] and the EPA 1 Hz channel model as in [28] are used for the performance investigations. In both channel scenarios, the Doppler shift is set to 1 Hz to reflect the low mobility of NB-IoT terminals. In all investigations, the proposed overlap banded SVD LMMSE channel estimation method is evaluated with both the conventional and the proposed interpolation method and compared with the state-of-the-art methods. The perfect channel estimation scenario of LMMSE (i.e., without noise and estimation error), named theoretical LMMSE, is also included for comparison with the other channel estimators. In all simulations, the theoretical LMMSE serves as the reference bound for the MSE, BER, and instantaneous throughput.

For the link-level simulation, the system is run in single-input single-output (SISO) mode with 180 kHz system bandwidth according to the 3GPP specification [36,38]. A 900 MHz carrier frequency with a 7.68 MHz sampling rate is considered, with the CP length set to 16 samples and an FFT size of 512. One LTE PRB (12 subcarriers) is used for the NB-IoT UE, with a subcarrier spacing of 15 kHz (15 × 12 = 180 kHz). QPSK is used as the baseband modulation with modulation and coding scheme level MCS = 0, transport block size TBS = 16 bits, and 64 signal repetitions over a 256 ms transmission time. Hence, 64 repetitions of the same signal are combined when decoding the signal at the receiver. NPDSCH timing and frequency synchronization errors and hopping are not considered in our simulation, to keep the focus on channel estimation. The simulation results under the TU 1 Hz and EPA 1 Hz channels are explained and compared below.
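The numerology quoted above can be cross-checked with a small configuration object. This is a sketch only. The class and field names are hypothetical, and only the numeric values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NbIotSimConfig:
    """Link-level parameters quoted in the text (field names are illustrative)."""
    carrier_frequency_hz: float = 900e6
    sampling_rate_hz: float = 7.68e6
    fft_size: int = 512
    cp_length_samples: int = 16
    num_subcarriers: int = 12      # one LTE PRB
    repetitions: int = 64
    tbs_bits: int = 16

    @property
    def subcarrier_spacing_hz(self) -> float:
        # Subcarrier spacing follows from the sampling rate and the FFT size.
        return self.sampling_rate_hz / self.fft_size

    @property
    def occupied_bandwidth_hz(self) -> float:
        return self.num_subcarriers * self.subcarrier_spacing_hz

    @property
    def cp_duration_s(self) -> float:
        return self.cp_length_samples / self.sampling_rate_hz


if __name__ == "__main__":
    cfg = NbIotSimConfig()
    print(cfg.subcarrier_spacing_hz)   # subcarrier spacing in Hz
    print(cfg.occupied_bandwidth_hz)   # occupied bandwidth in Hz
    print(cfg.cp_duration_s)           # cyclic prefix duration in seconds
```

The derived properties confirm that a 7.68 MHz sampling rate with a 512-point FFT gives the 15 kHz subcarrier spacing, and that 12 such subcarriers occupy 180 kHz.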

Mean Square Error (MSE)
To evaluate the effectiveness of the channel estimator, we use the averaged channel MSE, defined at the n-th repetition copy as the mean squared difference between Ψ and Ψ̂, the actual and estimated channel values, for all the estimators used in this paper. All MSE curves are averaged over N = 10⁴ independent channel realizations. The simulation SNR ranges from −15 to 5 dB to represent a noisy environment and deep coverage penetration, in line with the features of NB-IoT systems. Figure 9a,b compare the MSE of the proposed channel estimation technique, for both the conventional and the proposed interpolation method, with LS, theoretical and practical LMMSE, and the other LMMSE variants for the TU 1 Hz and EPA 1 Hz channel models. Figure 9a,b show that the LS estimator has the worst MSE among all techniques and that, as expected, LMMSE has the best.

When SVD is applied to the channel matrix, the MSE performance degrades by 1.2 dB in SNR at an MSE of 10⁻³ relative to LMMSE in the TU 1 Hz channel. The reason is that the low-rank SVD estimator always has an irreducible error floor, caused by the part of the channel that lies outside the retained subspace [30]. In this case, the time and frequency dimensions of the matrix lead the channel matrix to a low-rank approximation that remains close to optimal, with only a negligible performance degradation [30]. For the banded SVD estimator, the MSE degrades by a further 2.2 dB at 10⁻³ compared to the SVD LMMSE, because less correlated components of the channel matrix are used for the edge subcarriers. The proposed overlap banded technique with conventional interpolation recovers about 1.3 dB of this degradation. With the proposed interpolation method in the proposed overlap banded LMMSE channel estimation, almost 2.1 dB of the loss is recovered, which gives almost the same MSE performance as the SVD LMMSE.

Overall, the MSE degradation of the overlap banded SVD LMMSE relative to traditional LMMSE is about 2.1 dB with traditional interpolation, in exchange for a much less computationally complex algorithm. With the proposed interpolation, the MSE loss relative to LMMSE is only 1.3 dB, and the computational complexity is reduced further. This attribute makes the proposed algorithm a better fit for NB-IoT systems than the other methods considered in this work. Similar performance is observed in the EPA 1 Hz channel. The SNR at which each estimator reaches an MSE of 10⁻³ is recorded in Table 4 for both TU and EPA channel models. Table 4 shows that the TU 1 Hz channel performs better than the EPA 1 Hz channel in deep coverage (where SNR ≪ 0 dB), and similar behavior of these two channel models is also observed in [43]. With the polyphase decomposed FIR filter interpolation method in the proposed overlap banded SVD LMMSE, almost 1 dB of additional power is saved compared to the conventional interpolation method in both TU and EPA channel scenarios. In summary, the performance penalty of the proposed overlap banded LMMSE with the proposed interpolation method relative to the optimum LMMSE is only 1.3 dB and 1.1 dB for the TU 1 Hz and EPA 1 Hz channels, respectively, which is nominal in NB-IoT scenarios. Twenty subvectors (S = 20) were considered in this simulation for the overlap banded SVD LMMSE.
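The dB figures quoted above for the TU 1 Hz channel at an MSE of 10⁻³ can be cross-checked with simple arithmetic. In the sketch below the variable and function names are illustrative, and only the four quoted losses and gains come from the text.

```python
# SNR losses and gains (in dB) at an MSE of 1e-3, TU 1 Hz channel, as quoted in the text.
SVD_LOSS_DB = 1.2                   # SVD LMMSE relative to LMMSE
BANDED_EXTRA_LOSS_DB = 2.2          # banded SVD relative to SVD LMMSE
OVERLAP_GAIN_CONVENTIONAL_DB = 1.3  # recovered by overlap banding, conventional interpolation
OVERLAP_GAIN_POLYPHASE_DB = 2.1     # recovered by overlap banding, proposed interpolation


def net_loss_db(recovered_db):
    """Net SNR penalty relative to LMMSE after recovering `recovered_db` of the loss."""
    return SVD_LOSS_DB + BANDED_EXTRA_LOSS_DB - recovered_db


if __name__ == "__main__":
    conventional = net_loss_db(OVERLAP_GAIN_CONVENTIONAL_DB)
    polyphase = net_loss_db(OVERLAP_GAIN_POLYPHASE_DB)
    print(round(conventional, 1))              # penalty with conventional interpolation
    print(round(polyphase, 1))                 # penalty with the proposed interpolation
    print(round(conventional - polyphase, 1))  # saving from the proposed interpolation
```

The two net penalties agree with the 2.1 dB and 1.3 dB totals stated in the text, and their difference of 0.8 dB is the saving described as almost 1 dB.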
Figure 10a,b compare the MSE performance of the proposed overlap banded LMMSE technique with the traditional and proposed interpolation methods for different numbers of subvectors (i.e., S = 10, 20, 30, and 40), to determine the optimum operating region of the channel estimator for the TU and EPA channels, respectively. The number of subvectors is chosen according to the available bandwidth in the NB-IoT downlink. The MSE performance improves as the number of subvectors increases, up to a certain level. In each scenario, the proposed overlap banded SVD LMMSE channel estimation method with the proposed polyphase decomposed FIR filter interpolation exhibits better performance.

Bit Error Rate (BER)
Figure 11a,b compare the BER of the proposed technique with LS, traditional LMMSE, and the other LMMSE variants, with both the traditional and the proposed interpolation scheme, for the TU 1 Hz and EPA 1 Hz channels, respectively. As expected, the LS estimator gives the worst and LMMSE the best BER performance. When the low-rank SVD is adopted, a minimal degradation of 0.80 dB relative to LMMSE is observed at a BER of 10⁻² in Figure 11a for the TU channel. For the banded SVD LMMSE, the performance degrades further, by about 3.3 dB, because a weakly correlated channel matrix is used for the edge subcarriers. With the overlap banded technique and the traditional interpolation method, the performance improves by about 1.3 dB in SNR at 10⁻². With the proposed interpolation scheme, the performance improves by a further 1.2 dB, which is almost the same as the SVD LMMSE. This is the distinguishing feature of the proposed method: it strikes a good balance among performance, complexity, and power consumption, which is important for NB-IoT systems. Similar performance is observed for the EPA 1 Hz channel in Figure 11b. The SNR at which each estimator reaches a BER of 10⁻² is recorded in Table 5 for both TU and EPA channels. Table 5 shows that an additional 1.2 dB and 0.9 dB of power can be saved by applying the proposed interpolation in the proposed overlap banded SVD LMMSE method for the TU 1 Hz and EPA 1 Hz channels, respectively. In summary, the performance loss of the proposed overlap banded SVD LMMSE with the proposed polyphase decomposed FIR filter interpolation method relative to the optimum LMMSE is only 0.8 dB and 0.9 dB for the TU 1 Hz and EPA 1 Hz channels, respectively, which is acceptable for the NB-IoT network. These results are obtained with twenty subvectors (S = 20) for the overlap banded SVD LMMSE.

Figure 12a,b show the BER performance of the proposed overlap banded LMMSE channel estimation technique with the traditional and proposed interpolation methods for different numbers of subvectors (i.e., S = 10, 20, 30, and 40), to determine the optimum operating region of the channel estimator for the TU 1 Hz and EPA 1 Hz channels, respectively. The BER performance improves as the number of subvectors increases, up to a certain level, and the overlap banded SVD LMMSE method always performs better with the proposed interpolation method than with the traditional interpolation technique.

Throughput Analysis
The instantaneous throughput of the downlink NB-IoT UE as a function of the operating SNR is also studied in our simulation. Here, the throughput is defined as the number of successfully received bits over the entire transmission period. Figure 13a,b compare the throughput of the different channel estimators used in this paper for the TU 1 Hz and EPA 1 Hz channels, respectively. The maximum sustainable throughput in the NB-IoT downlink is 26.5 kbps [15,46,47]. In Figure 13a for the TU 1 Hz channel, the traditional LMMSE estimator reaches 25 kbps at 15 dB SNR, which is close to the target downlink throughput, but its high complexity and excessive power consumption make it unsuitable for low-complexity, low-power NB-IoT systems. When the SVD technique is applied, the throughput degrades slightly (to 24 kbps), which is still close to the target data rate. Because the banded SVD LMMSE estimates the channel from less correlated channel statistics, its throughput degrades markedly (to 19 kbps). When the overlap banded technique is applied with the traditional interpolation method to compensate for this loss, the throughput becomes 22 kbps. The proposed overlap banded SVD LMMSE channel estimation method with the proposed polyphase decomposed FIR filter interpolator increases the throughput to 24 kbps, which is almost the same as the SVD LMMSE technique. At very low SNR, the same signal is transmitted repeatedly to extend coverage, so the transmission time becomes longer. Consequently, the throughput at low SNR values (e.g., at −10 dB SNR) drops to a few kbps.

Similar performance is observed for the EPA 1 Hz channel in Figure 13b, with an average throughput degradation of 1 kbps compared to the TU 1 Hz channel. Nevertheless, Figure 13a,b make it clear that the throughput of the proposed overlap banded SVD LMMSE technique with the proposed polyphase decomposed interpolation method is much better than that of the traditional LS estimator and of the banded and overlap banded SVD LMMSE methods with the traditional interpolation method. Figure 14a,b show the throughput of the proposed channel estimation technique for different numbers of subvectors for the TU and EPA channels, respectively. In summary, the largest number of subvectors gives the best throughput performance.

Conclusions
In this paper, a brief overview of the NB-IoT downlink system, along with NRS signal generation and mapping, is presented. An analytical signal model for the downlink NB-IoT systems is derived that accounts for channel impairments and multipath fading. To validate the operation of low-power, low-complexity NB-IoT systems, two channel models (TU 1 Hz [21,45] and EPA 1 Hz [28]) are used, which are widely recognized for the design and validation of low-power communication systems. LMMSE is then used, followed by the application of the optimal rank reduction formula (SVD) to the channel matrix to obtain a low-rank estimator. Next, the channel matrix is subdivided into several submatrices to reduce the complexity further. Finally, a reduced-complexity LMMSE channel estimation technique, named overlap banded SVD LMMSE, is proposed for the NB-IoT downlink systems by overlapping the submatrices. In addition, to further improve the performance and complexity of the proposed channel estimation, a new interpolation method is presented that applies polyphase decomposition to the FIR filter of the traditional interpolation method.

Theoretical derivation and link-level simulations show that the MSE, BER, and instantaneous throughput versus SNR of the proposed overlap banded SVD LMMSE technique with the proposed polyphase decomposed FIR filter interpolation method are close to those of the LMMSE method, and that it significantly outperforms both the proposed overlap banded SVD LMMSE channel estimator with the traditional interpolation scheme and the LS estimator for the TU 1 Hz and EPA 1 Hz channel scenarios. Furthermore, the complexity calculation and the required memory space confirm that the overlap banded SVD LMMSE channel estimator with the proposed interpolation scheme is less complex than the LMMSE approach. Hence, it offers the expected balance between performance and complexity, which is crucial to the operation of NB-IoT systems, and this makes the proposed channel estimation method preferable to the other related schemes for NB-IoT systems.

In conclusion, at the cost of a negligible performance loss relative to the optimum LMMSE channel estimator (an MSE loss of only 1.3 dB and 1.1 dB, and a BER loss of 0.8 dB and 0.9 dB, for the TU 1 Hz and EPA 1 Hz channels, respectively), the proposed estimator, with its very low complexity and small memory demand, is suitable for implementation in low-power, low-complexity NB-IoT systems. In future studies, the authors will conduct more link-level simulations under different AWGN channel models and will focus on LMMSE-based equalization for the NB-IoT downlink systems.