A Survey on Optimal Channel Estimation Methods for RIS-Aided Communication Systems

Abstract: Next-generation wireless communications aim to utilize the mmWave/subTHz bands. In this regime, signal propagation is vulnerable to interference and path loss. To overcome this issue, a novel technology called the reconfigurable intelligent surface (RIS) has been introduced. An RIS digitally controls the reflected signals using large arrays of passive reflecting elements and thereby implements a smart, reconfigurable radio environment for wireless communications. Nonetheless, channel estimation remains the main problem of RIS-assisted systems because it depends directly on the system architecture design, the transmission channel configuration and the methods used to compute channel state information (CSI) at the base station (BS) and the RIS. In this paper, we provide a concise survey of up-to-date RIS-assisted wireless communications, covering massive multiple-input multiple-output (mMIMO), multiple-input single-output (MISO) and cell-free systems, with an emphasis on effective algorithms for computing CSI. In addition, we assess the effectiveness of these algorithms for different communication systems and techniques, and we highlight the most important ones.


Introduction
The main goal of the sixth generation (6G) wireless technologies is to upgrade communication quality. Since the number of mobile devices is rapidly increasing, the issue concerning the efficient propagation of data at their distribution nodes unavoidably arises. For this reason, it is essential to study the improvement of bandwidth, system capacity and reliability, as well as quality of service (QoS). The percentage of energy consumption, carbon footprint and hardware costs are also considered [1]. In addition, the mmWave/subTHz band is already used in current research activity, which provides a large gain in the performance of 6G technology communication systems, although it is quite sensitive to physical obstacles and/or propagation attenuation effect. To avoid the above problems, the technology of RISs has emerged [2].
Specifically, an RIS is a software-defined surface, typically connected to a dedicated controller, that mitigates blockage effects. It consists of many individually controllable, low-cost, quasi-passive elements. The RIS adapts to changes in the propagation environment and shapes the radio waves: each element applies an adjustable phase shift to the impinging signal, which enables dynamic control over the wireless propagation channel [3]. Therefore, the RIS does not need complex signal processing units, yet it can create a virtual path between the transmitter and receiver. Hence, throughput is improved and higher performance is achieved [2]. The contribution of the RIS has been demonstrated in several ways: in [4], the transmit power at the BS is minimized under signal-to-interference-plus-noise ratio (SINR) constraints in multi-user MISO (MU-MISO) RIS-assisted systems through the joint optimization of the precoder and the reflecting beamforming matrices (RBMs); in [5], the sum rate is maximized under a transmit power constraint; in [6], the sum rate is improved while accounting for correlated Rayleigh fading and the usual hardware impairments in both the transceiver and the RIS; in [7], the maximization of the minimum user equipment (UE) rate is investigated for a large number of antennas; in [8], the effect of hardware impairments is assessed; and in [9], the effect of imperfect CSI on the outage probability is demonstrated. Furthermore, the contribution of RIS technology to energy-efficient transmission in internet of vehicles (IoV) networks [10] and in passive metasurface-coated devices [11] has been discussed.
In addition, a new RIS structure has been proposed that both reflects and absorbs the transmitted signal. Its absorbing part is equipped with radio frequency chains, so it can perform signal processing and configure its own phase shift matrix without external control of the incoming signal [2]. This structure is called a hybrid RIS (HRIS) and has been studied in [12][13][14][15]. In [16], a sparse antenna activation scheme based on a channel extension strategy is proposed for an HRIS, together with a full one-hop channel optimization and a beam searching scheme designed for channel subsampling.
Beamforming (i.e., precoding in transmission and combining in reception) is a key signal processing technique to ensure reliable communication between the transmitter and receiver. Although conventional MIMO systems widely consider single-stage all-digital beamforming, two critical challenges for mMIMO systems arise: high hardware costs due to a single dedicated radio frequency (RF) chain per antenna and large channel estimation overheads [17]. In several studies, such as [18][19][20][21], the hybrid beamforming (HBF) solution is proposed, which generates the RF stage through slow time-varying angular information and the digital baseband (BB)-stage through reduced-dimensional CSI. HBF has been studied, mainly in RIS-assisted mMIMO systems, to achieve the requirements for full CSI estimation [22,23]. In [22], the spectral performance is improved by the joint optimization of the RIS phase shift and hybrid precoder/compressor two-stage algorithm, while cost and energy consumption are reduced. In [24], the HBF scheme is used in a mmWave RIS-assisted broadband MIMO system with geometric mean beamforming decomposition in order to avoid complex bit/power allocation, while the authors in [23], designed a joint RIS phase shift matrix and HBF with a direct channel between the transmitter and receiver.
To make use of the advantages promised by the RIS, accurate CSI is essential, although it is difficult to acquire because the RIS lacks complex signal processing capabilities [25]. Furthermore, the RIS consists of passive components that cannot process signals, so the UE-RIS and RIS-BS channels cannot be estimated separately but only jointly (i.e., as the cascaded end-to-end channel). When there are many BS antennas and RIS reflectors, the number of cascaded channel coefficients is large, and so is the number of required pilots [26]. This poses a problem for channel estimation in any communication system. Other factors that affect channel estimation are the distance of the users from the RIS, the type of communication system and, above all, the estimation algorithm itself. No single algorithm can be assumed to be the best for every RIS-assisted system architecture, because each system has different requirements. Hence, there have been significant contributions on channel estimation for RIS-assisted communication systems, such as those in [1,4,17,26,27].
The purpose of the current research is to provide an inclusive and up-to-date survey of papers in RIS-assisted wireless communications, such as MIMO and MISO system communications. Emphasis is given on the practical challenges of BS, RIS and user channels to estimate their value using optimized algorithms with different channel models and system configurations. This paper presents several methods to optimally solve critical issues of channel estimation, such as pilot overhead reduction. The complexity of these methods is presented and the communication systems in which they can be used are selected according to their positive outcome. In addition, the challenges of the proposed methods are mentioned in order to be handled in future research.
The rest of the paper is organized as follows: Section 2 presents system models for channel estimation in RIS-assisted systems with different system architectures. Section 3 presents optimal channel estimation algorithms using statistical CSI (S-CSI) or instantaneous CSI (I-CSI). Section 4 presents the results of the proposed optimal algorithms per communication system, and some concluding remarks are drawn in Section 5.
Notations: $x$ is a scalar, $\mathbf{x}$ is a vector, $\mathbf{X}$ represents a matrix, and $\mathcal{A}$ denotes a set. $\|\mathbf{a}\|_0$, $\|\mathbf{a}\|_2$, $\|\mathbf{A}\|_{l_1}$ and $\|\mathbf{A}\|_F$ denote the $\ell_0$, $\ell_2$, $\ell_2$ and Frobenius norms, respectively. $\otimes$ is the Kronecker product. $[\mathbf{a}]_i$, $[\mathbf{A}]_{i,j}$ and $[\mathcal{A}]_{i,j,k}$ are the $i$-th element of vector $\mathbf{a}$, the $(i, j)$-th element of matrix $\mathbf{A}$ and the $(i, j, k)$-th element of tensor $\mathcal{A}$, respectively. The notations $(\cdot)^T$, $(\cdot)^H$ and $\mathrm{tr}(\cdot)$ stand for the transpose, Hermitian transpose and trace operators. $\mathbb{E}[\cdot]$ is the expectation operator and $\mathrm{diag}(\mathbf{a})$ denotes an $n \times n$ diagonal matrix that consists of the elements of vector $\mathbf{a}$. $\mathbf{b} \sim \mathcal{CN}(\mathbf{0}, \boldsymbol{\Sigma})$ denotes a circularly symmetric complex Gaussian vector with zero mean and covariance matrix $\boldsymbol{\Sigma}$. $\lambda$ is the wavelength, $\omega$ is the discount factor, $\mathcal{S}$ is the finite set of states, $\mathcal{A}$ is the finite set of actions, $\theta$ is the vector of policy parameters and $\theta_{\mathrm{old}}$ denotes the vector of policy parameters before the update. $\pi_\theta$ is the policy neural network with parameters $\theta$, with $\pi : \mathcal{S} \times \mathcal{A} \to [0, 1]$; $a_t$ and $s_t$ are the action and state at time step $t$, respectively, with $a_t \sim \pi_\theta(a_t | s_t)$; $r : \mathcal{S} \to \mathbb{R}$ is the reward function. $\phi_n$ is the phase shift introduced by the $n$-th element of the RIS, $\upsilon$ is the velocity of the user, $c = 3 \times 10^8$ m/s is the speed of light and $f_c$ is the carrier frequency. Finally, $\mathbf{I}_M$ stands for the $M \times M$ identity matrix.
Table 1 summarizes the latest research on channel estimation in different RIS-assisted system settings. We noticed that most research has been carried out in narrowband for one or more users and channel estimation has been accomplished in cascaded channels. For the remaining categories, more research is needed.

System Models for RIS-Assisted Systems
From Table 1, we observe that there are many different methods for estimating the channel in RIS-assisted MISO and MIMO systems. For this reason, we present below additional research on deep learning techniques, high user mobility scenarios, estimation of cascaded channels, the use of neural networks and zero-forcing (ZF) detection techniques. We focus on these scenarios because most of the literature uses and develops them. In addition, we present other methods, such as tensor algorithms, which are used in more recent literature.

MISO Systems
The authors in [71,72] use S-CSI to design passive beamformers for MISO systems, and a purely S-CSI-based approach for the design of RIS-assisted multi-user downlink systems is proposed in [4]. Furthermore, in [73], the authors use a tensor algorithm that capitalizes on the parallel factor (PARAFAC) decomposition to create an efficient iterative algorithm, based on alternating least squares (ALS), for channel estimation in the downlink of a MISO network. To exploit S-CSI, an approximate expression of the sum rate is established and used to formulate a joint active-passive beamforming design problem. Because of the non-convex constraint, the objective function is intractable, and the BS transmit beamformer and the RIS passive beamformer are coupled in a complicated way. In [74], this optimization problem is solved with the proximal policy optimization (PPO) algorithm [27], a reinforcement learning (RL) algorithm from the actor-critic family that has been reported to outperform other actor-critic algorithms. In the following subsection, we present the system and channel models for the considered RIS-assisted MISO communication systems, which are similar to those in [71][72][73][74].

System Model
The system and channel models for the considered RIS-assisted MISO communication systems are similar to those in [74]. Specifically, Figure 1 depicts an RIS-assisted MU-MISO system in which a BS with $M$ antennas communicates with $K$ users, and the RIS is equipped with $J$ reflecting elements. The $j$-th reflecting element of the RIS has a reflection coefficient $\xi_j$, and the reflection matrix of the RIS panel is $\boldsymbol{\Xi} = \mathrm{diag}(\xi_1, \ldots, \xi_J)$, where $\xi_j = e^{\mathrm{j}\phi_j}$ with $\phi_j$ denoting the phase shift of the $j$-th element of the RIS. All channel models follow the Rician distribution [74]. $\mathbf{h}_{k,0} \in \mathbb{C}^{M \times 1}$ is the direct channel from the BS to the $k$-th UE, $\mathbf{H}_1 \in \mathbb{C}^{J \times M}$ is the channel from the BS to the RIS and $\mathbf{h}_{k,2} \in \mathbb{C}^{J \times 1}$ is the channel from the RIS to the $k$-th UE. The corresponding effective channel from the transmitter to the $k$-th UE is $\mathbf{h}_k^T \triangleq \mathbf{h}_{k,0}^T + \mathbf{h}_{k,2}^T \boldsymbol{\Xi} \mathbf{H}_1$ [74].
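To make the effective channel concrete, the following minimal sketch evaluates $\mathbf{h}_k^T = \mathbf{h}_{k,0}^T + \mathbf{h}_{k,2}^T \boldsymbol{\Xi} \mathbf{H}_1$ for given channel coefficients and RIS phase shifts. The function names and the nested-list data layout are illustrative assumptions and are not taken from [74].

```python
import cmath


def ris_reflection_coefficients(phases):
    """Unit-modulus reflection coefficients xi_j = exp(j * phi_j)."""
    return [cmath.exp(1j * phi) for phi in phases]


def effective_channel(h_direct, h_ris_ue, H_bs_ris, phases):
    """Return h_k^T = h_{k,0}^T + h_{k,2}^T Xi H_1 as a list of M complex values.

    h_direct : list of M complex values (direct BS-UE channel h_{k,0})
    h_ris_ue : list of J complex values (RIS-UE channel h_{k,2})
    H_bs_ris : J x M nested list (BS-RIS channel H_1)
    phases   : list of J phase shifts phi_1..phi_J in radians
    """
    xi = ris_reflection_coefficients(phases)
    num_antennas = len(h_direct)
    num_elements = len(h_ris_ue)
    return [
        h_direct[m]
        + sum(h_ris_ue[j] * xi[j] * H_bs_ris[j][m] for j in range(num_elements))
        for m in range(num_antennas)
    ]


if __name__ == "__main__":
    # Hypothetical toy values with M = 2 antennas and J = 2 RIS elements.
    h = effective_channel(
        [1 + 0j, 0j], [1 + 0j, 1j], [[1, 0], [0, 1]], [0.0, -cmath.pi / 2]
    )
    print(h)
```

Because $\boldsymbol{\Xi}$ is diagonal, the matrix product reduces to one weighted sum per BS antenna, which is why no matrix library is needed here.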
The channel vector $\mathbf{h}_{k,2}$ between the RIS and the $k$-th UE follows the Rician model of [74][75][76], where $\delta_{k,2}$ is the distance-dependent path-loss factor and $\kappa_{k,2}$ is the Rician factor between the RIS and the $k$-th UE. The line-of-sight component is determined by the azimuth and elevation angles of departure (AoD) from the RIS to the $k$-th UE through the array response vector at the RIS side. The RIS is assumed to be a uniform planar array (UPA), where $D$ is the distance between adjacent elements and $J = J_H J_V$, with $J_H$ and $J_V$ the numbers of elements along the horizontal and vertical axes, respectively [74].
The channels between the BS and the RIS, and the direct one from the BS to the $k$-th receiver, are modeled in the same way [75][76][77], where $\delta_1$ is the distance-dependent path-loss and $\kappa_1$ is the Rician factor of the BS-RIS link, while $\delta_{k,0}$ and $\kappa_{k,0}$ stand for the distance-dependent path-loss and the Rician factor of the link between the BS and the $k$-th UE, respectively [74]. In addition, $\phi^{(\mathrm{RIS})}$ and $\psi^{(\mathrm{RIS})}$ are the azimuth and elevation angles of arrival (AoA) at the RIS, and $\phi^{(\mathrm{BS})}$ and $\psi^{(\mathrm{BS})}$ are the azimuth and elevation AoD from the BS in the direction of the RIS. Similarly, the direct link is characterized by the azimuth and elevation AoD from the BS in the direction of the $k$-th UE [74].
Furthermore, there is no spatial correlation among the antennas, and the entries of the non-line-of-sight (NLoS) components $\tilde{\mathbf{h}}_{k,2}$, $\tilde{\mathbf{h}}_{k,0}$ and $\tilde{\mathbf{H}}_1$ of the channels are independent and identically distributed complex Gaussian random variables with zero mean and unit variance [74].
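The Rician structure described above (a deterministic component built from an array response vector, weighted by the Rician factor, plus an i.i.d. NLoS component, all scaled by the path-loss) can be sketched as follows. The exact expressions of [74] are not reproduced here: the square-root weighting and the UPA phase convention below are the common textbook forms and should be read as assumptions, as should all function and parameter names.

```python
import cmath
import math
import random


def upa_response(j_h, j_v, azimuth, elevation, spacing_over_lambda=0.5):
    """Array response of a J_H x J_V uniform planar array.

    Assumed convention: the element at grid position (x, y) has phase
    2*pi*(D/lambda)*(x*sin(azimuth)*sin(elevation) + y*cos(elevation)).
    """
    response = []
    for x in range(j_h):
        for y in range(j_v):
            phase = 2 * math.pi * spacing_over_lambda * (
                x * math.sin(azimuth) * math.sin(elevation)
                + y * math.cos(elevation)
            )
            response.append(cmath.exp(1j * phase))
    return response


def rician_channel(los, path_loss, rician_factor, rng):
    """Draw one Rician channel vector (assumed textbook form).

    h = sqrt(delta) * (sqrt(k/(1+k)) * los + sqrt(1/(1+k)) * nlos),
    where the NLoS entries are i.i.d. CN(0, 1).
    """
    los_weight = math.sqrt(rician_factor / (1 + rician_factor))
    nlos_weight = math.sqrt(1 / (1 + rician_factor))
    amplitude = math.sqrt(path_loss)
    channel = []
    for a in los:
        # CN(0, 1): real and imaginary parts each have variance 1/2.
        nlos = complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))
        channel.append(amplitude * (los_weight * a + nlos_weight * nlos))
    return channel


if __name__ == "__main__":
    rng = random.Random(0)
    los = upa_response(4, 4, azimuth=0.3, elevation=1.1)
    h_ris_ue = rician_channel(los, path_loss=1e-3, rician_factor=10.0, rng=rng)
    print(len(h_ris_ue))
```

A large Rician factor makes the channel almost deterministic, which is the regime in which statistics-based (S-CSI) designs are most informative.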
Finally, the received signal $y_k \in \mathbb{C}$ of the $k$-th UE is expressed in [74] in terms of the allocated power $P_k$, the signal $x_k$ and the beamforming vector $\mathbf{f}_k$ of the $k$-th UE, and the circularly symmetric complex additive Gaussian noise $n_k \in \mathbb{C}$ with $\mathbb{E}[n_k n_k^H] = \sigma^2$.

Problem Formulation
In the problem formulated by the authors in [74], it is assumed that the transmitter is not able to acquire instantaneous CSI. Therefore, the sum-rate maximization problem is based on the channel statistics, i.e., the angle information and the Rician factors, obtained through the feedback link [78]. Following the mathematical calculations, the authors in [74] arrive at an optimization problem whose objective is a function of the distance-dependent path loss, the Rician factors and other channel statistics, with the small-scale fading averaged out, and they solve it using the proposed PPO algorithm.

MIMO Systems
Studies in [1][2][3]17,25,26,69,70,79,80] focus on creating optimal channel estimation algorithms in RIS-assisted mMIMO and MIMO systems. In [3], the realistic characterization of the achievable downlink of the RIS-assisted mMIMO systems is presented, accounting for user mobility and I-CSI under correlated Rayleigh fading conditions, when regularized ZF (RZF) precoding is applied. In [25], a two-stage strategy is used for the estimation of cascaded uplink channels without using a standard user for the RIS-assisted MU mmWave system. In [26], an efficient three-stage channel estimation method with a low pilot overhead is presented in an RIS-assisted single-antenna MU mmWave communication system, and the BS, RIS and UE are equipped with UPA. In [17], an RIS angular-based hybrid beamforming (AB-HBF) system requires low CSI overhead for the mmWave mMIMO system with a mmWave channel model based on 3D geometry of three stages: (i) RF beamformers, (ii) BB precoder/combiner, and (iii) RIS phase shift design. In [1], the authors use a tensor algorithm for the joint estimation of involved channels and imperfections in an RIS-assisted MIMO system. In [69,70], the authors create phase shift design schemes in uplink with linear ZF detection in the receiver side of an RIS-assisted MIMO system. Furthermore, in [81,82], the authors develop simple iterative and closed-form channel estimation algorithms based on the PARAFAC modeling of the single-user MIMO scenario. Finally, in [79,80], the researchers focus on a cell-free RIS-assisted system architecture where channel estimation is achieved sequentially. Next, we present a cell-free and a cell RIS-assisted mMIMO mmWave system.

Cell-Free Communication System
• Scenario Caption and Signal Model
Figure 2 illustrates an RIS-assisted single user cell-free communication system. Following [80], a time division duplex (TDD) mMIMO system is assumed, where $M$ BSs, assisted by $J$ RISs, serve $K$ UEs. The $m$-th BS is equipped with $N_m$ antennas and the $j$-th RIS has $N_j$ reflective elements, where $m = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$. In this model, the RISs function as antenna arrays located far away from the BSs, improving capacity and providing greater coverage at a low cost in order to enhance cell-free communications. In addition, a central processing unit (CPU) facilitates the joint signal processing of multiple BSs. The RISs are controlled by the BSs or the CPU over a wired link.

Quasi-static block-fading channels are considered. The channel matrix from the $m$-th BS to the $j$-th RIS is $\mathbf{F}_{mj} \in \mathbb{C}^{N_j \times N_m}$, and the channel from the $j$-th RIS to the $k$-th UE is $\mathbf{h}_{jk}^H \in \mathbb{C}^{1 \times N_j}$. For the $j$-th RIS, $\upsilon_{j,l} = \rho_{j,l} e^{\mathrm{j}\theta_{j,l}}$ is the reflection coefficient of the $l$-th element, where $\rho_{j,l} = 1$ and $\theta_{j,l}$ are the amplitude and phase, respectively; these coefficients are collected in the reflection vector $\mathbf{v}_j$, and $\mathbf{V}_j$ denotes the corresponding diagonal reflection matrix. The channel from the $m$-th BS to the $k$-th UE via the $j$-th RIS is $\mathbf{h}_{jk}^H \mathbf{V}_j \mathbf{F}_{mj}$. Additionally, all RISs have the same number of reflection elements, i.e., $L_1 = L_2 = \ldots = L_N = L$. Because the system operates in TDD mode, the downlink CSI can be obtained from the uplink channel estimate. To estimate the uplink channel, the orthogonal pilot sequences and the reflection coefficients used for the channel measurement must be designed. The training period consists of $Q$ sub-frames for the RISs, and each sub-frame contains $T$ symbol durations, where $T \geq K$ [80].
The reflection vector of the $j$-th RIS in the $q$-th sub-frame is $\mathbf{v}_j^q$. The pilot sequences of different UEs are mutually orthogonal, and $\sigma_\rho^2$ is the transmit power of each UE. The channel that directly links the BS and the UE can be estimated with all RISs disabled, and its estimation is not considered here. In the $q$-th sub-frame, the received signal at the $m$-th BS is $\mathbf{Y}_{m,q} \in \mathbb{C}^{N_m \times T}$, and $\mathbf{W}_{m,q} \in \mathbb{C}^{N_m \times T}$ represents the additive Gaussian noise following $\mathcal{CN}(\mathbf{0}, \sigma_j^2 \mathbf{I}_{N_m})$ [80]. In [80], since orthogonal pilot sequences are employed and low-mobility scenarios are considered, the observation $\tilde{\mathbf{y}}_{m,q,k}$ associated with the $k$-th UE is extracted from $\mathbf{Y}_{m,q}$, with a correspondingly scaled noise term $\tilde{\mathbf{W}}_{m,q}$. The signals of the $Q$ sub-frames are then collected by row stacking.
• Channel Model of Cell-Free Communication System
In [80], the BSs and RISs are equipped with uniform linear arrays (ULAs), and the physical channels are described by a geometric multipath model, where $P_{f,m,j}$ and $P_{h,j,k}$ are the numbers of paths of $\mathbf{F}_{mj}$ and $\mathbf{h}_{jk}$, respectively, and $\beta_{m,j,p}$ and $\gamma_{j,k,b}$ are the complex gains of the $p$-th and $b$-th paths of the two channels, respectively. Additionally, $\phi_{m,j,p}$ and $\phi_{j,k,b}$ are the AoDs from the $m$-th BS to the $j$-th RIS and from the $j$-th RIS to the $k$-th UE, respectively, and $\theta_{m,j,p}$ is the AoA at the $j$-th RIS of the path from the $m$-th BS. $d_{\mathrm{BS}} = d_{\mathrm{RIS}} = d$ is the inter-element spacing. $\mathbf{a}_D(\cdot)$ and $\mathbf{a}_L(\cdot)$ are the ULA steering vectors of the BSs and RISs, respectively, defined without loss of generality for an array of $L$ antenna elements.
Figure 3 shows an RIS-assisted downlink cell mMIMO communication system [3], where the BS has $M$ antennas and communicates with $K$ single-antenna non-cooperative UEs located behind obstacles. The RIS has $N_j$ reflecting elements, where $j = 1, 2, \ldots, N$, and is in the LoS of the BS to assist the communication with the UEs. The BS and the RIS are assumed to be installed at a high altitude and at fixed locations. The size of each RIS element is $d_H \times d_V$, where $d_V$ and $d_H$ stand for its vertical height and its horizontal width, respectively. The model also accounts for the potential presence of direct links between the BS and the UEs.


• Channel Model
A quasi-static fading model is considered in which the coherence bandwidth is larger than the channel bandwidth. A standard block-fading model is adopted, where each coherence interval/block consists of $\tau_c = B_C T_C$ channel uses, with $B_C$ the coherence bandwidth (Hz) and $T_C$ the coherence time (s). Within each coherence block and during the $n$-th time slot, let $\mathbf{H}_1 = [\mathbf{h}_{11}, \ldots, \mathbf{h}_{1N_j}] \in \mathbb{C}^{M \times N_j}$ be the LoS channel between the BS and the RIS, and let $\mathbf{h}_{2k,j} \in \mathbb{C}^{N_j \times 1}$ be the channel between the RIS and the $k$-th UE at the $n$-th time instant. Here, $\mathbf{h}_{1i}$, for $i = 1, \ldots, N_j$, is the $i$-th column vector of $\mathbf{H}_1$, and $\mathbf{g}_{k,j} \in \mathbb{C}^{M \times 1}$ is the direct channel between the BS and the $k$-th UE at the $n$-th time instant. Most existing studies, e.g., [4,5], assume independent Rayleigh fading, but correlated fading appears in practice and changes the performance [83]. Accordingly, $\mathbf{h}_{2k,j}$ and $\mathbf{g}_{k,j}$ are described by correlated Rayleigh fading distributions as
$$\mathbf{h}_{2k,j} = \sqrt{\beta_{h_{2k,j}}}\, \mathbf{R}_{\mathrm{RIS},k}^{1/2}\, \mathbf{q}_{2k,j}, \quad (13)$$
$$\mathbf{g}_{k,j} = \sqrt{\beta_{g,k}}\, \mathbf{R}_{\mathrm{BS},k}^{1/2}\, \mathbf{q}_{k,j},$$
where $\mathbf{R}_{\mathrm{RIS},k} \in \mathbb{C}^{N_j \times N_j}$ and $\mathbf{R}_{\mathrm{BS},k} \in \mathbb{C}^{M \times M}$ stand for the deterministic Hermitian-symmetric positive semi-definite correlation matrices at the RIS and the BS, respectively, with $\mathrm{tr}(\mathbf{R}_{\mathrm{RIS},k}) = N_j$ and $\mathrm{tr}(\mathbf{R}_{\mathrm{BS},k}) = M$. Furthermore, $\beta_{h_{2k,j}}$ is the path-loss of the RIS-UE $k$ link and $\beta_{g,k}$ is the path-loss of the BS-UE $k$ link. In particular, $\beta_{g,k}$ is expected to be small because of the blockages between the BS and the UEs. Furthermore, $\mathbf{q}_{2k,j} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{N_j})$ and $\mathbf{q}_{k,j} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$ are the corresponding fast-fading vectors at the $n$-th time instant. Note that the fast-fading vectors change in every coherence block, while the correlation matrices are assumed to remain constant over many coherence blocks [3]. The high-rank LoS channel $\mathbf{H}_1$ is expressed in terms of $\beta_1$, the path-loss between the BS and the RIS, and of $d_{\mathrm{BS}}$ and $d_{\mathrm{RIS}}$, the inter-antenna spacing at the BS and the inter-element spacing at the RIS, respectively [49].
Moreover, $\theta_{1,N_j}$ and $\phi_{1,N_j}$ represent the elevation and azimuth line-of-sight (LoS) AoD at the BS with respect to the RIS reflecting element $N_j$, and $\theta_{2,m}$ and $\phi_{2,m}$ represent the elevation and azimuth LoS AoA at the RIS. Note that $\mathbf{H}_1$ can be obtained similarly to the covariance matrices, since their expressions depend on the distances and the angles in a similar way [3].
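The response of the $N_j$ elements is described by the diagonal RBM $\boldsymbol{\Theta} = \mathrm{diag}(\mu_1 e^{\mathrm{j}\theta_1}, \ldots, \mu_{N_j} e^{\mathrm{j}\theta_{N_j}}) \in \mathbb{C}^{N_j \times N_j}$, where $\theta_i \in [0, 2\pi]$ and $\mu_i \in [0, 1]$ are the phase and amplitude coefficients of the $i$-th RIS element, respectively. Maximum signal reflection is assumed, i.e., $\mu_i = 1\ \forall i$ [4]. The overall channel vector $\mathbf{h}_{k,j} = \mathbf{g}_{k,j} + \mathbf{H}_1 \boldsymbol{\Theta} \mathbf{h}_{2k,j}$, conditioned on $\boldsymbol{\Theta}$, is distributed as $\mathcal{CN}(\mathbf{0}, \mathbf{R}_k)$. Because $\mathbf{R}_k$ depends on the path-losses, the correlation matrices and $\mathbf{H}_1$, which are all assumed to be known as clarified earlier, $\mathbf{R}_k$ can also be assumed to be known by the network [3].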
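As a worked step, the covariance $\mathbf{R}_k$ of the overall channel can be derived from the correlated Rayleigh model. The derivation below is a sketch under two explicit assumptions that go beyond the text: $\mathbf{g}_{k,j}$ and $\mathbf{h}_{2k,j}$ are independent and zero-mean, and their covariances are $\beta_{g,k}\mathbf{R}_{\mathrm{BS},k}$ and $\beta_{h_{2k,j}}\mathbf{R}_{\mathrm{RIS},k}$ (i.e., the path-loss and correlation matrix enter the channel through their square roots).

```latex
\begin{aligned}
\mathbf{R}_k
  &= \mathbb{E}\!\left[\mathbf{h}_{k,j}\mathbf{h}_{k,j}^{H} \,\middle|\, \boldsymbol{\Theta}\right] \\
  &= \mathbb{E}\!\left[\mathbf{g}_{k,j}\mathbf{g}_{k,j}^{H}\right]
     + \mathbf{H}_1 \boldsymbol{\Theta}\,
       \mathbb{E}\!\left[\mathbf{h}_{2k,j}\mathbf{h}_{2k,j}^{H}\right]
       \boldsymbol{\Theta}^{H}\mathbf{H}_1^{H}
     \qquad \text{(cross terms vanish by independence)} \\
  &= \beta_{g,k}\,\mathbf{R}_{\mathrm{BS},k}
     + \beta_{h_{2k,j}}\,\mathbf{H}_1 \boldsymbol{\Theta}\,
       \mathbf{R}_{\mathrm{RIS},k}\,\boldsymbol{\Theta}^{H}\mathbf{H}_1^{H}.
\end{aligned}
```

Under these assumptions, $\mathbf{R}_k$ is an $M \times M$ matrix that depends on the RIS configuration only through $\boldsymbol{\Theta}$, which is consistent with the statement that it is determined by the path-losses, the correlation matrices and $\mathbf{H}_1$.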

• Channel Aging
In practice, UE mobility causes the channel aging phenomenon [84][85][86][87]. In [3], the authors assume that all RIS elements experience the same relative motion with respect to a given UE, which results in a Doppler shift that makes the channel vary with time. Therefore, unlike in the conventional block-fading channel model, the flat-fading channel coefficients differ from symbol to symbol, although they remain fixed within a symbol. The symbol duration is assumed to be less than or equal to the coherence time of all UEs. The studies on the impact of channel aging in [84][85][86][87] make the same assumption. The channel use is indexed by $n \in \{1, \ldots, \tau_c\}$ [3].
Mathematically, the channel $\mathbf{h}_{k,j}$ at the $n$-th time instant is expressed as a function of its initial state $\mathbf{h}_{k,0}$ and an innovation component as [87]
$$\mathbf{h}_{k,j} = \alpha_{k,j}\,\mathbf{h}_{k,0} + \bar{\alpha}_{k,j}\,\mathbf{e}_{k,j},$$
where $\mathbf{e}_{k,j} \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_k)$ is the independent innovation component at the $n$-th time instant and $\alpha_{k,j} = J_0(2\pi f_D T_s n)$ is the temporal correlation coefficient of UE $k$ between the channel at time 0 and the channel at time $n$, with $J_0(\cdot)$ the zeroth-order Bessel function of the first kind, $T_s$ the channel sampling duration, $f_D = \upsilon f_c / c$ the maximum Doppler shift and $\bar{\alpha}_{k,j} = \sqrt{1 - \alpha_{k,j}^2}$ [3].
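The temporal correlation coefficient $J_0(2\pi f_D T_s n)$ and the resulting aged channel can be illustrated numerically. The sketch below is an illustration only: the function names are hypothetical, the Bessel function is evaluated with its power series (adequate for the small arguments typical of a coherence block), and the innovation covariance is assumed to be the identity matrix instead of a general $\mathbf{R}_k$.

```python
import math
import random


def bessel_j0(x, terms=40):
    """Zeroth-order Bessel function of the first kind via its power series."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        # Ratio between consecutive series terms: -(x/2)^2 / (m+1)^2.
        term *= -((x / 2.0) ** 2) / ((m + 1) ** 2)
    return total


def max_doppler_shift(speed_mps, carrier_hz, c=3e8):
    """Maximum Doppler shift f_D = v * f_c / c."""
    return speed_mps * carrier_hz / c


def temporal_correlation(n, doppler_hz, sample_time_s):
    """alpha = J_0(2*pi*f_D*T_s*n) between time 0 and time n."""
    return bessel_j0(2 * math.pi * doppler_hz * sample_time_s * n)


def aged_channel(h0, n, doppler_hz, sample_time_s, rng):
    """Return alpha*h0 + sqrt(1 - alpha^2)*e with e ~ CN(0, I) (assumed R_k = I)."""
    alpha = temporal_correlation(n, doppler_hz, sample_time_s)
    alpha_bar = math.sqrt(max(0.0, 1.0 - alpha ** 2))
    aged = []
    for h in h0:
        e = complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))
        aged.append(alpha * h + alpha_bar * e)
    return aged


if __name__ == "__main__":
    f_d = max_doppler_shift(speed_mps=30.0, carrier_hz=28e9)
    for n in (0, 10, 100):
        print(n, temporal_correlation(n, f_d, sample_time_s=1e-6))
```

The weighting by $\sqrt{1 - \alpha^2}$ keeps the average channel power unchanged over time, so aging decorrelates the channel from its initial state without changing its statistics.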

Channel Estimation in the RIS-Assisted Communication Systems
In this section, we will present some optimized algorithms proposed for channel estimation in the MISO and MIMO model systems for RIS-assisted systems. First, the utility of S-CSI and I-CSI is explained.

S-CSI and I-CSI
In [9], it has been shown that large-scale antenna systems may suffer from pilot contamination when the number of antennas is large. In frequency division duplex (FDD) systems, the BS acquires the CSI through a feedback channel. When users have high mobility, S-CSI is used because it varies more slowly than I-CSI [88]; the BS can therefore collect feedback over a longer period, which reduces the amount of feedback [74]. RIS-assisted communication systems are often hampered by the difficulty of obtaining accurate I-CSI [89]. I-CSI has been studied in [90,91], where channel reciprocity in TDD systems and uplink training at the BS have been exploited, respectively. S-CSI, in contrast, is much more stable and changes very gradually. Channel statistics can be collected and updated over a sufficiently long time, and S-CSI can be used to set the RIS coefficients over an extended period. As a result, the signal exchange overhead and the computation overhead are reduced, because the state of the RIS does not need to change often. Nevertheless, the CSI of the BS-RIS and RIS-UE links must be accurate in order to optimize the reflection coefficients of RIS-assisted systems, and it can be difficult to determine because of the large number of passive reflective elements in these links. It is therefore necessary to keep the training overhead low while maximizing the performance gains provided by the RIS. Overall, S-CSI provides an efficient direction in system design because it varies relatively slowly and can be obtained quite easily [74,75].

MISO Systems
In this section, we present the optimized algorithms, which use different techniques for the channel estimation in MISO systems.

Alternating Optimization with the Semidefinite Relaxation (SDR) Technique
In [4], the transmit beamforming at the BS is designed jointly with the phase shifts at the RIS, based on all BS-RIS, UE-RIS and UE-BS channels, to fully realize the network beamforming gain. The authors apply the semidefinite relaxation (SDR) technique to handle the non-convex SINR constraints as well as the unit-modulus constraints imposed by the passive phase shifters in the MU setting. Furthermore, to reduce the computational complexity, an efficient algorithm is used that alternately optimizes the phase shifts and the transmit beamforming vector in an iterative manner, where the optimal solution of each is derived in closed form with the other being fixed. For more details, see Algorithm 1 in [4].
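The alternating structure (fix one block of variables, update the other in closed form, repeat) can be illustrated on a deliberately simplified case. The sketch below is not Algorithm 1 of [4]: it assumes a single user without SINR constraints, uses maximum-ratio transmission for the beamformer and phase alignment for the RIS, and all names are hypothetical. It reuses the effective-channel convention $\mathbf{h}^T = \mathbf{h}_{d}^T + \mathbf{h}_{r}^T \boldsymbol{\Xi} \mathbf{G}$.

```python
import cmath
import math


def effective_channel(h_d, h_r, G, xi):
    """h_eff[m] = h_d[m] + sum_j h_r[j] * xi[j] * G[j][m]."""
    return [
        h_d[m] + sum(h_r[j] * xi[j] * G[j][m] for j in range(len(h_r)))
        for m in range(len(h_d))
    ]


def mrt_beamformer(h_eff, power=1.0):
    """Closed-form beamformer for fixed phases: conjugate of h_eff, scaled to the power budget."""
    norm = math.sqrt(sum(abs(h) ** 2 for h in h_eff))
    return [math.sqrt(power) * h.conjugate() / norm for h in h_eff]


def aligned_phases(h_d, h_r, G, w):
    """Closed-form phases for fixed w: co-phase every reflected term with the direct term."""
    direct = sum(h_d[m] * w[m] for m in range(len(w)))
    reference = cmath.phase(direct)
    xi = []
    for j in range(len(h_r)):
        reflected = h_r[j] * sum(G[j][m] * w[m] for m in range(len(w)))
        xi.append(cmath.exp(1j * (reference - cmath.phase(reflected))))
    return xi


def channel_gain(h_d, h_r, G, xi, w):
    """Received signal power |h_eff^T w|^2."""
    h_eff = effective_channel(h_d, h_r, G, xi)
    return abs(sum(h * wm for h, wm in zip(h_eff, w))) ** 2


def alternating_optimization(h_d, h_r, G, iterations=10):
    """Alternate the two closed-form updates and record the gain after each round."""
    xi = [1 + 0j] * len(h_r)
    w, history = None, []
    for _ in range(iterations):
        w = mrt_beamformer(effective_channel(h_d, h_r, G, xi))
        xi = aligned_phases(h_d, h_r, G, w)
        history.append(channel_gain(h_d, h_r, G, xi, w))
    return xi, w, history


if __name__ == "__main__":
    # Hypothetical toy channels: M = 2 BS antennas, J = 3 RIS elements.
    h_d = [0.1 + 0.2j, -0.3 + 0.1j]
    h_r = [1 + 1j, 0.5 - 0.2j, -0.7j]
    G = [[1, 0.5j], [0.2 - 0.1j, 1], [0.3, -0.4 + 0.2j]]
    print(alternating_optimization(h_d, h_r, G)[2])
```

Each update is optimal for its own block with the other held fixed, so the recorded gain never decreases from one round to the next; the multi-user problem in [4] additionally has to satisfy per-user SINR constraints, which is where SDR is needed.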

PPO Algorithm
As mentioned in Section 2, the authors of [74] use the PPO algorithm in the RIS-assisted MU-MISO system to solve optimization problem P1. PPO was introduced in [27] and has been shown to outperform other benchmark algorithms; it is easy to set up and to execute, and it is a model-free, policy-based, actor-critic policy gradient method.
Consider an infinite-horizon discounted Markov decision process (MDP), specified by the tuple $(S, A, P, r, \omega)$, where $S$ is the state space, $A$ is the action space, $P: S \times A \times S \to \mathbb{R}$ is the transition probability distribution, $r$ is the reward function and, finally, $\omega \in [0, 1]$ is the discount factor. PPO aims to retain the reliability of trust region policy optimization (TRPO) algorithms, which guarantee monotonic improvement by constraining the Kullback-Leibler (KL) divergence of policy updates, while using only first-order optimization methods. TRPO maximizes a surrogate objective while limiting the size of the policy update [74]. Specifically,
$$\max_{\theta}\ \hat{E}_t\left[\frac{\pi_{\theta}(a_t|s_t)}{\pi_{\theta_{old}}(a_t|s_t)}\,\hat{A}_t\right] \tag{17}$$
subject to $\hat{E}_t\left[\mathrm{KL}\left[\pi_{\theta_{old}}(\cdot|s_t), \pi_{\theta}(\cdot|s_t)\right]\right] \le \delta$, with the expectation $\hat{E}_t[\cdot]$ denoting the empirical average over a finite batch of samples, in an algorithm that alternates between sampling and optimization. Furthermore, $\hat{A}_t$ is an estimator of the advantage function at timestep $t$ and is given by
$$\hat{A}_t = Q(s_t, a_t) - V(s_t),$$
where $Q(\cdot,\cdot)$ and $V(\cdot)$ are the action-value and value functions, respectively, and are specified as follows
$$V(s_t) = E_{a_t, s_{t+1}, \ldots}\left[\sum_{l=0}^{\infty} \omega^{l} r(s_{t+l})\right], \qquad Q(s_t, a_t) = E_{s_{t+1}, a_{t+1}, \ldots}\left[\sum_{l=0}^{\infty} \omega^{l} r(s_{t+l})\right],$$
where $s_{t+1} \sim P(s_{t+1}|s_t, a_t)$. The estimate of the advantage function over the interval $t \in [0, T]$ is given by [27]
$$\hat{A}_t = \sum_{l=0}^{T-t-1} (\omega\kappa)^{l}\,\delta_{t+l},$$
with $\delta_t = r_t + \omega V(s_{t+1}) - V(s_t)$, where $\kappa$ is a hyperparameter that controls the trade-off between bias and variance of the generalized advantage estimator (GAE) [74].
Then, let $\rho_t(\theta)$ denote the probability ratio $\rho_t(\theta) = \pi_{\theta}(a_t|s_t)/\pi_{\theta_{old}}(a_t|s_t)$, so that $\rho_t(\theta_{old}) = 1$. Hence, according to (17), TRPO maximizes
$$L^{CPI}(\theta) = \hat{E}_t\left[\rho_t(\theta)\,\hat{A}_t\right], \tag{22}$$
where CPI is the conservative policy iteration. The major problem with the optimization in (22) is the possibility of a large policy update. If $\pi_{\theta_{old}} \ll \pi_{\theta}$, then $\rho_t(\theta)$ becomes very large and the policy changes dramatically. To resolve this issue, PPO alters the objective function in (22) as follows
$$L^{CLIP}(\theta) = \hat{E}_t\left[\min\left(\rho_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\left(\rho_t(\theta), 1-\varepsilon_c, 1+\varepsilon_c\right)\hat{A}_t\right)\right],$$
where $\varepsilon_c$ is a hyperparameter. The objective function (22) was modified in this way so that, when only a small number of samples is used, the algorithm is neither too greedy in favoring actions with positive advantage nor too quick in avoiding actions with negative advantage. The problem is cast as an MDP with observation and action spaces. When solving the joint active and passive beamforming problem at the BS, the RIS and the UEs of the MISO system constitute the environment E, and the agent is the BS that controls the RIS. For more details, see Algorithm 1 in [74].
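The two quantities at the core of PPO, the generalized advantage estimate and the clipped surrogate objective, can be sketched in a few lines. This is an illustrative sketch, not the implementation of [74]: the function names and default hyperparameter values are assumptions, while ω, κ and ε_c follow the notation of the text.

```python
import numpy as np

def gae_advantages(rewards, values, omega=0.99, kappa=0.95):
    """Truncated GAE: A_t = sum_l (omega*kappa)^l * delta_{t+l}.

    rewards : length-T sequence r_0 .. r_{T-1}
    values  : length-(T+1) sequence V(s_0) .. V(s_T) (last entry bootstraps)
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    # One-step temporal-difference residuals delta_t.
    deltas = rewards + omega * values[1:] - values[:-1]
    adv = np.zeros(len(rewards))
    running = 0.0
    # The backward recursion A_t = delta_t + omega*kappa*A_{t+1} equals the sum.
    for t in reversed(range(len(rewards))):
        running = deltas[t] + omega * kappa * running
        adv[t] = running
    return adv

def clipped_surrogate(ratio, adv, eps_c=0.2):
    """Batch average of min(rho*A, clip(rho, 1-eps_c, 1+eps_c)*A)."""
    ratio = np.asarray(ratio, dtype=float)
    adv = np.asarray(adv, dtype=float)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - eps_c, 1.0 + eps_c) * adv
    return float(np.mean(np.minimum(unclipped, clipped)))
```

For a positive advantage, the objective stops rewarding the policy once the ratio exceeds 1 + ε_c; for a negative advantage, the unclipped term is kept whenever it is the smaller one, so large harmful updates are still penalized.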

Pseudocode of Asynchronous One-Step Q-Learning
In [92], the authors propose an RL algorithm for an RIS-assisted MU-MISO system in which, instead of using experience replay, multiple agents are executed asynchronously in parallel over multiple instances of the environment. This parallelism also decorrelates the agents' data into a more stationary process, because at any given time step the parallel agents experience a variety of different states. This allows a wider range of fundamental on-policy RL algorithms, such as actor-critic methods, and off-policy RL algorithms, such as Q-learning, to be applied robustly and efficiently using deep neural networks.
They consider the standard RL setting, where an agent interacts with an environment E over a number of discrete time steps. At each time step $t$, the agent receives a state $s_t$ and takes an action from some set of possible actions $A$, according to its policy $\pi$. In response, the agent receives the next state $s_{t+1}$ and a scalar reward $r_t$. The process continues until the agent reaches a terminal state, after which it restarts. The return $R_t = \sum_{k=0}^{\infty} \omega^{k} r_{t+k}$ is the total accumulated return from time step $t$ with discount factor $\omega \in (0, 1]$. The goal of the agent is to maximize the expected return from each state $s_t$ [92]. In value-based model-free RL methods, the action-value function is represented using a function approximator, such as a neural network. Let $Q(s, \alpha; \theta)$ be an approximate action-value function with parameters $\theta$. The updates to $\theta$ can be obtained from a variety of RL algorithms. One example of such an algorithm is Q-learning, which directly approximates the optimal action-value function: $Q^{*}(s, \alpha) \approx Q(s, \alpha; \theta)$. In one-step Q-learning, the parameters $\theta$ of the action-value function $Q(s, \alpha; \theta)$ are learned by iteratively minimizing a sequence of loss functions, where the $i$-th loss function is defined as
$$L_i(\theta_i) = E\left[\left(r + \omega \max_{\alpha'} Q(s', \alpha'; \theta_{i-1}) - Q(s, \alpha; \theta_i)\right)^{2}\right],$$
where $s'$ is the state encountered after state $s$ [92].
The above method is referred to as one-step Q-learning because it updates the action value $Q(s, \alpha)$ toward the one-step return $r + \omega \max_{\alpha'} Q(s', \alpha'; \theta_{i-1})$. A disadvantage of one-step methods is that obtaining a reward $r$ directly affects only the value of the state-action pair $(s, \alpha)$ that led to the reward. The values of the other state-action pairs are affected only indirectly through the updated value $Q(s, \alpha)$. This can make the learning process slow, as many updates are required to propagate a reward to the relevant preceding states and actions [92]. For more details, see Algorithm 1 in [92].
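The one-step update and its slow backward propagation of reward can be seen in a small tabular sketch. This is a deliberately simplified stand-in, not the asynchronous deep Q-learning of [92]: it uses a single agent, a lookup table instead of a neural network, and a hypothetical five-state chain environment in which only the final transition is rewarded. All names and parameter values are assumptions for illustration.

```python
import numpy as np

def q_learning_step(Q, s, a, r, s_next, done, omega=0.9, lr=0.5):
    """Move Q[s, a] toward the one-step return r + omega * max_a' Q[s', a']."""
    target = r if done else r + omega * np.max(Q[s_next])
    Q[s, a] += lr * (target - Q[s, a])
    return Q

def run_chain(n_states=5, episodes=200, omega=0.9, lr=0.5, seed=0):
    """Toy chain: action 0 moves left, action 1 moves right.

    Reaching the last state ends the episode with reward 1; all other
    transitions give reward 0. The behavior policy is uniformly random,
    which is allowed because Q-learning is off-policy.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            a = int(rng.integers(2))
            s_next = max(s - 1, 0) if a == 0 else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0
            q_learning_step(Q, s, a, r, s_next, done, omega, lr)
            s = s_next
    return Q
```

In this toy problem the reward first changes only the value of the last state-action pair; earlier states learn about it one step at a time over later episodes, which is the slow propagation described above.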

MIMO Systems
In this section, we present the optimized algorithms, which use different techniques for channel estimation in MIMO systems. The scenario in [80] focuses on multiple BSs and multiple UEs in RIS-assisted cell-free systems and investigates multi-BS cooperation and multi-UE joint estimation. For the cascaded channel estimation, a 3D-MMV framework is used to jointly estimate the cascaded AoDs for all users, exploiting the fact that the cascaded channels from a BS through the RIS to multiple UEs share a common part, the BS-RIS channel (characteristic 1). The authors additionally use tensor contraction to derive a 3D-MLAOMP algorithm. By contrast, the fact that the cascaded channels from multiple BSs through the RIS to a UE share a common part, the RIS-UE channel (characteristic 2), is not exploited in the cascaded channel estimation. For more details, see Algorithm 1 in [80].
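The 3D-MLAOMP algorithm builds on orthogonal matching pursuit (OMP). As a point of reference, the following is a sketch of plain single-measurement-vector OMP, not the 3D-MMV or 3D-MLAOMP method of [80]: it omits the tensor structure and the joint multi-user processing. The function name `omp` and its arguments are assumptions for illustration.

```python
import numpy as np

def omp(A, y, sparsity):
    """Plain OMP: find a `sparsity`-sparse x (sparsity >= 1) with y ~= A @ x.

    A : (m, n) dictionary, e.g. steering vectors on an angular grid
    y : (m,)   measurement vector
    """
    residual = y.copy()
    support = []
    for _ in range(sparsity):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(A.conj().T @ residual)
        if support:
            correlations[support] = 0.0  # never select an atom twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the selected atoms, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coeffs
    return x, sorted(support)
```

In a channel-estimation setting, the recovered support would correspond to the dominant angular bins and the coefficients to the associated path gains.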

Two-Stage Based Cascaded Channel Estimation for a Multi-User System
In [25], the authors propose a two-stage method for uplink cascaded channel estimation that does not rely on a typical user in an RIS-assisted MU-MIMO cell system. Specifically, in the first stage, the common RIS-BS channel is estimated up to an ambiguity using all users jointly, to obtain multi-user gain and reduce error propagation. In the second stage, the user-specific channel between each user and the RIS is estimated, which yields the full CSI of the cascaded channel; the required pilot overhead is also analyzed. For more details, see Algorithm 1 in [25].

Algorithm for an RIS-Assisted AB-HBF System
The authors in [17] consider an RIS-assisted AB-HBF mMIMO cell system, as mentioned in Section 2. Their purpose is to maximize the achievable system rate while reducing the CSI overhead and the hardware complexity. First, the RF beamformers are designed based on slowly time-varying angular parameters. Then, the BB precoder/combiner is designed with an SVD and a water-filling algorithm using an effective channel of reduced size. Finally, to enhance the system capacity, the phase shifts of the RIS are optimized with a PSO method. For more details, see Algorithm 1 in [17].
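The baseband step, an SVD of the reduced-size effective channel followed by water-filling power allocation, can be sketched as follows. This is a generic textbook construction under stated assumptions, not Algorithm 1 of [17]: the effective channel `H_eff` is taken as given (the RF stage and the PSO-based RIS design are not modeled), and all function and parameter names are hypothetical.

```python
import numpy as np

def water_filling(gains, total_power):
    """Allocate p_i = max(mu - 1/g_i, 0) with sum(p_i) = total_power.

    gains : strictly positive per-stream gains (sigma_i^2 / noise variance)
    """
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]          # strongest stream first
    inv = 1.0 / gains[order]
    powers = np.zeros_like(gains)
    # Drop the weakest streams until the water level covers all active ones.
    for k in range(len(gains), 0, -1):
        mu = (total_power + inv[:k].sum()) / k
        if mu > inv[k - 1]:
            powers[order[:k]] = mu - inv[:k]
            break
    return powers

def svd_water_filling(H_eff, total_power, noise_var=1.0):
    """SVD-based precoder/combiner with water-filling over the eigenmodes."""
    U, s, Vh = np.linalg.svd(H_eff, full_matrices=False)
    keep = s > 1e-12                          # discard numerically null modes
    U, s, Vh = U[:, keep], s[keep], Vh[keep]
    p = water_filling(s**2 / noise_var, total_power)
    precoder = Vh.conj().T * np.sqrt(p)       # scale each column by sqrt(p_i)
    combiner = U.conj().T
    rate = float(np.sum(np.log2(1.0 + p * s**2 / noise_var)))
    return precoder, combiner, rate
```

With this precoder and combiner the effective channel becomes diagonal, so the streams do not interfere and the rate is a sum over the active eigenmodes.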

Channel Estimation Algorithms for the Cases with Long-Term Imperfection (LTI) and Short-Term Imperfection (STI)
In [1], the authors propose several efficient tensor-based algorithms for channel estimation in RIS-assisted MIMO systems in which the RIS elements are affected by real-world imperfections. The non-ideal channel estimation problems are solved with trilinear and quadrilinear PARAFAC models. The proposed trilinear ALS (TALS)-based LTI algorithm handles static imperfections. The TALS-STI and higher-order singular value decomposition (HOSVD)-STI algorithms are used for more demanding scenarios with non-static RIS imperfections and with channel temporal coherence. Furthermore, the TALS-LTI and TALS-STI algorithms are iterative, which relaxes the system requirements and allows more options for the training parameters. The HOSVD-STI algorithm has a closed-form solution, has a lower computational complexity than the ALS algorithms, and allows parallel processing. For more details, see Algorithms 1-3 in [1].
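The closed-form and parallelizable nature of an HOSVD can be illustrated with a generic truncated HOSVD. This is a sketch of the standard decomposition only, not the HOSVD-STI estimator of [1], which additionally maps the factors to channel and imperfection parameters. The helper names `unfold`, `mode_product`, `hosvd` and `reconstruct` are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: the chosen mode indexes rows, all others columns."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, U, mode):
    """Multiply tensor T by matrix U along the given mode."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, mode)), 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: one independent SVD per mode, no iterations."""
    factors = []
    for mode, r in enumerate(ranks):
        # The per-mode SVDs do not depend on each other and could run in parallel.
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.conj().T, mode)
    return core, factors

def reconstruct(core, factors):
    """Rebuild the tensor from the core and the factor matrices."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_product(T, U, mode)
    return T
```

Unlike an ALS scheme, which refines its factor estimates over many sweeps, this computation finishes after a fixed number of SVDs, which is consistent with the lower latency attributed to the HOSVD-based approach.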

Results of the Proposed Algorithms
In this section, we present the results of the optimized algorithms for RIS-assisted MISO and MIMO systems.

MISO Systems
In [74], the PPO-based algorithm is used for joint active and passive beamforming in RIS-assisted MU-MISO systems. The authors use S-CSI models for the beamforming vectors at the BS and for the phase shifts at the RIS. According to their simulation results, at low and moderate SNR the S-CSI models achieve the same performance as the I-CSI models and converge quickly compared with other methods of the same category. In addition, the PPO-based method outperforms the advantage actor-critic (A2C)-based method [92] and is robust against taking many random actions, owing to the use of the clip function. The training time of the S-CSI PPO-based algorithm is considerably lower than that of the I-CSI PPO-based algorithm. Regarding the effect of the BS transmission power, the performance of the S-CSI PPO-based approach is compared with the alternating direction method of multipliers (ADMM) approach suggested in [75]. Furthermore, the I-CSI-based method is validated against the algorithm proposed in [93]. Although all algorithms perform the same at low SNR, as the SNR rises the I-CSI PPO-based algorithm outperforms the S-CSI PPO-based algorithm as well as the A2C- and ADMM-based algorithms. Furthermore, all of the above algorithms perform significantly better than the RIS with random phase shifts and the no-RIS case. For the effect of the Rician factor on the average sum rate, the I-CSI PPO-based algorithm is compared with the algorithm in [93]. According to the results of [74], increasing the Rician factor improves the performance of all algorithms with both S-CSI and I-CSI. Specifically, for the S-CSI approach, as the Rician factor increases the LoS link becomes dominant and the BS-RIS-UE link becomes more deterministic.
In [92], the results show that, within the proposed framework, neural networks can be trained robustly through RL with both value-based and policy-based techniques, with off-policy as well as on-policy methods, and in discrete as well as continuous domains. One of the main conclusions is that using parallel actor-learners to update a shared model has a stabilizing effect on the learning process of the three value-based methods examined. Although this indicates that stable online Q-learning is possible without experience replay, which was used for this purpose in the deep Q-network (DQN) algorithm, it does not mean that experience replay is not beneficial. Incorporating experience replay into the asynchronous RL framework could significantly increase the data efficiency of these methods by reusing old data. This could lead to much faster training in domains such as the open racing car simulator (TORCS), where interacting with the environment is more costly than updating the model for the architecture used.
In [72], the problem of jointly designing the transmit beamforming and the phase shifts in an RIS-assisted MISO communication system was addressed. Only S-CSI is available at the transmitter, and efficient algorithms are developed to maximize the system performance. For the Rician fading case, an iterative algorithm is used, and its convergence is established. For the Rayleigh fading case, closed-form designs are used. Finally, the proposed S-CSI-based algorithm achieves a performance similar to that of the algorithm that requires I-CSI.

MIMO Systems
In [1], the authors use tensor algorithms based on trilinear and quadrilinear PARAFAC for channel estimation in RIS-assisted MIMO systems, as described in Section 3. The results show that the proposed algorithms achieve high estimation performance in the presence of imperfections, for the considered channel model and system settings. Furthermore, the TALS-STI and HOSVD-STI algorithms have similar performance. The TALS-STI algorithm is preferred for detecting imperfections in the low SNR regime and when more flexible choices for the training parameters are required. Finally, HOSVD-STI is preferred when low processing latency is desired.
In [80], the authors analyze an RIS-assisted cell-free mMIMO system and investigate multi-BS cooperation and multi-UE joint estimation. They use the 3D-MLAOMP algorithm, as described in Section 3. As future work, they plan to derive results for timescale channel estimation through multi-BS cooperation. In addition, RIS-assisted cell-free channel estimation based on non-orthogonal pilot sequences in high-mobility scenarios will be investigated.
In [79], a closed-form analytical expression for the achievable sum rate of RIS-assisted cell-free systems is derived, which is used to evaluate the effect of key parameters on the system performance. To gain more insight, a special case with NLoS components was investigated, showing a power gain on the order of O(M). The proposed low-complexity algorithm uses a two-time-scale transmission protocol, so that the joint beamformers at the BSs and the RIS are optimized to increase the achievable weighted sum rate. In addition, the RIS beamformers were optimized with the penalty dual decomposition (PDD) method exploiting S-CSI, while the BS beamformers were designed with the primal-dual subgradient (PDS) method based on I-CSI. The influence of key system parameters, such as the number of RIS elements, the CSI settings and the Rician factor, was examined. Finally, the advantages of adopting the cell-free paradigm and of deploying RISs were demonstrated.
In [2], a joint channel estimation and data detection (JCEDD) scheme for an HRIS-assisted mmWave orthogonal time frequency space (OTFS) system was proposed. The authors suggested a transmission structure in which the OTFS blocks are accompanied by several pilot sequences. During the pilot sequences, a subset of the HRIS elements is alternately activated in passive mode, and the impinging signal is entirely absorbed. The time-domain channel model was investigated, and the received signal models at both the HRIS and the BS were presented. Because the CSI between the UE and the HRIS is obtained from the pilot sequences, an HRIS beamforming design strategy is used to improve the received signal strength at the BS. For the OTFS transmission, a JCEDD scheme was proposed. In this scheme, the authors resorted to a probabilistic graphical model and designed a message passing (MP) algorithm to simultaneously recover the data symbols and estimate the equivalent channel gain. Moreover, an expectation maximization (EM) algorithm was employed to acquire the channel parameters, i.e., the channel sparsity, the channel covariance and the Doppler frequency shift. By iterating between the MP and EM algorithms, the delay-Doppler domain channel and the transmitted data symbols can be acquired at the same time. Simulation results validated the proposed JCEDD scheme and its robustness to the user velocity.
In [3], the effect of channel aging on RIS-assisted mMIMO systems is studied, taking into account spatial correlation and imperfect CSI. Channel aging with correlated Rayleigh fading is introduced in both the uplink training phase and the data transmission phase, the effective channel is estimated, and the deterministic equivalent (DE) of the achievable sum SE with regularized zero-forcing (RZF) precoding is presented in closed form. An optimization with respect to the RIS phase shifts under power budget constraints is then presented, applying an S-CSI-based alternating optimization (AO) algorithm to reduce the computational complexity and the feedback overhead. In this way, the impact of channel aging and its interaction with other fundamental parameters affecting the performance were illustrated. For example, a proper choice of the number of RIS elements and of the frame duration mitigates the effect of channel aging. The study of broadband systems is proposed as future work. Table 2 summarizes the contributions of some of the studies mentioned above on channel estimation in different RIS-assisted system settings.
The research gaps arising from the reviewed literature on channel estimation in RIS-assisted MISO systems relate to improving the passive modulation of the signals sent from the RIS to the BS and vice versa. Furthermore, it is worth investigating and applying more RL algorithms, expanding the use of neural networks and improving their architectures. For RIS-assisted MIMO systems, channel estimation under multi-BS cooperation, in high user mobility scenarios, with non-orthogonal pilot sequences and with window refresh schemes has not yet been investigated. CSI estimation in broadband systems has also not been investigated.
• Two-stage based cascaded channel estimation for a multi-UE without choosing a typical user.
• The required average pilot overhead of the proposed two-stage method and the typical user required method is much lower than the direct-OMP method and the double-structured (DS)-OMP method.

Conclusions
This paper presented a comprehensive and up-to-date survey of papers on RIS-assisted wireless communication, covering MIMO and MISO systems. Emphasis was placed on the practical challenges of estimating the BS, RIS and user channels using optimized algorithms under different channel models and system configurations. Then, for various practical scenarios of available CSI, namely I-CSI and S-CSI, we gave a detailed overview of the research results, organized by system structure. Different system models were presented for MISO and MIMO systems, using techniques such as tensor decompositions, RL, SDR and others to find the optimal channel estimation algorithm. For the MISO systems, the SDR-based algorithm, the Q-learning RL algorithm and, mainly, the PPO algorithm were presented. The PPO algorithm performed well at low and moderate SNR, where its performance versus the BS transmission power was similar to that of the ADMM algorithm. Both I-CSI and S-CSI variants were considered, and the BS-RIS-UE links became more deterministic as the Rician factor increased. The Q-learning RL algorithm improved the performance and could reuse previously received data. For MIMO systems, which were divided into cell and cell-free systems, the algorithms presented included the 3D-MMV 3D-MLAOMP algorithm, the two-stage based cascaded channel estimation algorithm, the algorithm for an RIS-assisted AB-HBF system and the algorithms for LTI and STI. Of the mentioned algorithms, the 3D-MMV 3D-MLAOMP algorithm is of particular interest because it is a tensor-type algorithm, it performs well at low SNR and it is preferred when low processing latency is needed. Table 2 presented the characteristics of the above algorithms. Considering all the above, we found that channel estimation in RIS-assisted communication systems is improved by creating new algorithms or optimizing existing ones. Improvements were observed in estimation time, in performance, in how the BS-RIS and RIS-UE channels are computed and in the handling of scenarios with high user mobility.
It was found that the performance of the system depends on the location of the RIS and on the number of elements it consists of. The optimized algorithms helped to reduce the pilot overhead and the complexity. For instance, for an RIS-assisted MU-MISO system, an optimized PPO-based algorithm using S-CSI was proposed in [74], and it converged quickly compared with algorithms of the same category. In addition, for an RIS-assisted mMIMO system, the authors of [3] proposed an optimized algorithm for high user mobility that uses S-CSI and has a complexity of $O(MN^2 + N + M)$. According to the researchers, further study is needed on methods for broadband systems, on the number of RIS elements required for maximum system performance and on the window size that reduces channel aging. The contribution of neural networks to channel estimation in RIS-assisted systems should be further investigated. In addition, more scenarios with high user mobility in the presence of channel aging should be studied. In our future work, we will simulate the mentioned algorithms for RIS-assisted MISO and MIMO systems to confirm their optimization in terms of channel estimation, both with the system settings used by the respective authors and with other parameters. We hope this research provides information to researchers and professionals working on new communication system technologies so that they can further explore the problems that have emerged.