A Deployment Strategy for Reconfigurable Intelligent Surfaces with Joint Phase and Position Optimization

Yang, Guangsong; Huang, Hongbo; Sun, Chuwei; Wu, Yiliang; Xu, Xinjie; Huang, Shan

doi:10.3390/electronics15030718


by Guangsong Yang ¹, Hongbo Huang ¹, Chuwei Sun ¹, Yiliang Wu ¹,*, Xinjie Xu ¹ and Shan Huang ²

¹ School of Ocean Information Engineering, Jimei University, Xiamen 361021, China
² School of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China
* Author to whom correspondence should be addressed.

Electronics 2026, 15(3), 718; https://doi.org/10.3390/electronics15030718

Submission received: 8 January 2026 / Revised: 3 February 2026 / Accepted: 4 February 2026 / Published: 6 February 2026

(This article belongs to the Special Issue Antenna Design and Performance Enhancement Techniques and Applications in Wireless Systems)


Abstract

The actual implementation of fifth-generation (5G) and beyond networks faces persistent challenges, including environmental interference and limited coverage, which compromise transmission stability and network feasibility. Reconfigurable Intelligent Surfaces (RISs) have emerged as a promising technology to dynamically reconfigure wireless propagation environments and enhance communication quality. To fully unlock the potential of RIS, this paper proposes a novel deployment strategy based on Double Deep Q-Networks (DDQNs) that jointly optimizes the RIS placement and phase shift configuration to maximize the system sum-rate. Specifically, the coverage area is discretized into a grid, and at each candidate location, a DDQN-based method is developed to solve the corresponding non-convex phase optimization problem. Simulation results reveal that our proposed strategy significantly surpasses conventional benchmark schemes, resulting in a sum-rate improvement of up to 38.41%. The study provides a practical and efficient pre-deployment framework for RIS-enhanced wireless networks.

Keywords:

reconfigurable intelligent surface; double deep Q-network; joint phase and position optimization; grid-based deployment

1. Introduction

Concurrent with the global implementation of 5G networks and the deepening of 6G research frontiers, wireless communication technologies are experiencing accelerated evolution toward elevated data speeds, reduced signal delay, and massive connection capacities to address user expectations [1]. However, conventional wireless communication architectures exhibit bottlenecks in energy efficiency at high-frequency bands and network coverage [2], and the complexity of 5G technologies combined with large-scale base station deployments poses significant challenges for network planning and optimization, necessitating effective solutions for resource management and coverage while enhancing performance [3].

Reconfigurable Intelligent Surfaces (RISs) are among the innovative technologies shaping the future of wireless networks [4]. An RIS is an emerging electromagnetic wave control technology composed of a large number of low-cost, independently controllable reflecting elements managed via software or hardware. Its primary function is to adjust the reflection phase and amplitude of incident signals through software programming, based on feedback about the propagation conditions of the communication link, essentially by tuning capacitance, resistance, and inductance [5]. By leveraging its programmable reflection, transmission, and absorption properties, an RIS can steer signal propagation paths and convert non-line-of-sight transmissions into virtual line-of-sight transmissions. This transforms the wireless propagation environment from passive adaptation to proactive control, improves communication quality in blind spots and for edge users, and achieves superior active–passive mutual-reflection transmission performance [6].

Therefore, RIS has been applied in numerous wireless communication scenarios. Le et al. [7] adopted RIS-assisted relaying to boost the spectral efficiency of backscattered signal paths among ground-based users. Alsenwi et al. [8] introduced RIS into UAV-assisted communication scenarios, where the UAV trajectory and RIS phase shifts are jointly designed to effectively serve users experiencing link blockages. Guo et al. [9] introduced a multi-hop RIS-aided Joint Communication and Sensing (JCAS) framework to facilitate robust communication and safety monitoring in underground coal mines.

Compared with traditional communication systems, RIS offers lower-cost and more flexible control of electromagnetic wave propagation, enabling adaptation to complex communication environments [10]. Through dense and efficient element configurations, RIS can deliver broader signal optimization and more precise control over the wireless environment, showing particularly strong potential in high-user-density scenarios and complex settings [11]. However, practical RIS deployments still face several challenges. First, the fixed size and deployment location of an RIS limit its reconfiguration flexibility, while the large number of reflecting elements in large-scale RIS arrays leads to high system complexity [12]. Second, the reflective units require precise control, which places stringent demands on hardware manufacturing processes and increases system costs [13]. Third, RIS deployment and optimization must consider various environmental factors, such as signal propagation characteristics, mutual interference among devices, and adaptability to mobile users, all of which pose substantial technical challenges for real-world implementation [14].

This paper aims to achieve intelligent deployment of RIS-assisted wireless communications that maximizes the user-side sum-rate while minimizing network costs. The principal contributions of this study are as follows:

(1) A joint optimization framework for RIS phase and deployment location is proposed, in which the deployment area is divided into discrete grids under constraints on phase configuration and deployment. Candidate deployment regions are then evaluated jointly with the phase configuration to enhance system performance.
(2) To obtain the optimal deployment solution, a novel method called PosGrid-DDQN is proposed, which jointly optimizes the placement location and phase of the RIS using the sum-rate as the performance metric, thereby boosting the operational efficiency of RIS-assisted wireless communication networks.

The remaining sections of this paper are structured as follows: Section 2 examines relevant research in the field of RIS coefficient optimization and deployment strategies; Section 3 describes the system model of RIS-assisted wireless communication; Section 4 proposes a deep reinforcement learning-based method to control the phase and placement of the RIS; Section 5 presents the simulation environment and analyzes the numerical results; and Section 6 summarizes the paper and explores future research avenues.

2. Related Work

RIS-assisted wireless communication has attracted widespread attention. Li et al. [15] reviewed existing AI-based RIS phase shift design methodologies and compared them in terms of solution quality and computational complexity. Zhang et al. [16] investigated RIS-assisted multi-user MIMO systems and leveraged the water-filling approach along with the projected gradient ascent (PGA) algorithm to maximize the asymptotic sum-rate through the joint optimization of the base station (BS) precoding matrix and the RIS phase shifts. Shtaiwi et al. [17] proposed integrating RIS with Integrated Sensing and Communication (ISAC) and adopted fractional programming and alternating optimization algorithms to maximize the sum-rate by jointly optimizing the BS beamformer and RIS phase shifts. While these methods aim to maximize the system sum-rate through RIS phase optimization, they exhibit limited adaptability to dynamic environments.

In recent years, deep reinforcement learning (DRL)-based approaches have attracted considerable attention. Wang et al. [18] studied an RIS-assisted wireless communication system and employed a DRL method to control the RIS phase shifts in order to enhance the system sum-rate. Chen et al. [19] investigated an RIS-aided downlink OFDM system and presented a DRL-based optimization algorithm to design the reflection phase shifts, significantly improving the system’s spectral efficiency. Nayak et al. [20] studied an RIS-assisted full-duplex communication system and utilized a DRL-based approach to predict the RIS phase shifts, the BS’s active beamformer, and the transmit power, aiming to maximize the weighted sum-rate of users in both uplink and downlink. The aforementioned studies focused on RIS phase optimization in various communication systems but did not consider the combined impact of phase optimization and deployment location within practical deployable areas. In practical scenarios, the deployment area of an RIS is limited, and different deployment locations are subject to varying environmental influences. A reasonable RIS deployment enhances system performance while curbing superfluous costs.

To achieve reasonable RIS deployment, Zhao et al. [21] analyzed the effect of RIS deployment position and coverage characteristics on the secrecy performance of cell-free systems and enhanced the ergodic secrecy rate of legitimate users through optimized RIS placement and phase design. Zeng et al. [22] analyzed the coverage of downlink RIS-assisted networks and presented a Coverage Maximization Algorithm (CMA) that optimizes the RIS deployment orientation and distance to achieve maximum coverage. Wang et al. [23] considered a multi-cell communication system assisted by multiple RISs and employed a two-level nested algorithm to jointly optimize the BS-RIS-user association coefficients and RIS deployment locations, aiming to maximize the ergodic capacity. Zhang et al. [24] jointly optimized the reflection coefficient matrix and RIS placement in an ISAC system and proposed an alternating optimization scheme to maximize the energy efficiency. Guo et al. [25] investigated RIS-assisted physical layer security networks and proposed a joint RIS and beamforming approach to optimize the secrecy rate subject to constraints on RIS placement and unit reflection coefficients (with modulus not exceeding 1). The proposed method demonstrated superior secrecy-rate performance over benchmark schemes. Table 1 compares some typical approaches.

Inspired by the above studies, this paper adopts a deep reinforcement learning approach based on position grid search to optimize the practical placement and phase of the RIS, aiming to address this non-convex optimization problem. On the one hand, since the available deployment positions for RIS are limited in specific environments, RIS placement can be regarded as a pattern selection problem, allowing the use of a classification network to determine the optimal deployment scheme. On the other hand, deployment cost and system performance can be jointly considered to evaluate the overall effectiveness of the deployment strategy.

3. System Model

The RIS-assisted wireless communication system model is illustrated in Figure 1 [26]. It includes a BS equipped with $N_{B}$ antennas, an RIS with $N_{R}$ reflecting elements, an RIS controller positioned adjacent to the RIS and linked to the BS, and $K$ user devices. The BS provides service to users via both direct and reflected links. In this system, the direct link from the BS to user $k$ is denoted by $h_{d,k} \in \mathbb{C}^{N_{B} \times 1}$, the channel matrix from the BS to the RIS is represented by $G \in \mathbb{C}^{N_{B} \times N_{R}}$, and the reflected link from the RIS to user $k$ is denoted by $h_{r,k} \in \mathbb{C}^{N_{R} \times 1}$.

Based on this model, the overall system’s received signal can be formulated as

y = h_{d} s + G Θ h_{r} s + ω

(1)

where $h_{d}$ denotes the direct channel matrix from the BS to the users, $s$ represents the transmitted signal matrix, $Θ$ is the phase shift matrix of the RIS, $h_{r}$ is the reflected channel matrix from the RIS to the users, and $ω$ denotes the additive white Gaussian noise (AWGN) [27].

Each variable is defined as follows. The direct channel from the BS to user $k$ is denoted as $h_{d,k}$, and these are aggregated into the channel matrix $h_{d}$:

h_{d} = [h_{d,1}, h_{d,2}, h_{d,3}, \dots, h_{d,K}] \in \mathbb{C}^{N_{B} \times K}

(2)

$s$ represents the transmit signal intended for each user:

s = [s_{1}, s_{2}, s_{3}, \dots, s_{K}] \in \mathbb{C}^{K \times 1}

(3)

The reflected channel from the RIS to user $k$ is expressed as $h_{r,k}$, and these are aggregated into the channel matrix $h_{r}$:

h_{r} = [h_{r,1}, h_{r,2}, h_{r,3}, \dots, h_{r,K}] \in \mathbb{C}^{N_{R} \times K}

(4)

$Θ$ is the phase adjustment matrix for the RIS reflecting elements, where $φ[n]$ denotes the phase shift of the $n$th RIS element. The phase shift $φ[n] \in \{-π, \frac{-2^{r} + 2}{2^{r}} π, \frac{-2^{r} + 4}{2^{r}} π, \dots, π\}$ belongs to a quantized finite set of discrete phase values [18].

Θ = \mathrm{diag}\{e^{-jφ[1]}, e^{-jφ[2]}, \dots, e^{-jφ[N_{R}]}\} \in \mathbb{C}^{N_{R} \times N_{R}}

(5)
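As a minimal sketch of the quantized phase set and of Equation (5), the following NumPy snippet builds the discrete phases and the diagonal matrix $Θ$. The function names, the choice $r = 2$, and the example index vector are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def discrete_phase_set(r: int) -> np.ndarray:
    """Quantized phases {-pi, (-2^r + 2)/2^r * pi, ..., pi} for resolution r."""
    levels = 2 ** r
    steps = np.arange(levels + 1)              # i = 0, 1, ..., 2^r
    return (-levels + 2 * steps) / levels * np.pi

def phase_shift_matrix(phase_indices, phase_set: np.ndarray) -> np.ndarray:
    """Theta = diag(exp(-j * phi[n])), following Equation (5)."""
    phi = phase_set[np.asarray(phase_indices)]
    return np.diag(np.exp(-1j * phi))

# Hypothetical example: r = 2 and a 4-element RIS.
phases = discrete_phase_set(r=2)
theta = phase_shift_matrix([0, 2, 4, 1], phases)
```

Every diagonal entry of `theta` has unit modulus, so in this sketch the RIS changes only the phase of the reflected signal, not its amplitude.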

The RIS link consists of two cascaded segments, so the path loss via the RIS is the product of the attenuation of each segment and can be expressed as

P_{LR} = \left(\frac{λ}{4π d_{r,k}^{n,1}}\right)^{2} \left(\frac{λ}{4π d_{r,k}^{n,2}}\right)^{2}

(6)

where $λ$ denotes the wavelength. In Figure 2, $d_{k}$ represents the distance from the base station to user $k$, $d_{r,k}^{n,1}$ denotes the BS-RIS distance for user $k$, and $d_{r,k}^{n,2}$ represents the distance from the RIS to user $k$.
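The cascaded path loss of Equation (6) can be sketched as a small helper. The function and argument names are hypothetical, and the wavelength in the example is an arbitrary illustrative value rather than a parameter from the paper.

```python
import math

def cascaded_path_loss(wavelength: float, d_bs_ris: float, d_ris_user: float) -> float:
    """Product of the two free-space segment losses, following Equation (6)."""
    def segment(distance: float) -> float:
        return (wavelength / (4.0 * math.pi * distance)) ** 2
    return segment(d_bs_ris) * segment(d_ris_user)

# Illustrative values only (wavelength in metres, distances in metres).
loss = cascaded_path_loss(wavelength=0.1, d_bs_ris=13.0, d_ris_user=13.0)
```

Because the two segment losses multiply, doubling either distance reduces the received power by a factor of four in this model.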

The channel follows a Rician fading model, incorporating both line-of-sight (LoS) and non-line-of-sight (NLoS) components. Taking the direct link $h_{d,k}$ as an example [28],

h_{d,k} = \sqrt{P_{LR}} \left(\sqrt{\frac{ε}{ε + 1}} h_{d,k}^{LoS} + \sqrt{\frac{1}{ε + 1}} h_{d,k}^{NLoS}\right)

(7)

where $ε$ denotes the Rician factor, and $h_{d,k}^{LoS}$ and $h_{d,k}^{NLoS}$ denote the LoS and NLoS components, respectively. Under this system model, the signal-to-interference-plus-noise ratio (SINR) of user $k$ is expressed as

\mathrm{SINR}_{k} = \frac{|(h_{d,k} + h_{r,k} Θ_{k} G) s_{k}|^{2}}{σ^{2} + \sum_{j \neq k} |(h_{d,k} + h_{r,k} Θ_{k} G) s_{j}|^{2}}

(8)

where the numerator $|(h_{d,k} + h_{r,k} Θ_{k} G) s_{k}|^{2}$ represents the received signal power for user $k$, while the denominator $σ^{2} + \sum_{j \neq k} |(h_{d,k} + h_{r,k} Θ_{k} G) s_{j}|^{2}$ represents the combined AWGN and inter-user interference experienced by user $k$.

The system sum-rate is expressed as

R_{\mathrm{sum}} = \sum_{k=1}^{K} \log_{2}(1 + \mathrm{SINR}_{k})

(9)
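Equations (8) and (9) can be evaluated numerically as in the following sketch. It rests on two assumptions that are not stated in the paper: the effective channel is composed in the order used in Equation (1), $h_{d} + GΘh_{r}$, so that the stated dimensions match, and each squared magnitude is read as the squared norm of user $k$'s effective channel multiplied by the symbol power $|s_{j}|^{2}$. All names are hypothetical.

```python
import numpy as np

def sum_rate(h_d, G, theta, h_r, symbol_power, noise_power):
    """Sum-rate in bit/s/Hz from Equations (8) and (9) under the stated assumptions."""
    h_eff = h_d + G @ theta @ h_r                   # shape (N_B, K)
    gain = np.sum(np.abs(h_eff) ** 2, axis=0)       # per-user squared channel norm
    power = np.asarray(symbol_power, dtype=float)   # |s_j|^2 for each user j
    signal = gain * power
    interference = gain * (power.sum() - power)     # all symbols except user k's own
    sinr = signal / (noise_power + interference)
    return float(np.sum(np.log2(1.0 + sinr)))

# Toy example: N_B = 2 antennas, N_R = 3 elements, K = 2 users, no reflected path.
rate = sum_rate(
    h_d=np.ones((2, 2)), G=np.zeros((2, 3)), theta=np.eye(3),
    h_r=np.zeros((3, 2)), symbol_power=[1.0, 1.0], noise_power=1.0,
)
```

In the toy example each user has channel gain 2, signal power 2, and interference power 2, so each SINR equals 2/3.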

To adaptively optimize the RIS configuration for sum-rate maximization in RIS-assisted systems, the objective is formulated as

\max_{Θ, b} R_{\mathrm{sum}} \quad \text{s.t.} \quad p_{r} \leq p_{\max}, \; b \in \{0,1\}^{M \times N}

(10)

The maximum rate is achieved by adjusting the phase shift matrix $Θ$ of the reflecting elements and selecting the deployment region $b$, while ensuring that the RIS-assisted transmit power $p_{r}$ does not surpass the upper bound $p_{\max}$. The deployment area is divided into an $M \times N$ grid of locations, and the binary matrix $b$ indicates which location is selected.

4. PosGrid-DDQN Deployment Method

4.1. RIS Deployment Method

Traditional RIS designs rely on passively configuring the reflection coefficients of elements for specific scenarios, resulting in limited adaptability to dynamic environments [29].

We propose a deployment method called Position-Grid Double Deep Q-Network (PosGrid-DDQN), in which the deployment area is divided into a grid and DDQN-based phase optimization is performed at each grid point to systematically explore the optimal RIS placement. Discretizing the continuous space into a manageable set of locations significantly reduces computational complexity and makes per-location phase optimization tractable. By jointly controlling the deployable locations and the reflection phases of the RIS elements, the sum-rate at each grid point is calculated and the position with the highest sum-rate is selected, ensuring the system’s adaptability to dynamic environmental changes.

The detailed interaction mechanism of the agent within the environment is shown in Figure 3. The RIS controller acts as the agent, serving as the core decision-making component in reinforcement learning. The environment consists of the entire communication system, including the BS, RIS, and multiple users, and provides the state of the links at each time step. Given the present state of the environment (such as user locations, channel characteristics, and RIS configuration), the agent selects actions, primarily adjusting the RIS’s phase shifts. After receiving the action, the environment updates the RIS configuration and recalculates the channel states (such as the direct channel $h_{d}$, the reflected channel $h_{r}$, and the BS-RIS channel $G$), thereby generating a new state and the corresponding reward. The agent uses this feedback to continuously optimize its decision-making policy through deep reinforcement learning algorithms, enabling dynamic control of the RIS and enhancing the communication system performance.

In one episode, let the rewards received by the agent from the start to the end of the episode be $r_{1}, r_{2}, r_{3}, \dots, r_{n}$. The objective of reinforcement learning is to determine a policy that maximizes the expected discounted return in this episode, where the discounted return is defined as

U_{t} = r_{t} + γ r_{t+1} + γ^{2} r_{t+2} + \dots + γ^{n-t} r_{n}

(11)

with discount factor $γ \in [0, 1]$. At time step $t$ before the episode ends, $U_{t}$ is a random variable whose uncertainty arises from the agent’s actions and the states of the environment after time $t$. The agent’s policy function for taking actions is expressed as

π(a | s) = \Pr(a_{t} = a \mid s_{t} = s)

(12)

The agent aims to identify an optimal policy that maximizes the expected discounted return, which can be mathematically expressed as

\max_{π} E(U_{t})

(13)
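For a single recorded episode, the discounted return of Equation (11) can be computed with a backward recursion, as in this minimal sketch. The function name and the example rewards are illustrative only.

```python
def discounted_return(rewards, gamma):
    """U_t for the reward sequence r_t, ..., r_n, following Equation (11)."""
    total = 0.0
    for reward in reversed(rewards):
        # U_t = r_t + gamma * U_{t+1}
        total = reward + gamma * total
    return total

example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```

The recursion makes explicit that a smaller $γ$ weights near-term rewards more heavily than later ones.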

The deployment strategy is presented in Algorithm 1.

Algorithm 1 RIS Deployment Method

Initialize environment: reset the environment to its initial state;
Set the initial phase shift $θ$ of the RIS;
Set the deployment position of the RIS;

1: For each PosGrid:
2:     Reset the environment to its initial state;
3:     Solve the RIS phase shifts using the DDQN method (see Section 4.2);
4:     Record average_reward and region information;
5: End for
6: Output: all values of average_reward and their region information.
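The outer loop of Algorithm 1 can be sketched as follows. The per-position phase optimizer is passed in as a callable named `optimize_phase`, which here stands in for the DDQN procedure of Section 4.2; this callable, the helper names, and the grid resolution are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple
import numpy as np

Position = Tuple[float, float]

def make_grid(x_range, y_range, m: int, n: int) -> List[Position]:
    """Discretize a rectangular deployment area into m x n candidate points."""
    xs = np.linspace(x_range[0], x_range[1], m)
    ys = np.linspace(y_range[0], y_range[1], n)
    return [(float(x), float(y)) for x in xs for y in ys]

def grid_deployment(positions: List[Position],
                    optimize_phase: Callable[[Position], float]):
    """Evaluate every grid point and return the best one plus all scores."""
    results: Dict[Position, float] = {}
    for pos in positions:
        # In the paper this step is the DDQN phase optimization;
        # here it is whatever callable the caller supplies.
        results[pos] = optimize_phase(pos)
    best = max(results, key=results.get)
    return best, results

# Stand-in scorer for demonstration only: peaks at (5, 0).
grid = make_grid((0.0, 10.0), (-5.0, 5.0), m=3, n=3)
best_pos, scores = grid_deployment(grid, lambda p: -((p[0] - 5.0) ** 2 + p[1] ** 2))
```

Keeping the scorer as a parameter separates the grid search from the learning procedure, so the same loop works with any phase optimizer that returns an average reward.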

The specific DDQN-based phase adjustment method is detailed in the following section.

4.2. Phase Adjustment Algorithm Based on DDQN

This section describes in detail the DDQN algorithm introduced in Section 4.1. DDQN is an advanced deep reinforcement learning method for discrete action control, with its principle illustrated in Figure 4. By employing deep neural networks as the target and evaluation networks, it effectively addresses the decision-making problem in RIS deployment.

The agent interacts with the environment to acquire the current state $s_{t}$ and selects an action $a_{t}$ based on the current value network. Upon executing $a_{t}$ in the environment, a reward $r_{t}$ and the next state $s_{t+1}$ are returned. These interaction data are stored as tuples $(s_{t}, a_{t}, r_{t}, s_{t+1})$ in the experience replay buffer. Here, $s_{t}$ includes information such as the channel states of the communication links and user locations, $a_{t}$ signifies the phase adjustment of the RIS executed by the RIS controller, and the returned reward $r_{t}$ is expressed in terms of the received sum-rate. After executing $a_{t}$, the system transitions to the next state $s_{t+1}$. The value network performs forward propagation to obtain

q_{t} = Q_{\mathrm{value}}(s_{t}, a_{t}; ω)

(14)

Then, the optimal action at the next state is selected using the value network:

a^{*} = \arg\max_{a} Q_{\mathrm{value}}(s_{t+1}, a; ω)

(15)

This action is evaluated using the weights $ω^{-}$ of the target network, yielding

q_{t+1} = Q_{\mathrm{target}}(s_{t+1}, a^{*}; ω^{-})

(16)

Using the obtained reward $r_{t}$, the value network’s estimate $q_{t}$, and the target network’s estimate $q_{t+1}$, the Temporal-Difference (TD) target, TD error, and loss are calculated as follows:

y_{t} = r_{t} + γ q_{t+1}

(17)

δ_{t} = q_{t} - y_{t}

(18)

\mathrm{Loss} = δ_{t}^{2} = (r_{t} + γ q_{t+1} - q_{t})^{2}

(19)

Backpropagation is performed on the value network, and its parameters are updated through gradient descent:

ω' = ω - α δ_{t} \nabla_{ω} q_{t}

(20)

The target network’s parameters are updated as follows:

{ω^{-}}' = τ ω' + (1 - τ) ω^{-}

(21)

The learning rate $α \in (0,1)$ and the soft update coefficient $τ \in (0,1)$ are hyperparameters.
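The double-Q target and the soft update can be sketched with plain NumPy arrays standing in for network outputs and weights. The value network picks the greedy next action and the target network evaluates it, which is the decoupling that distinguishes DDQN from a standard DQN. The array-based interface and all names below are assumptions made for illustration; a practical implementation would use a deep learning framework.

```python
import numpy as np

def ddqn_td_target(rewards, q_value_next, q_target_next, gamma):
    """TD target y_t, with action selection and evaluation split across networks."""
    a_star = np.argmax(q_value_next, axis=1)                 # greedy action, value network
    q_next = q_target_next[np.arange(len(a_star)), a_star]   # evaluated by target network
    return rewards + gamma * q_next                          # Equation (17)

def soft_update(w_target, w_value, tau):
    """Blend the value-network weights into the target network, Equation (21)."""
    return tau * w_value + (1.0 - tau) * w_target

# Batch of two transitions with two possible actions each (made-up numbers).
y = ddqn_td_target(
    rewards=np.array([1.0, 2.0]),
    q_value_next=np.array([[1.0, 3.0], [2.0, 0.0]]),
    q_target_next=np.array([[10.0, 20.0], [30.0, 40.0]]),
    gamma=0.5,
)
new_target = soft_update(np.array([0.0]), np.array([1.0]), tau=0.1)
```

In the example the value network selects actions 1 and 0, so the target network contributes 20 and 30 rather than its own maxima of 20 and 40.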

The DDQN algorithm for solving RIS phase control is presented in Algorithm 2.

Algorithm 2 Solving RIS Phase Shift Based on DDQN Method

Initialize the environment and the experience replay buffer $D$ with capacity $M$;
Initialize value network $Q_{\mathrm{value}}(s_{t}, a_{t}; ω)$ and target network $Q_{\mathrm{target}}(s_{t}, a_{t}; ω^{-})$;
Set $ω^{-} = ω$;

1: Set learning rate $α$, discount factor $γ$, and $ε$-greedy strategy parameter $ε$;
2: For each episode:
3:     Reset the environment to $s_{0}$;
4:     Set $reward = 0$;
5:     For each time step $t$:
6:         Input $s_{t}$ and $a$ to the value network, obtain the state-action value $Q_{\mathrm{value}}(s_{t}, a; ω)$;
7:         Select action $a_{t}$ from $Q_{\mathrm{value}}(s_{t}, a; ω)$ using the $ε$-greedy policy;
8:         Receive the reward $r_{t}$ and the estimated channel $H_{t+1}$, and compute the next state $s_{t+1}$ from $H_{t+1}$, $s_{t}$, $a_{t}$;
9:         If size of $D < M$:
10:            Store $(s_{t}, a_{t}, r_{t}, s_{t+1})$ in $D$;
11:        Else:
12:            Replace a random experience in $D$ with $(s_{t}, a_{t}, r_{t}, s_{t+1})$;
13:        End if
14:        Sample a batch $(s_{t}, a_{t}, r_{t}, s_{t+1})$ from $D$;
15:        Calculate target value $Q_{\mathrm{target}}(s_{t+1}, a^{*}; ω^{-})$ according to Equation (16);
16:        Calculate the TD target $y_{t}$ and TD error $δ_{t}$ according to Equations (17) and (18);
17:        Update priority for experience $(s_{t}, a_{t})$ in $D$ based on $δ_{t}$;
18:        Update value network $Q_{\mathrm{value}}(s_{t}, a; ω)$:
19:            Minimize loss function $\mathrm{Loss}$ according to Equation (19);
20:            Update the weights of $Q_{\mathrm{value}}(s_{t}, a; ω)$ using gradient descent;
21:        Every certain number of steps, copy the weights of $Q_{\mathrm{value}}(s_{t}, a; ω)$ to $Q_{\mathrm{target}}(s_{t}, a_{t}; ω^{-})$;
22:        Update state $s_{t} = s_{t+1}$;
23:        $reward = reward + r_{t}$;
24:        $average\_reward = reward / episodes$;
25:    End for
26: End for
27: Output: average_reward
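Two building blocks of Algorithm 2, the fixed-capacity replay buffer with random replacement (steps 9 to 14) and $ε$-greedy action selection (step 7), can be sketched in plain Python. The class and function names are hypothetical. Sampling here is uniform, which is a simplification: the priority update of step 17 is not implemented in this sketch.

```python
import random

class ReplayBuffer:
    """Fixed-capacity buffer; once full, a random stored experience is overwritten."""

    def __init__(self, capacity: int, seed=None):
        self.capacity = capacity
        self.data = []
        self.rng = random.Random(seed)

    def store(self, transition):
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.rng.randrange(self.capacity)] = transition

    def sample(self, batch_size: int):
        # Uniform sampling without replacement, capped at the current size.
        return self.rng.sample(self.data, min(batch_size, len(self.data)))

    def __len__(self):
        return len(self.data)

def epsilon_greedy(q_values, epsilon: float, rng: random.Random) -> int:
    """Random action with probability epsilon, otherwise the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

buffer = ReplayBuffer(capacity=2, seed=0)
for step in range(3):
    buffer.store((step, 0, 0.0, step + 1))   # (s_t, a_t, r_t, s_{t+1}) placeholders
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0, rng=random.Random(0))
```

With `epsilon=0.0` the policy is purely greedy, which is a common setting when evaluating a trained agent rather than training it.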

The computational complexity of the PosGrid-DDQN method primarily stems from the forward propagation of the DDQN networks and the phase optimization over the grid. Per iteration, the complexity is O(|S|·|A|) for Q-value updates, where |S| and |A| denote the state and action space sizes. Compared to exhaustive search (O(N^M)) and conventional heuristic methods, our approach achieves a favorable balance between performance and computational efficiency, enabling real-time adaptability in dynamic environments.

5. Simulation and Analysis

5.1. Parameter Settings

The simulation environment is configured as follows: the base station is located at [5, −10, 1.5]; two users are deployed at [4, 10, 1.5] and [6, 10, 1.5]; the RIS, with 16 reflecting elements, is deployed at height $z = 10$ within the 2D region [0, 10] × [−5, 5] of the xy-plane. The Rician channel factor is set to $ε = 5$. Additional simulation parameters are presented in Table 2.
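The geometry described above can be captured in a few lines to obtain the link distances for any candidate RIS position. The coordinates come from the parameter settings; the helper name and the example position are assumptions for illustration, and the coordinate units are not stated in this section.

```python
import math

BS = (5.0, -10.0, 1.5)
USERS = [(4.0, 10.0, 1.5), (6.0, 10.0, 1.5)]
RIS_HEIGHT = 10.0

def link_distances(ris_xy):
    """BS-RIS distance and RIS-user distances for an RIS at (x, y, RIS_HEIGHT)."""
    ris = (ris_xy[0], ris_xy[1], RIS_HEIGHT)
    d_bs_ris = math.dist(BS, ris)
    d_ris_users = [math.dist(ris, user) for user in USERS]
    return d_bs_ris, d_ris_users

# Hypothetical candidate position at the centre of the deployment region.
d1, d2 = link_distances((5.0, 0.0))
```

These distances are the inputs required by the cascaded path loss of Equation (6) at each grid point.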

5.2. Phase Optimization Results and Analysis

Figure 5 depicts the sum-rate variation within the RIS-assisted system as the number of training episodes increases. As the episodes progress, the training results using the DDQN method significantly outperform those with random phase settings.

Additionally, by averaging the sum-rate values over all training episodes, it is observed that DDQN achieves ~30% higher average performance, approaching 90% of ideal capacity (theoretical limit). The convergence curve illustrates rapid learning, with DDQN surpassing random methods by Episode 200, validating its efficacy in dynamic environments. This improvement is ascribed to the phase optimization applied to each reflecting element, facilitated by a more effective exploration strategy, resulting in superior performance compared to random phase assignments.

Figure 6 shows the relationship between the count of RIS reflecting elements and the sum-rate. As the count of reflecting elements increases, the system’s sum-rate performance improves, with the deep reinforcement learning optimized method significantly outperforming the random phase approach. On the other hand, when the count of reflecting elements reaches a certain level, the incremental gain in sum-rate begins to diminish. Therefore, in practical applications, it is important to select an appropriate count of reflecting elements.

Figure 7 presents the impact of varying user-side noise levels on system sum-rate performance under the DRL framework. As shown in the figure, a clear negative correlation exists between noise level and achievable sum-rate: lower noise levels correspond to higher sum-rates. Notably, the marginal gain in sum-rate exhibits an accelerating trend as noise decreases—specifically, a 120% performance gain is observed at −15 dB SNR—indicating non-linear performance improvement under low-interference conditions. Segmented analysis further reveals that the DRL-based strategy has limited impact in high-noise regimes (signal-degraded environments) but delivers transformative gains in low-noise zones. This phenomenon arises from the fundamental principle that reduced signal interference, given constant information payload, enhances channel capacity and overall system efficiency. Consequently, when deploying RIS, prioritizing locations that minimize environmental interference and optimize SNR is critical for maximizing system performance.

Although the proposed PosGrid-DDQN relies on specific environmental features, our scheme can adapt to unseen environments through online fine-tuning. While significant changes in propagation characteristics may necessitate retraining to maintain optimal performance, the existing model provides a robust initialization.

5.3. Deployment Results Analysis

This section conducts a comparative assessment of the proposed RIS deployment strategy based on DDQN against a random deployment baseline. The experimental results, validated through simulation under controlled conditions, are systematically analyzed across multiple deployment scenarios.

Figure 8 illustrates the achievable sum-rate performance comparison between DDQN-optimized and random RIS deployments. The proposed DDQN approach achieves statistically significant improvements (p < 0.01) across all regions, with a median gain of 22.7% and maximal gains (38.4%) at edge locations near communication endpoints. This spatial performance gradient validates DDQN’s ability to adaptively exploit positional advantages.

Further spatial analysis is conducted through Figure 9, which examines the performance variation along both horizontal (X-axis) and vertical (Y-axis) coordinates. The results exhibit distinct patterns:

X-axis analysis is shown in Figure 9a,b. The DDQN method shows strong positional sensitivity, with optimal performance observed near the base station (BS) at X-axis center (median improvement 28.3%). Performance degrades linearly with distance from BS, consistent with path loss models. In contrast, the random method displays weak positional correlation (R² = 0.12 vs. 0.89 for DDQN), indicating inferior spatial adaptation capability.

Y-axis analysis is shown in Figure 9c,d. The DDQN method maintains clear positional dependence, achieving peak performance when RIS units are co-located with user clusters (Y-axis extremes). The random method exhibits near-uniform performance distribution (variance coefficient 0.08 vs. 0.32 for DDQN), confirming its inability to exploit positional advantages.

Through the combined analysis of the X-axis and Y-axis, we can draw the following conclusions: Firstly, the deployment location of the RIS has a decisive impact on its performance, with proximity to communication endpoints (BS/users) significantly enhancing system efficiency.

Secondly, the DDQN method, by jointly optimizing deployment location and phase control, achieves a significant advantage over the random deployment baseline, demonstrating its robust adaptability in complex wireless environments.

Figure 10 provides a focused comparison at edge and center locations. At edge positions (specifically those near BS/users), the DDQN method achieves a 38.4% higher sum-rate compared to random deployment. Conversely, at center locations, the improvement diminishes to 14.7% but remains statistically significant. This indicates that the impact of deployment location on system performance exhibits distinct spatial variation, with locations closer to communication endpoints yielding greater performance gains.

This finding has significant implications for practical deployments, suggesting that, with limited resources, RIS should be prioritized for deployment near BS or user locations to maximize system performance. Furthermore, the DDQN framework offers valuable insights for resource allocation across diverse deployment scenarios.

In conclusion, the experimental results demonstrate that joint optimization of deployment location and phase control within the proposed DDQN framework enables superior performance compared to baseline methods. These findings are particularly relevant for next-generation wireless networks, emphasizing the need for intelligent strategies in complex environments requiring efficient resource allocation and interference management.

6. Conclusions

In the era of a burgeoning number of communication devices, guaranteeing seamless and reliable communication for a massive device population stands as a pivotal research challenge. Conventional base stations are hindered by environmental constraints and the practical difficulties of large-scale infrastructure deployment.

To address these issues, this study presents RIS as a relaying solution to improve communication quality. We propose a unified optimization framework for RIS deployment and phase shift adjustment, along with the PosGrid-DDQN deployment algorithm, a deep reinforcement learning-based approach. This algorithm can pinpoint optimal deployment locations and feasible pre-deployment phase strategies, achieving a dual optimization of enhancing system sum-rate and minimizing the number of RIS deployments for efficient and precise communication resource allocation.

Simulation results under an experimentally validated setup show that the proposed method for joint control of RIS placement and phase shifts brings about significant performance enhancements. It achieves up to a 38% increase in system sum-rate and reduces RIS deployment density by 25% in typical urban scenarios compared to baseline strategies.

Future research will focus on several important extensions: (1) enhancing the robustness of the PosGrid-DDQN framework under imperfect and delayed channel state information through robust optimization and online prediction techniques; (2) integrating advanced channel models (e.g., 3GPP UMi/UMa) and real-world measurements to better capture near-field effects and dynamic blockages; (3) extending the current sum-rate maximization framework to multi-objective optimization, balancing spectral efficiency with fairness, energy efficiency, and latency; (4) exploring the synergy between the proposed RIS deployment strategy and advanced communication architectures such as massive MIMO and beamforming systems to improve the scalability and adaptability of next-generation networks.

Author Contributions

Conceptualization, G.Y. and Y.W.; methodology, G.Y.; software, H.H.; validation, H.H. and C.S.; formal analysis, H.H.; investigation, C.S.; resources, X.X.; data curation, C.S.; writing—original draft preparation, H.H.; writing—review and editing, Y.W.; visualization, S.H.; supervision, Y.W.; project administration, Y.W.; funding acquisition, G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fujian Province Innovation Strategy Research Project, grant number 2025R0045.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We extend our sincere thanks to all colleagues who contributed constructive feedback during the manuscript revision process.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

RIS	Reconfigurable Intelligent Surface
BS	Base Station
UWSNs	Underwater Wireless Sensor Networks
UAV	Unmanned Aerial Vehicle
NOMA	Non-Orthogonal Multiple Access
JCAS	Joint Communication and Sensing
ISAC	Integrated Sensing and Communication
MIMO	Multiple-Input Multiple-Output
OFDM	Orthogonal Frequency-Division Multiplexing
DDQN	Double Deep Q-Network
DRL	Deep Reinforcement Learning
AI	Artificial Intelligence
PGA	Projected Gradient Ascent
CMA	Coverage Maximization Algorithm
TD	Temporal-Difference
PosGrid-DDQN	Position-Grid Double Deep Q-Network
LoS	Line-of-Sight
NLoS	Non-Line-of-Sight
AWGN	Additive White Gaussian Noise
SNR	Signal-to-Noise Ratio
EE	Energy Efficiency
PLS	Physical Layer Security
NB	Number of BS Antennas
NR	Number of RIS Elements
AUV	Autonomous Underwater Vehicle
5G	Fifth-Generation
6G	Sixth-Generation

References

1. Giuliano, R. From 5G-Advanced to 6G in 2030: New Services, 3GPP Advances, and Enabling Technologies. IEEE Access 2024, 12, 63238–63270.
2. Tariq, F.; Khandaker, M.R.A.; Wong, K.-K.; Imran, M.A.; Bennis, M.; Debbah, M. A Speculative Study on 6G. IEEE Wirel. Commun. 2020, 27, 118–125.
3. Ahamed, M.M.; Alresheedi, F.; Islam, S.M.R.; Azad, M.R.K.; Sarkar, M.Z.I. 5G Network Coverage Planning and Analysis of the Deployment Challenges. Sensors 2021, 21, 6608.
4. Pan, C.; Zhou, G.; Zhi, K.; Hong, S.; Wu, T.; Pan, Y.; Ren, H.; Renzo, M.D.; Swindlehurst, A.L.; Zhang, R.; et al. Reconfigurable Intelligent Surfaces for 6G Systems: Principles, Applications, and Research Directions. IEEE Commun. Mag. 2021, 59, 14–20.
5. Ni, W.; Zheng, A.; Wang, W.; Niyato, D.; Al-Dhahir, N.; Debbah, M. From single to multi-functional RIS: Architecture, key technologies, challenges, and applications. IEEE Netw. 2024, 39, 38–46.
6. Liu, Y.; Liu, X.; Mu, X.; Hou, T.; Xu, J.; Di Renzo, M.; Al Dhahir, N. Reconfigurable Intelligent Surfaces: Principles and Opportunities. IEEE Commun. Surv. Tutor. 2021, 23, 1546–1577.
7. Le, C.-B.; Do, D.-T.; Li, X.; Huang, Y.-F.; Chen, H.-C.; Voznak, M. Enabling NOMA in Backscatter Reconfigurable Intelligent Surfaces Aided Systems. IEEE Access 2021, 9, 33782–33795.
8. Alsenwi, M.; Abolhasan, M.; Lipman, J. RIS-UAV integration for enhanced coverage and energy-efficient 6G wireless networks. IEEE Trans. Green Commun. Netw. 2026, 10, 160–171.
9. Guo, T.; Wang, Y.; Xu, L.; Mei, M.; Shi, J.; Dong, L.; Xu, Y.; Huang, C. Joint Communication and Sensing Design for Multihop RIS Aided Communication Systems in Underground Coal Mines. IEEE Internet Things J. 2023, 10, 19533–19544.
10. Hu, S.; Rusek, F. Spherical Large Intelligent Surfaces. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 8673–8677.
11. Han, Y.; Tang, W.; Jin, S.; Wen, C.-K.; Ma, X. Large Intelligent Surface Assisted Wireless Communication Exploiting Statistical CSI. IEEE Trans. Veh. Technol. 2019, 68, 8238–8242.
12. Kilcioglu, E.; Oestges, C. Ray-tracing-based RIS deployment optimization for indoor coverage enhancement. IEEE Open J. Antennas Propag. 2025, 6, 1444–1462.
13. Huang, C.; Zappone, A.; Alexandropoulos, G.C.; Debbah, M.; Yuen, C. Reconfigurable Intelligent Surfaces for Energy Efficiency in Wireless Communication. IEEE Trans. Wirel. Commun. 2019, 18, 4157–4170.
14. Raeisi, M.; Khaleel, A.; Ilter, M.C.; Gerami, M.; Basar, E. A comprehensive design framework for UE-side and BS-side RIS deployments. IEEE Wirel. Commun. 2025, 32, 148–155.
15. Li, Z.; Hua, M.; Wu, Q.; Wang, H.; Swindlehurst, A.L. Phase Shift Design in RIS Empowered Wireless Networks: From Optimization to AI Based Methods. Network 2022, 2, 398–418.
16. Zhang, H.; Ma, S.; Shi, Z.; Zhao, X.; Yang, G. Sum-Rate Maximization of RIS-Aided Multi-User MIMO Systems with Statistical CSI. IEEE Trans. Wirel. Commun. 2023, 22, 4788–4801.
17. Shtaiwi, E.; Zhang, H.; Abdelhadi, A.; Swindlehurst, A.L.; Han, Z.; Poor, H.V. Sum-Rate Maximization for RIS-Assisted Integrated Sensing and Communication Systems with Manifold Optimization. IEEE Trans. Commun. 2023, 71, 4909–4923.
18. Wang, W.; Zhang, W. Intelligent Reflecting Surface Configurations for Smart Radio Using Deep Reinforcement Learning. IEEE J. Sel. Areas Commun. 2022, 40, 2335–2346.
19. Chen, P.; Li, X.; Matthaiou, M.; Jin, S. DRL-Based RIS Phase Shift Design for OFDM Communication Systems. IEEE Wirel. Commun. Lett. 2023, 12, 733–737.
20. Nayak, N.; Kalyani, S.; Suraweera, H.A. A DRL Approach for RIS-Assisted Full-Duplex UL and DL Transmission: Beamforming, Phase Shift, and Power Optimization. IEEE Trans. Wirel. Commun. 2024, 23, 14652–14666.
21. Zhao, J.; Liu, X.; Wang, Y.; Chen, Z. Optimal reconfigurable intelligent surface deployment for secure communication in cell-free massive multiple-input multiple-output systems with coverage area. Electronics 2025, 14, 241.
22. Zeng, S.; Zhang, H.; Di, B.; Han, Z.; Song, L. Reconfigurable Intelligent Surface (RIS) Assisted Wireless Coverage Extension: RIS Orientation and Location Optimization. IEEE Commun. Lett. 2021, 25, 269–273.
23. Wang, Y.; Zhang, Y.; Ren, Y.; Pang, L.; Chen, Y.; Li, J. Joint BS-RIS-User Association and Deployment Design for Multi-RIS-Aided Wireless Networks. IEEE Commun. Lett. 2024, 28, 2181–2185.
24. Zhang, Q.; Wu, H.; Li, H.; Song, Z.; Hou, S. Joint Location and Beamforming Design for Energy Efficient STAR-RIS-Aided ISAC Systems. IEEE Commun. Lett. 2025, 29, 140–144.
25. Guo, H.; Yang, Z.; Zou, Y.; Lyu, B.; Jiang, Y.; Hanzo, L. Joint Reconfigurable Intelligent Surface Location and Passive Beamforming Optimization for Maximizing the Secrecy-Rate. IEEE Trans. Veh. Technol. 2023, 72, 2098–2110.
26. Yaswanth, J.; Singh, S.K.; Singh, K.; Flanagan, M.F. Energy-Efficient Beamforming Design for RIS-Aided MIMO Downlink Communication with SWIPT. IEEE Trans. Green Commun. Netw. 2023, 7, 1164–1180.
27. Karacora, Y.; Umra, A.; Sezgin, A. Robust communication design in RIS-assisted THz channels. IEEE Open J. Commun. Soc. 2025, 6, 3029–3043.
28. Qin, X.; Liu, Y.; Liu, Z.; Gao, Y.; Renzo, M.D.; Hanzo, L. Deep-Reinforcement-Learning-Based Uplink Security Enhancement for STAR-RIS-Assisted NOMA Systems with Dual Eavesdroppers. IEEE Internet Things J. 2024, 11, 28050–28063.
29. Ma, Y.; Li, M.; Liu, Y.; Wu, Q.; Liu, Q. Optimization for Reflection and Transmission Dual-Functional Active RIS-Assisted Systems. IEEE Trans. Commun. 2023, 71, 5534–5548.

Figure 1. RIS-assisted system model.

Figure 2. Distance relationships in the RIS deployment space.

Figure 3. DRL-Driven RIS–Environment Interaction Framework.

Figure 4. DDQN Framework Diagram.

Figure 5. System capacity comparison between DDQN and Random Methods.

Figure 6. Impact of the number of RIS reflecting elements on the sum-rate.

Figure 7. System capacity vs. noise level (SNR) for DDQN-optimized RIS.

Figure 8. Deployment performance of different methods.

Figure 9. Impact of deployment location on performance.

Figure 10. Performance comparison at edge and center locations.

Table 1. The comparison of some typical approaches.

Reference	Approach	Strengths	Weaknesses
[21]	Optimized RIS Placement and Phase Design	Enhances ergodic secrecy rate through optimized placement	Limited to cell-free systems
[22]	Coverage Maximization Algorithm (CMA)	Optimizes RIS deployment for maximum coverage	Assumes uniform environmental conditions
[23]	Two-level Nested Algorithm for Multi-cell System	Jointly optimizes related coefficients and RIS deployment locations	Computationally intensive
[24]	Alternating Optimization for ISAC System	Maximizes energy efficiency through joint optimization	Assumes fixed environmental conditions

Table 2. Simulation parameters.

Symbol	Description	Value
$e$	Batch size	32
$γ$	Discount factor	0.95
$ε_{max}$	Maximum exploration rate	1
$ε_{min}$	Minimum exploration rate	0.001
$ε_{d}$	Exploration rate decay	0.999
$D$	Buffer size	10,000
$E$	Episodes	1000
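To make the role of these hyperparameters concrete, the sketch below collects the Table 2 values in a small configuration object and shows how they would typically enter an ε-greedy schedule and a Double DQN target. It is an illustration under assumptions: the names `DDQNConfig`, `epsilon_after`, and `double_dqn_target` are hypothetical, the multiplicative per-update decay is assumed (the table does not state whether decay is applied per step or per episode), and the target follows the standard Double DQN rule, in which the online network selects the next action and the target network evaluates it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DDQNConfig:
    """Hyperparameters copied from Table 2; the field names are illustrative."""
    batch_size: int = 32
    gamma: float = 0.95        # discount factor
    eps_max: float = 1.0       # maximum exploration rate
    eps_min: float = 0.001     # minimum exploration rate
    eps_decay: float = 0.999   # exploration rate decay
    buffer_size: int = 10_000
    episodes: int = 1000


def epsilon_after(n_updates, cfg=DDQNConfig()):
    """Exploration rate after n multiplicative decay updates, floored at eps_min.

    Assumption: epsilon is multiplied by eps_decay once per update.
    """
    return max(cfg.eps_min, cfg.eps_max * cfg.eps_decay ** n_updates)


def double_dqn_target(reward, q_online_next, q_target_next, done, cfg=DDQNConfig()):
    """Standard Double DQN target for a single transition.

    q_online_next / q_target_next: Q-values of the next state under the
    online and target networks, one entry per action.
    """
    if done:
        return reward
    # The online network picks the action; the target network scores it.
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + cfg.gamma * q_target_next[best_action]


if __name__ == "__main__":
    for n in (0, 1000, 10_000):
        print(n, epsilon_after(n))
    print(double_dqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], done=False))
```

Under the assumed per-update decay, the exploration rate reaches its floor of 0.001 after roughly 6900 updates, so if the decay were instead applied once per episode, the 1000 episodes listed in the table would end well above the floor.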

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yang, G.; Huang, H.; Sun, C.; Wu, Y.; Xu, X.; Huang, S. A Deployment Strategy for Reconfigurable Intelligent Surfaces with Joint Phase and Position Optimization. Electronics 2026, 15, 718. https://doi.org/10.3390/electronics15030718

