BICGSTAB-FFT Method of Moments with NURBS for Analysis of Planar Generic Layouts Embedded in Large Multilayer Structures

BICGSTAB-FFT method of moment (MM) scheme is proposed to analyze several levels of planar generic layouts embedded in large multilayer structures when the layout geometries are modeled by NURBS surfaces. In this scheme, efficient computation of normalized error defined in iterative bi-conjugate gradient stabilized (BICGSTAB) method for large multilayer structure analysis problems is implemented. The efficient computation is based on pulse expansion with dense equi-spaced mesh of generalized rooftop basis functions (BFs) defined on NURBS surfaces and equivalent periodic problem (EPP) in order to apply fast Fourier transforms (FFT). Moreover, efficient computation of Green’s functions for multilayer structure is implemented for near and far field regions. Experimental and numerical validations of whole printed reflect array antennas of electrical size between 8 and 16 times the vacuum wavelengths are shown. In these validations, CPU time consumptions of the proposed method are obtained with results between few minutes and half an hour using a conventional laptop.


Introduction
When whole large electromagnetic devices-such as reflectarrays/transmitarrays [1], leaky wave antennas [2], and metasurface antennas [3]-are analyzed, efficient electromagnetic analysis tools of large multilayer structures which host planar generic layouts are required. Known numerical tools-such as finite element (FE) [4], finite difference time domain (FDTD) [5], and MM [6]-are suitable to carry out analysis of these multilayer structures. FE and FDTD methods require volumetric meshes to model the multilayer medium which hosts the layouts. Therefore, a high number of numerical computations on the volumetric mesh slows down the analysis. In order to avoid volumetric mesh Green's functions of multilayer medium is used in MM. In this paper, an efficient iterative version of MM formulation for the full-wave analysis of large multilayer structures, which hosts several levels of planar generic layouts, is shown.
In MM an electric field integral equation (EFIEs) is numerically solved to obtain the current densities induced on the conducting layouts. However, the hypersingular dyadic Green's functions in the EFIE approach difficult the obtaining the induced current densities [7]. To avoid these hypersingular Green's functions, vector and scalar potential Green's functions with weakly singular behavior are more suitable. This last approach is known as mixed potential integral equations (MPIEs) [8]. In MPIEs, one has to face with the computation of multilayer Green's functions consisting of Sommerfeld integrals (SI). Many different techniques have been proposed for speeding up the evaluation of SI.
One technique uses the so-called discrete complex image method (DCIM) to obtain approximations of Green's functions in terms of spherical waves [9][10][11]. This approximation is accurate when the source and observation point are close (near field region) but the approximation leads to inaccurate results when the source and observation points are far (far field region). Therefore, in far field region is preferable a second technique so called rational function fitting method (RFFM) to approximate the Green's functions in terms of cylindrical waves [12][13][14][15]. There are recent improved versions of the two techniques. In [16], the Green's functions of the multilayer substrate are judiciously interpolated in the spatial domain in terms Chebyshev polynomials after extracting the spherical wave behavior of the Green's functions around the source points (which includes the spherical wave behavior produced by the images of the source through the closest layers). In [17], very accurate closed-form asymptotic expressions of the Green's functions were proposed for the far field region. In our paper, the two accurate improved techniques are used for computations of Green's functions in both near and far field regions.
Once the kernels (Green's functions) of MPIEs for multilayer medium are available, an expansion of the unknown current densities is carried out in terms on known BFs weighted by unknown coefficients. When this expansion and method of weighted residual is applied, a linear system of equations is obtained. In this linear system of equations, the elements of the coefficient matrix are obtained in terms of continuous convolutions between the Green's functions and the BFs.
In order to provide a high-order description of the geometry of layout for complex geometries of the layout, NURBS surfaces are used [18]. These NURBS surfaces are efficiently written in terms of piecewise Bézier patches [19,20] using Cox-de Boor transformation algorithm [21]. These Béizer patches are used as domain where the BFs are defined. Therefore, once accurate description of this domain is available, generalized subsectional rooftop BFs are defined on pair of adjacent Bézier patches as it is shown in [18].
Although an efficient and accurate computation of Green's functions in MPIEs approach, and NURBS modeling of geometry of the layout improve the MM, CPU time consumption involved in the direct computation of MM matrix elements is proportional to the size of the matrix. This fact provides computational complexity which is roughly O (N b 2 ) with respect to the number N b of BFs (i.e., number of unknowns) [22,23]. This fact leads prohibitive CPU time consumption when the number of BFs is large, for example when large structures are analyzed. Iterative methods based on MM are more suitable to analyse large structures. Conjugate gradient fast Fourier transform (CG-FFT) iterative method was proposed in [24,25]. In this approach, conventional rooftops are used as BFs defined on a regular equi-spaced mesh. However, in this approach, the number of unknowns increases as the density of regular mesh increases (this increment of density is required as for example in analysis of layouts with thin strips). In [26,27], the analysis of arbitrary geometries is carried out by extending multilevel fast multipole algorithm (MLFMA) [28] or adaptive integral method (AIM) [29] to a multilayer structure. However, single plane of arbitrary metallization geometry is only shown due to the difficulty of apply MLFMA using Green functions of inhomogeneous media. In [30,31], Rao-Wilton-Glisson (RWG) BFs [32] are expanded in terms of pulses for analysis of multilayered structures using conventional pre-corrected FFT algorithm [33]. However, this approach provides accurate results when the source and the observation points are far enough.
In this work, NURBS surfaces are used to take into account a high-order description of the geometry of planar layouts embedded in a multilayer medium. The surface current densities are approximated by generalized rooftop defined over pair of adjacent Bézier patches. In [34], NURBS surfaces and generalized rooftop BFs are also used. In order to speed up the computations of usual continuous convolution between Green's functions and BFs, in [34] pulse expansion of the BFs and EPP approach were proposed. These approaches lead discrete cyclic convolutions instead of the continuous convolutions. Therefore, fast computation of discrete cyclic convolutions can be carried out by FFT procedure. However, these approaches were proposed in [34] using NURBS surfaces to speed up the computation of each row of MM matrix elements for the multilayer periodic structures analysis problems (electrical sizes of the periodic cells less λ, being λ, the vacuum wavelength). In these approaches, discrete cyclic convolutions between Green's functions and each BF are required in the computation of each element of the MM matrix in fast way. In [34], the required number of FFTs is proportional to N b for the computation of the discrete cyclic convolutions. This fact slows down the computations and increments the required memory resources. In this work, we rescue these approaches but, unlike [34], we apply these approaches in iterative MM context to face electromagnetic analysis of whole large planar multilayer non-periodic structures (electrical sizes of tens of λ). In this context, discrete cyclic convolutions between total surface current and charge densities and Green's functions will be involved instead of discrete cyclic convolutions between Green's functions and each BF. Therefore, the required number of FFTs will be significantly reduced leading to faster computations and reduced memory resources. This fact will lead electromagnetic analysis of whole large planar multilayer non-periodic structures (electrical sizes of tens of λ).
The paper is organized as follows. Section 2 will shows the iterative method with NURBS modeling of surfaces for the analysis problem of large multilayered structure with several interfaces with planar generic layout. Normalized errors are defined to solve the resultant system of linear equations by an iterative scheme. In this section, an efficient computation of the normalized error using the pulse expansion and EPP approaches is shown. Section 3 shows results and experimental validations of three analysis of whole reflectarray antennas as an example of large multilayer structures. Finally, conclusions are shown in Section 4. Figure 1 shows a side view of a multilayer medium which hosts Q interfaces with conducting planar generic layouts with negligible thickness. The conducting planar generic layouts are assumed to be PEC. The multilayer medium consists of NC dielectric layers of complex permittivity ε p = ε 0 ε r,p (1 − jtanδ p ) and thickness d p (p = 1, . . . , NC). In the results presented in this paper, the lower limit of the multilayer structure is a ground plane, but free space can be considered. The Q-interfaces host layout with arbitrary planar geometry. The multilayer structure of Figure 1 is illuminated by arbitrary electric field distribution. This incident electric field distribution induces surface current densities J l (x ,y ) on each metallization surface S l hosted in the lth-interface (l = 1, . . . , Q) with planar layouts. According with Figure 1 the surface current densities J l (x ,y ) have 'x' and 'y' components (i.e., z-component is not considered). Time dependence of the type e jωt is assumed and this dependence will be suppressed throughout. The induced surface current densities produce an electric field scattered by the multilayer structure shown in Figure 1. In order to know these induced surface current densities J l (x ,y ) (l = 1, . . . , Q) the following system of MPIEs has to be solved [8]

Description of the Problem
where E t exc (x,y,z = −h Nk ) is the tangential electric field in the observation point (x,y,z = −h Nk ) on the multilayer medium (without layouts). σ l is the induced surface charge density on the surface S l of the layout hosted in the lth-interface. The induced surface charge densities σ l and surface current densities J l are related each other by means of the known continuity equation. G A is the dyadic Green's function for the vector potential and G Φ is the Green's function for the scalar potential respectively of the multilayer medium of the Figure 1. In this work, formulation C of [35] is assumed. In this formulation, the Green's functions for 'x' and 'y' components are equal. In this way, since 'x' and 'y' components of the surface current densities J l (x ,y ) are only considered, the dyadic Green's function can be substituted by the Green's function G A xx for the the x-component of the vector potential.
We would like to point out that, in this work, the Green's function of the multilayer substrate for the x-component of the vector potential G A xx and for the scalar potential G Φ are efficiently obtained in far field region (i.e., when k 0 ρ > 10 with ρ = [(x − x ) 2 + (y − y ) 2 ] 1/2 and k 0 = 2π/λ being λ the vacuum wavelength) by quasi-analytic formulation described in [17]. When the observation and source points are close (i.e., when k 0 ρ < 10) regularization and judiciously interpolation of the Green's function is carried out in similar way to that shown in [16]. In this procedure, the Green's functions of the multilayer substrate are judiciously regularized (extracting the singular and quasi-singular behaviors of the Green's functions around the source points is carried out) and the resultant regularized functions are interpolated in the spatial domain in terms Chebyshev polynomials. In order to solve the MPIE shown in (1), J l (x ,y ) and σ l (x ,y ) are expanded in terms of known BFs J l,j (x,y) (j = 1, . . . , N b,l ) where N b,l is the number of the BFs in lth-interface with layouts The coefficients c l,j (j = 1, . . . , N b,l ; l = 1, . . . , Q) shown in Equation (2) are unknowns to be determined. In this paper, the surfaces of planar generic layout are modeled by NURBS surfaces. The NURBS are efficiently written in terms of piecewise Bézier patches as it is described in [18]. Subsectional BFs 'generalized rooftop' functions are used as BFs J l,j (x,y) (j = 1, . . . , N b,l ; l = 1, . . . , Q). These subsectional BFs (see Figure 2) are defined on two adjacent Bézier patches. Therefore, these BFs are capability to take into account a high-order description of the geometry of the layout. Method of weighted residual is carried out to solve the system of MPIEs given in (1) using 'razor-blade' as weighting function (WF). The 'razor-blade' functions are WFs with unity value defined over the isoparametric curved lines C k,i (i = 1, . . . , N b,k , k = 1, . . . , Q) in the kth-interface with layouts. Each curved line join the centers of the pair of adjacent Bézier patches which discretizes the planar layout surface hosted in kth-interface with metallizations (see Figure 2). In this way, when (2) is introduced in (1) and 'razor-blade' functions are used as WFs, the resultant system of equations for the unknown coefficients c l,j (j = 1, . . . , N b,l ; l = 1, . . . , Q) is obtained where the coefficients V k,i exc of the system of linear equations can be computed by the next line integral The unknowns c l,j (j = 1, . . . , N b,l ; l = 1, . . . , Q) are explicitly shown in (3) through the dependence of V k,i sc . This dependence is linear since linear dependence of the surface current densities with respect to the unknown coefficients c l, The inductive contribution V k,i ind is given in terms of the line integrals along the way C k,i of the bi-dimensional convolutions between dyadic Green's function G A of the vector potential and the expansion of the surface density current in terms of BFs given in (2) as it is shown in (5). Note that in (5) contributions of all interfaces with metallizations are taken into account (i.e., summation from l = 1 to l = Q appears in (5)). The bi-dimensional convolutions involve the integration of singular behavior of the integrand introduced by the dyadic Green's function when the observation point (x,y,z = −h Nk ) and source point (x ,y ,z = −h Nl ) are close.
On the other hand, the capacitive contribution, V k,i cap , is given by the summation of line integrals of the gradient of the bi-dimensional convolution between Green's function G Φ for scalar potential and the expansion surface density charge in terms of BFs given in (2) for all interfaces with metallizations as it is shown in (6). Since the capacitive contribution involves line integrals whose integrands are exact differentials, these line integrals can be analytically expressed as differences of bi-dimensional convolutions as it is shown in (6). In (6), the observation points ( are the centers of the pair of adjacent Bézier patches associated to ith-razor blade WF (i.e., the extremes points of the line C k,i in Figure 2).
Again the bi-dimensional convolutions involve the integration of singular behavior of the integrand introduced by the Green's function when the observation point (x,y,z = −h Nk ) and source point (x ,y ,z = −h Nl ) are close.
The system of linear equations given in (3) can be addressed by computing the coefficient matrix (i.e., MM matrix) of the system of linear equations and solving the unknown coefficients c l,j (j = 1, . . . , N b,l ; l = 1, . . . , Q) by direct inversion of the MM matrix. However, this way shows several drawbacks when the number of unknowns, N b , is very large. First, the CPU time consumption for computing the coefficient matrix of the system of equations is proportional to the size of the coefficient matrix of the system of equations [22,23] (i.e., the computational complexity of CPU time consumption as a function of the number of unknown N b is roughly O(N b 2 )). Therefore, the CPU time consumption can be prohibitive for large number of unknowns, N b . Second, this approach requires save all the terms of MM matrix in computer memory. Third, once the MM matrix is available, the computation of the inverse matrix of large size can show ill conditioned problems [36][37][38]. This paper is focused on the analysis of large multilayer structures which hosts resonant layouts with respect to the working wavelength as, for example the analysis of whole printed reflectarray antennas. Therefore, tens of thousands of unknowns, N b , are expected. Iterative methods to solve the system of linear equations given in (3) for large values of unknowns, N b , is suitable to face the problem. In this way, a normalized error is defined as where, Note that the normalized error, ξ, shown in (7) is zero if V exc = V sc , and ξ = 1 if V sc = 0. In order to reduce the normalized error, an iterative method is applied. The iterative method computes iteratively the coefficients c l,j (j = 1, . . . , N b,l ; l = 1, . . . , Q) which reduce the normalized error, ξ, until threshold, ξ th , is reached. In this work, BICGSTAB method [39] is implemented as the iterative method. The BICGSTAB approach requires several computations of V sc in each iteration. In this way, several computations of inductive, V k,i ind , and capacitive, V k,i cap , contributions with (i = 1, . . . , N b,k ; k = 1, . . . , Q) are required in each iteration. The direct computation of the inductive and capacitive contributions by (5) and (6) show several drawbacks. First, when the observation point is changed (for example, samples used in the quadrature rule for the computation of the line integrals along C k,i in (5) and the extremes points of the line C k,i in (6)) the computations of bi-dimensional convolutions between Green's functions and surface current or charge densities have to be repeated. Second, the bi-dimensional convolutions involved in (5) and (6) require the integrations of singular behavior of the integrands introduced by the Green's functions when the observation point (x,y,z = −h Nk ) and source point (x ,y ,z = −h Nl ) are close. These integrations require specialized quadrature rules [40,41] and/or sophisticated mathematic handling [16,42] to be efficiently computed. Therefore, an efficient computation procedure of the inductive and capacitive contributions is required.
In [34], the authors show an efficient computation procedure of the inductive and capacitive contributions based on pulse expansion and EPP which involve FFT algorithm applied on discrete functions for periodic multilayer analysis problems. In this work, we implement a similar procedure applied to non-periodic structures. In our case, the following discrete functions of (2N x + 2) × (2N y + 2) elements for the surface current and charge densities hosted in the lth-interface with metallizations are proposed as where 0 < m < N x , 0 < n < N y , x m = x 0 + m ∆x, y n = y 0 + n ∆y and ∆x = 2L x /(2N x + 1), ∆y = 2L y /(2N y + 1) being L x and L y are the maximum lengths of the whole multilayer structure in x-and y-directions.
The points (x m , y n ) are equi-spaced mesh points of a very dense mesh (i.e., the values of N x and N y are high values as the values of ∆x and ∆y are low values). Zero padding of the discrete functions is carried out when m > N x , n > N y . The following proposed discrete functions of (2N x + 2) × (2N y + 2) elements which involve multilayer Green's functions are where 0 < m − m < N x , 0 < n − n < N y . In the remaining domain of the discrete variables m − m and n − n symmetric reproduction is carried out as In a similar way to this shown in [34], the EPP and pulse expansion of BFs lead discrete cyclic convolutions between the discrete functions defined in Equations (9)-(12) instead of continuous convolutions between Green's functions and BFs which appear in (5) and (6). These discrete cyclic convolutions can be computed in an efficient way by FFT procedure as it is shown in [34]. This approach provides an efficient technique to compute the continuous convolutions which appear in Equations (5) and (6) when the observation points are the equi-spaced mesh points. Since the equi-spaced mesh are a very dense mesh, the continuous convolutions shown in Equations (5) and (6) can be evaluated in any observation points of the (x,y) of the kth-interface of multilayer structure under study by conventional bilinear interpolation [43] from the values of the elements of the resultant discrete cyclic convolutions.

Numerical Results
In this section, we will show results of four analysis of whole printed reflectarray antennas: (1) shaped beam reflectarray made of three coplanar parallel dipoles; (2) dual polarized broadband focused reflectarray made of two orthogonal sets of three parallel dipoles; (3) circular polarized focused beam reflectarray made of rotated split rings; and (4) focused beam reflectarray made of two sets of four parallel dipoles with small rotations. Figure 3 shows the shaped beam reflectarray which is considered in this work. This reflectarray was designed in [44] to provide sectored-cosecant squared beam at 10.4 GHz. The reflectarray is made of elements based on three coplanar parallel resonant dipoles. The reflectarray is circular and consists of 489 elements arranged in a 25 × 25 grid with cell size 16.5 mm × 16.5 mm (i.e., the electrical size of the reflectarray is L x × L y = 14.3λ × 14.3λ at 10.4 GHz). The elements are printed on an Arlon (ε r,2 = 3.38, tanδ 2 = 0.005) layer 0.508 mm thick placed on top of Rohacell (ε r,1 = 1.12, tanδ 1 = 0.002) layer 3 mm thick. Therefore, according with Figure 1, NC = 2 and Q = 1. The phase center of the feed is assumed to be located at the point of coordinates (x F , y F , z F ) = (−175, 0, 390) (mm) with respect to the center of the reflectarray (see Figure 3). The feed points to the center of the reflectarray. The radiation pattern of the feed is modeled as a function cos 10 (θ) which provides an illumination level −11 dB below the maximum at the edges of the reflectarray. NURBS model of the planar geometry of the elements has been made. These NURBS surfaces are efficiently written in terms of 11,736 Bézier patches and 10,269 generalized rooftops are used in the approximation of the surface current densities induced in the elements of the reflectarray (i.e., the number of unknowns is N b = 10,269).  Figure 4 shows equi-spaced mesh points x = x m , y = y n , z = −h 2 0 < m < N x , 0 < n < N y which are inside of the reflectarray surface L x × L y = 412.5 mm × 412.5 mm. These points are used to evaluate the discrete functions defined in Equations (9)- (13). Values of N x = N y = 2047 are considered in Figure 4. The blue points stand for the outer equi-spaced mesh points to layout while the red points stand for the inner equi-spaced mesh points to layout.   Once continuous convolutions between Green's functions and BFs which appear in (5) and (6) are computed, the inductive, V 1,i ind , and capacitive, V 1,i cap , (i = 1, . . . , N b,1 ) contributions can be available for the BICGSTAB iterative method. In this way BICGSTAB iterative method is implemented to computes iteratively the coefficients c 1,j (j = 1, . . . , N b,1 ) which reduce the normalized error ξ given in (7) and (8) until threshold ξ th = 0.01 is reached. Figure 6 shows the normalized error ξ with respect to the index of iterations to reach the threshold ξ th = 0.01. Results are shown for two equi-spaced mesh: mesh with (2N x + 2) × (2N y + 2) = 4096 × 4096 and mesh with (2N x + 2) × (2N y + 2) = 2048 × 2048. Note that less number of iterations is required to reach the threshold ξ th = 0.01 when (2N x + 2) × (2N y + 2) = 4096 × 4096 than when (2N x + 2) × (2N y + 2) = 2048 × 2048. Since increasing of the density of the equi-spaced mesh produces a more accurate pulse approximation, these results are expected. We would like to point out that a maximum of 2.219 GB RAM memory and 28 s are required when the normalized error ξ has to be evaluated with (2N x + 2) × (2N y + 2) = 4096 × 4096 while that, a maximum of 570 MB RAM memory and 8 s are required when the normalized error ξ has to be evaluated with (2N x + 2) × (2N y + 2) = 2048 × 2048. Therefore, total CPU consumption of 364 s are required for all iterations when (2N x + 2) × (2N y + 2) = 4096 × 4096 while that, 136 s are required when (2N x + 2) × (2N y + 2) = 2048 × 2048. These results are obtained in a laptop computer with processor Intel Core i7-6700HQ, 2.6 GHz of clock frequency with 32 GB of RAM.  Figure 7 shows numerical radiation patterns in elevation plane and azimuth plane at 10.4 GHz obtained by the proposed method for 4096 × 4096 and 2048 × 2048 equi-spaced mesh. These results are compared with results obtained by local periodicity approach and measurements shown in [44]. We can see that agreements between the results obtained with the proposed method and measurements are into the tolerance errors of the layout building and the material dielectric dispersion.

Dual Polarized Broadband Focused Beam Reflectarray Made of Two Orthogonal Sets of Three Parallel Dipoles
In order to show the analysis capability of the proposed method for multilayer structure with several interfaces which host metallizations, dual polarized broadband focused beam reflectarray designed at 11.95 GHz in [45] has been analyzed by the proposed method. This reflectarray is made of elements based on two stacked orthogonal sets of three parallel dipoles (see Figure 1 in [45] for details). The reflectarray was designed in [45] to generate a pencil beam pointing at the values of the angular spherical coordinates θ beam = 16.9 • ϕ beam = 0 • with respect to the system coordinate system located at the center of reflectarray (see Figure 3). The reflectarray is circular and consists of 861 elements arranged in a 33 × 33 grid with cell size 12 mm × 12 mm (i.e., the electrical size of the reflectarray is L x × L y = 15.7λ × 15.7λ at 11.95 GHz in this case). The elements are printed on a thin Diclad 880 (ε r,2 = 2.17, tanδ 2 = 0.0009) layer of 0.127 mm thick. A Diclad 880 layer (ε r,1 = ε r,2 , tanδ 1 = tanδ 2 ) 3.175 mm thick is used as separator with the ground plane. Therefore, according with Figure 1, NC = 2, Q = 2, N 2 = 2, and N 1 = 1. The phase center of the feed is assumed to be located at the point of coordinates (x F , y F , z F ) = (−193, 0, 635) (mm) with respect to the center of the reflectarray (see Figure 3). The feed points to the center of the reflectarray. The radiation pattern of the feed is modeled as a function cos 26 (θ) which provides an illumination level −11 dB below the maximum at the edges of the reflectarray. In this case, 36,162 generalized rooftop are used in the approximation of the surface density currents induced in the elements of the reflectarray (i.e., the number of unknowns is N b = 11,222). Figure 8 shows numerical radiation patterns in elevation plane for X and Y polarization at 11.95 GHz obtained by the proposed method with (2N x + 2) × (2N y + 2) = 4096 × 4096. These results are compared with results obtained by local periodicity approach and measurements shown in [45]. We can see that agreements between the results obtained with the proposed method and measurements are slightly better than the agreements between the results obtained by local periodicity approach and measurements.   . Radiation patterns in elevation plane for X-polarization (a) and Y-polarization (b) for a focused beam reflectarray antenna based on two stacked orthogonal sets of three parallel dipoles at 11.95 GHz designed in [45]. The numerical results obtained with the proposed method for 4096 × 4096 equi-spaced mesh are compared with a local periodicity approach and measurements shown in [45].
The results obtained by the proposed method have required 22 iterations. This number of iterations is higher than the number of iterations required in previous subsection. This is expected since the number of unknowns, 36,162, of the linear system of equations is larger than the number of unknowns used in the previous subsection. In this case, a maximum of 6.496 GB RAM memory and 87 s are required when the normalized error ξ has to be evaluated (total CPU time consumption of 1914 s). These results are obtained in a laptop computer with processor Intel Core i7-6700HQ, 2.6 GHz of clock frequency with 32 GB of RAM.

Circular Polarized Focused Beam Reflectarray Made of Rotated Split Rings
In order to show the analysis capability of the proposed method for curved layout printed in multilayer structures, a circular polarized focused beam reflectarray design has been carried out at 12 GHz to generate at pencil beam pointing at the values of the angular spherical coordinates θ beam = 19 • ϕ beam = 0 • with respect to the system coordinate system located at the center of reflectarray (see Figure 9a). The reflectarray is circular and consists of 489 elements arranged in a 25 × 25 grid with cell size 8 × 8 mm 2 (i.e., the electrical size of the reflectarray is L x × L y = 8λ × 8λ at 12 GHz in this case). The reflectarray is made of elements based on split-ring. The split rings are printed on single layer Roger Duroid 5880 (ε r,1 = 2.2, tanδ 1 = 0.0009) 2.311 mm thick. Therefore, according with Figure 1, NC = 1 and Q = 1. The width, inner, and outer radius of split rings have been fixed at 0.5, 2.75, and 3.25 mm respectively. The reflectarray design has been carried out under local periodicity assumption and variable rotation technique (VRT) [46]. The VRT ensures that the phases of copolar reflection coefficients for circular polarization are twice of rotation angles when the phases of copolar reflection coefficients for linear polarization are shifted 180 • . In this way, the design is carried out in three steps: first, the phase of the reflection coefficients for right handed circular polarization (RHCP) is computed in each element of the reflectarray to focus the beam at 12 GHz in the desired direction by Equation (3.3) of [1]. Second, the rotation angle of each element is computed as the half of computed phase. Third, the arc lengths of each split ring of each rotated element is adjusted to satisfy that 180 • of difference of phase is obtained between the copolar reflection coefficients for linear polarizations at 12 GHz. In the design procedure, an efficient numerical tool of analysis of multilayer periodic structure based on NURBS surfaces described in [42] has been used in the local periodicity approach. The whole designed reflectarray has been analyzed at 12 GHz by the proposed method using 11,222 generalized rooftops in the approximation of the surface density currents induced in the elements of the reflectarray (i.e., the number of unknowns is N b = 36,162). In this analysis the radiation pattern of the feed is modeled as a function cos 10 (θ) for RHCP which provides an illumination level −11 dB below the maximum at the edges of the reflectarray. The feed points to the center of the reflectarray and its center phase is assumed to be located at the point of coordinates (x F , y F , z F ) = (−63.9, 0, 185.5) (mm) with respect to the center of the reflectarray (see Figure 9a). Figure 9b shows numerical radiation patterns in elevation plane for RHCP at 12 GHz obtained by the proposed method with (2N x + 2) × (2N y + 2) = 4096 × 4096 and by local periodicity approach. We can see that there are acceptable agreements between both sets of numerical results. We would like to point out that only 7 iterations and roughly 4.5 min of total CPU time consumption have been required in the whole reflectarray analysis by the proposed method. These results are obtained in a laptop computer with processor Intel Core i7-6700HQ, 2.6 GHz of clock frequency with 32 GB of RAM.

Focused Beam Reflectarray Made of Two Sets of Four Parallel Dipoles with Small Rotations
In order to show validation with commercial software, the focused beam reflectarray designed at 11.95 GHz in [47] has been analyzed by the proposed method. This reflectarray is made of elements based on two orthogonal sets of four parallel dipoles displaced each other a half of period (see Figure 2 in [47] for details). Each set of parallel dipoles is rotated a small angle with respect to the central dipole of the set in order to reduce cross-polar radiation. The reflectarray was designed in [47] to generate a pencil beam pointing at the values of the angular spherical coordinates θ beam = 16.9 • ϕ beam = 0 • with respect to the system coordinate system located at the center of reflectarray (see Figure 10a). The reflectarray is square and consists of in a 20 × 20 grid with cell size 12 mm × 12 mm (i.e., the electrical size of the reflectarray is L x × L y = 9.5λ × 9.5λ at 11.95 GHz in this case). The elements are printed on a Diclad 880 (ε r,2 = 2.17, tanδ 2 = 0.0009) layer of 1.5 mm thick. An Arlon AD255C layer (ε r,1 = 2.55, tanδ 1 = 0.0014) 2.363 mm thick is used as separator with the ground plane. Therefore, according with Figure 1, NC = 2, Q = 2, N 2 = 2 and N 1 = 1. The phase center of the feed is assumed to be located at the point of coordinates (x F , y F , z F ) = (−85.3, 0, 281) (mm) with respect to the center of the reflectarray (see Figure 10a). The feed points to the center of the reflectarray. The radiation pattern of the feed is modeled as a function cos 10 (θ) which provides an illumination level −11 dB below the maximum at the edges of the reflectarray. In this case, 23,548 generalized rooftop are used in the approximation of the surface density currents induced in the elements of the reflectarray (i.e., the number of unknowns is N b = 23,548). Figure 10b shows numerical co-polar and cross-polar radiation patterns in azimuth plane for X and at 11.95 GHz obtained by the proposed method with (2N x + 2) × (2N y + 2) = 4096 × 4096. These results are compared with results obtained by full-wave simulation of the whole antenna with CST shown in Figure 5b of [47]. According with [47], a real feed horn was also modeled using CST instead of cos 10 (θ) function. We can see acceptable agreements between the co-polar radiation patterns obtained with the proposed method and results provided by CST in [47]. However, the cross-polar radiation levels obtained by CST is higher than the cross-polar radiation obtained by the proposed method. This is expected since the full-wave model of the feed horn was carried out by CST. Unlike cos 10 (θ) model, the CST model of feed horn takes into account cross-polar components provided by its own feed horn. The results obtained by the proposed method have 42 required iterations. In this case, a maximum of 6.511 GB RAM memory and 90 s are required when the normalized error ξ has to be evaluated. CPU time consumption of 63 min has been required for the iterative process to reach the threshold ξ th = 0.01. These results are obtained in a laptop computer with processor Intel Core i7-6700HQ, 2.6 GHz GHz of clock frequency, with 32 GB of RAM. According with [47], the analysis of the whole antenna provided by CST took 8 h in a computer with two Intel Xeon Processors (six cores per processor), 2.1 GHz of clock frequency, and 128 GB of RAM. Note that the CPU time consumption by the proposed method is roughly eight times faster than the CPU time consumption taken by CST using a laptop with lower performances than the computer used in [47]. (b) Co-polar and cross-polar radiation patterns in azimuth plane for X-polarization at 11.95 GHz. Numerical results obtained with the proposed method for 4096 × 4096 equi-spaced mesh and CST results given in [47] are shown.

Conclusions
In this work, NURBS surfaces, pulse expansion, and EPP approaches have been implemented for the efficient computation of normalized error defined in iterative BICGSTAB-FFT-MM scheme for whole large multilayer structure analysis problems with several levels with planar metallizations. Moreover, in this scheme, efficient computation of Green's functions for multilayer structure has been also implemented for near and far field region.
As validation of the proposed analysis technique, two types of printed reflectarray antennas described in the literature have been analyzed: single layer shaped beam reflectarray made of cell based on three coplanar parallel dipoles (electrical size of 14.3λ × 14.3λ) and a two-stacked-layer broadband dual polarized reflectarray made of cells based on two orthogonal sets of three parallel dipoles (electrical size of 15.7λ × 15.7λ). The numerical results obtained with the proposed method have been compared with experimental and numerical results provided by local periodicity approach in the literature. In both comparisons, agreements between the results obtained with the proposed method and measurements are acceptable considering the error margin in the antenna layout building and the dispersion of the dielectric constant values. The total CPU time consumption provided by the proposed method in the analysis is from a few minutes (for reflectarray analysis of electrical size of 14.3λ × 14.3λ and 10,629 unknowns) to half an hour (for reflectarray analysis of electrical size of 15.7λ × 15.7λ and 36,162 unknowns) in a laptop computer with processor Intel Core i7-6700HQ, 2.6 GHz of clock frequency with 32 GB of RAM. In order to show the capability of the proposed method for analysis of curved layout printed in multilayer structures, a circular polarized focused beam reflectarray design has been carried out using rotated split rings as reflectarray element (electrical size of 8λ × 8λ and 11,222 unknowns). The reflectarray has been designed and analyzed by the local periodicity approach. The results provided by this last approach have been compared with the results the proposed method. An acceptable agreement has been obtained between both numerical results. In the whole reflectarray full-wave analysis by the proposed method, only 4.5 min of CPU time was required.
Finally, focused beam reflectarray made of two sets of four parallel dipoles with small rotations has been analyzed (electrical size of 9.5λ × 9.5λ and 23,548 unknowns). The results provided by the proposed method have been compared with CST results provided in the literature. Acceptable agreements have been obtained. The CPU time consumption taken by the proposed method is roughly eight times faster than the CPU time consumption taken by CST using a laptop with lower performances than the computer used in the literature to analyze the whole antenna with CST software.