1. Introduction
A digital pulse width modulator (DPWM), which drives digital power amplifiers, is a key building block in many electronic applications, such as DC-DC converters, permanent magnet synchronous motor (PMSM) controllers, and Class D audio amplifiers. In PMSM controller applications, space vector modulation (SVM) is commonly used due to its higher DC voltage utilization ratio compared to a sine PWM (SPWM). As shown in Figure 1, generating SVM requires a center-aligned PWM (CA-PWM), as it reduces the current harmonics and ensures a fixed current sampling window across three phases [1].
Advances in wide-bandgap semiconductor devices, such as SiC and GaN, have enabled high-efficiency, high-frequency power conversion in more compact form factors [2]. Increasing the switching frequency requires a higher timing resolution for a DPWM to minimize oscillations caused by an insufficient voltage output resolution [3,4]. A high-resolution PWM (HRPWM) has been proposed to address this issue. Modern MCUs designed for power electronics are equipped with HRPWM modules [5,6], which offer a resolution of 50–300 ps. This is sufficient for software-based controllers. For applications that require lower controller latency or hardware acceleration for complex algorithms, FPGA-based digital controllers can be used. However, FPGAs lack a dedicated HRPWM module. Therefore, implementing an HRPWM on FPGAs is crucial.
Several FPGA-based HRPWM designs have been proposed [7,8,9,10,11,12,13,14,15,16,17,18,19,20], each with different trade-offs, as summarized in Table 1. The tapped delay chain [7,8,9] offers high resolution but consumes significant resources, requires calibration, and operates asynchronously. The Vernier method [10,11,12,13] achieves sub-ps resolution [12] and low resource usage but lacks output phase control and suffers from asynchronous timing due to latches in the output stage. Multiphase clocking [14,15,16] has a simple structure but uses extensive clocking resources (normally at least eight dedicated clocks). High-speed transceivers [17] provide high resolution, though they are unavailable on cost-optimized FPGAs and require external voltage level shifters for signal conversion. IODELAY [18,19,20] is a great substitute for delay chains constructed by cascaded modules. It offers resolution finer than 10 ps [19] with low resource consumption but has device-specific input constraints for a glitch-free output, limiting its general applicability. While an OSERDES [14] provides lower resolution (around 1 ns) compared to the other methods, it offers advantages in terms of low resource usage, synchronized timing, and no need for external circuitry, making it ideal for multichannel applications. Additionally, an OSERDES is designed for high-speed interfaces, with a similar architecture across different FPGA families, making it a more generalized solution for HRPWM implementation.
As previously discussed, the increasing switching frequency and the necessity for a CA-PWM in PMSM control applications drive the demand for a CA-HRPWM. While existing FPGA-based HRPWM designs emphasize an ultra-fine timing resolution, PMSM control applications typically require only sub-nanosecond resolution. Instead, optimizing multichannel configurations becomes more critical, favoring low resource usage, no extra circuitry or calibration requirements, and fully synchronized timing.
To address these problems, this paper presents a CA-HRPWM implementation based on the output serializer (OSERDES) in the I/O tile of FPGAs. The design is fully synchronized and requires no calibration or external circuitry. The overhead compared with a conventional timer-based PWM consists only of a single serial clock and the encoding logic of the OSERDES, which ensures minimal resource utilization and suitability for multichannel configurations. The key novelties and contributions of this study are summarized as follows:
- (1) A new FPGA-based CA-HRPWM module using an OSERDES is designed. The module is integrated with CA-PWM generation to improve on previous implementations based on an OSERDES that only support an edge-aligned PWM. Dead time generation, input value limitation, and fault trigger mechanisms are also implemented to make the module fully functional for practical applications.
- (2) A more generalized HRPWM solution is proposed to implement an HRPWM across various FPGA platforms. By leveraging the shared structure of OSERDES cells in many FPGA families, this approach reduces the device-specific constraints associated with IODELAY-based designs while maintaining high resolution and low resource utilization.
The paper is organized as follows. Section 2 presents the architecture and a detailed description of the components within the CA-HRPWM. The implementation of HRPWM hardware and its resource utilization statistics are discussed in Section 3. The experimental results, including the output pulse measurement and PMSM current control, are presented in Section 4. Finally, the conclusions are provided in Section 5.
2. Architecture of CA-HRPWM Using OSERDES
As shown in Figure 2, the proposed CA-HRPWM consists of four major components: the comparison value calculator, the up–down counter, the unary code encoder, and the OSERDES-based digital-to-time converter (DTC). The comparison value calculator restricts the input duty cycle command to prevent undefined behavior and generates the comparison value to drive the encoder. The up–down counter serves as the time base, defining the period and generating a triangular carrier for encoder operation. The unary code encoder, a key component of the design, determines the unary code of the actual PWM output based on the current counter value and comparison value. Finally, the OSERDES-based DTC converts the parallel unary code into a serial PWM output while integrating a fault trigger protection feature. Since all PWM channels require a common time base, the up–down counter is shared, while the other three components are dedicated to each pair of channels.
In this section, several bitwise operations are used to manipulate binary data. The logical left shift of a variable $x$ is denoted as $x \ll n$, where $n$ represents the number of positions the bits are shifted to the left, with zeros filling in the vacated positions. Similarly, the logical right shift is represented as $x \gg n$, where the bits are shifted to the right by $n$ positions, and zeros are inserted into the higher bits. A bit slice operation refers to extracting a range of bits from a variable, denoted as $x[j:i]$, which selects bits from position $j$ to position $i$. Lastly, the bitwise NOT operation, denoted as $\sim x$, inverts all the bits of the variable.
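The four operations can be illustrated with a short Python sketch. The 16-bit register width and the helper names are assumptions made only for this illustration; the hardware operates on fixed-width registers, so each result is masked to that width.

```python
WIDTH = 16                 # assumed register width, for illustration only
MASK = (1 << WIDTH) - 1

def shl(x, n):
    """Logical left shift: zeros enter at the low end, overflow is dropped."""
    return (x << n) & MASK

def shr(x, n):
    """Logical right shift: zeros enter at the high end."""
    return (x & MASK) >> n

def bit_slice(x, j, i):
    """Extract bits j down to i (inclusive, j >= i), right-aligned."""
    return (x >> i) & ((1 << (j - i + 1)) - 1)

def bit_not(x):
    """Bitwise NOT restricted to WIDTH bits."""
    return ~x & MASK
```

For example, `bit_slice(0b10110110, 2, 0)` returns the three lowest bits, `0b110`.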
2.1. Comparison Value Calculator
The comparison value calculator receives the input duty command from the source module and calculates the comparison values for the count-up and count-down processes, respectively. In conventional CA-PWM generation, the comparison values for both the count-up and count-down processes are identical. This results in half the resolution of an edge-aligned PWM under the same clock frequency, since a single-unit increase in the input comparison value leads to a two-unit variation in the duty cycle. To prevent this loss of resolution, the comparison value patterns are restructured from to in the proposed design.
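The resolution argument can be made concrete with a small sketch. It is not the mapping defined by the equations of this section: the split of the duty command into a floor half and a ceiling half is a hypothetical choice, used only to show why separate count-up and count-down comparison values restore single-unit steps.

```python
def conventional_width(cmp_value):
    """Same comparison value while counting up and down:
    the pulse width can only change in steps of two units."""
    return 2 * cmp_value

def split_compare(duty):
    """Hypothetical split of a duty command into count-up and
    count-down comparison values (floor half and ceiling half)."""
    cmp_up = duty // 2
    cmp_down = duty - cmp_up
    return cmp_up, cmp_down

def asymmetric_width(duty):
    """Pulse width when the two halves use their own comparison values."""
    cmp_up, cmp_down = split_compare(duty)
    return cmp_up + cmp_down
```

With identical comparison values the reachable widths are 0, 2, 4, and so on, whereas the split form reaches every integer width.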
The comparison values
and
are calculated as
Note that the counter decrements from
to one instead of zero, with each count corresponding to eight least significant bits (LSBs) in the comparison value. So, the
in Equation (
2) ensures proper OSERDES operation during the count-down process.
Considering dead time generation for active high complementary outputs, the rising edges of both outputs are delayed by
, while the falling edges stay the same. The delayed comparison values,
and
, can be derived from
To avoid undefined behavior caused by data corruption near zero or a full duty cycle, the input data range must be limited based on the dead time
and PWM period
. The acceptable range is specified by
The simplified lower limit
and upper limit
for
are as follows:
2.2. Up–Down Counter
The up–down counter value
continuously increments from zero to
, then decrements from
to one, forming a triangular carrier that serves as a time base, as illustrated in
Figure 3. To ensure proper operation,
must be a multiple of 16.
2.3. Unary Code Encoder
The unary code encoder takes the comparison values of all channels and splits them into two parts. The higher bits (most significant bits (MSBs) to three) are compared with the up–down counter value to determine when to activate the encoder. When the higher bits of the comparison value do not match the counter value, the encoder remains inactive, resulting in a constant output of either 0xFF or 0x00. This means the output stays entirely high or low during the counter clock period. When the higher bits of the comparison value match the counter value, the lower bits (two to zero) are encoded and fed into the OSERDES to fine-tune the output edge position.
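A minimal Python sketch of this coarse/fine split follows. It models only a count-up sweep, and the polarity (which level precedes the edge) is passed in as an assumption rather than taken from the output equations of this section; the function names are hypothetical.

```python
FINE_BITS = 3                    # bits two to zero place the edge inside one slot
SLOT_BITS = 1 << FINE_BITS       # 8 serialized bits per counter clock period

def split(cmp_value):
    """Return (higher bits compared with the counter, lower three bits)."""
    return cmp_value >> FINE_BITS, cmp_value & (SLOT_BITS - 1)

def unary(k):
    """8-bit unary (thermometer) code with the k lowest bits set."""
    return (1 << k) - 1

def encode_slot(counter, cmp_value, before=0xFF, after=0x00):
    """Word for one counter period during count-up.
    `before`/`after` (0xFF or 0x00) are the assumed levels on each side of the edge."""
    coarse, fine = split(cmp_value)
    if counter < coarse:
        return before            # encoder inactive, edge not reached yet
    if counter > coarse:
        return after             # encoder inactive, edge already passed
    word = unary(fine)           # encoder active: edge falls inside this slot
    return word if before == 0xFF else (~word & 0xFF)
```

For a comparison value of 43 (higher bits 5, lower bits 3), the words for counter values 4, 5, and 6 are `0xFF`, `0x07`, and `0x00`: the output stays high for three of the eight serialized bits in the matching slot.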
For a positive output, when the counter is counting up, the output value
is
Similarly, when the counter is counting down, the output value is
For a negative output, the output value is inverted, and the rising edge while counting down is delayed, giving the following output value,
, when counting up:
Similarly, when the counter is counting down, the output value is
2.4. OSERDES-Based DTC
The OSERDES module in an FPGA is a dedicated parallel-to-serial converter designed to enable high-speed, source-synchronous interfaces. Figure 4 presents a simplified view of the OSERDES module. Each OSERDES module contains a serializer that converts parallel data from a low-speed clock domain into serial data. The least significant bit (LSB) of the input parallel data is serialized first, followed by the remaining more significant bits. The serializers can be configured in single-data-rate (SDR) or double-data-rate (DDR) mode, supporting data serialization ratios of up to 8:1 in DDR mode. The OSERDES module requires a reset before operation, and the output data are driven low when the reset input is asserted.
Aside from its primary use in high-speed interfaces, the OSERDES module can also function as a digital-to-time converter (DTC) when fed with the appropriate input vector. Each I/O tile contains a dedicated OSERDES module, ensuring that the OSERDES modules used on different pins do not interfere with one another, making them ideal for a multichannel output.
In the proposed design, the OSERDES operates as a DTC, serializing the encoder output words to form the actual CA-HRPWM signal, as shown in Figure 5. Note that the encoder outputs are registered for one CLKDIV cycle to ensure that the timing constraints are met. Additionally, the internal delay of the OSERDES module is two CLKDIV cycles when operating in 8:1 DDR mode, resulting in a total delay of three CLKDIV cycles.
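The digital-to-time conversion can be sketched as a behavioral model in Python. It ignores the three-cycle pipeline delay and assumes the 800 ps bit time reported for this design in Section 3; the function names and the example word sequence are hypothetical.

```python
BIT_TIME_PS = 800   # one serialized bit, as reported for 625 MHz DDR operation

def serialize(words):
    """Flatten 8-bit words into a bit list, least significant bit first."""
    return [(w >> i) & 1 for w in words for i in range(8)]

def high_time_ps(bits):
    """Total time the serialized output spends high."""
    return sum(bits) * BIT_TIME_PS

# Hypothetical word sequence: low, rising edge inside a slot,
# fully high, falling edge inside a slot, low.
words = [0x00, 0xF8, 0xFF, 0x07, 0x00]
bits = serialize(words)
```

In this example the pulse spans 5 + 8 + 3 = 16 serialized bits, so `high_time_ps(bits)` evaluates to 12,800 ps, and both edges are placed with single-bit granularity.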
When the trigger port receives a failure signal indicating a fault condition like an overcurrent, the trigger signal is latched, and the OSERDES output is set to low after one CLK cycle. The output remains low for another CLK cycle after the arm signal arrives, which resets the triggered state to armed, as represented in Figure 6.
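The latch-and-release behavior can be approximated by a cycle-by-cycle Python model. This is a sketch under assumptions: the gate is driven by a registered state, so changes take effect one cycle later, and a fault is given priority over a simultaneous arm request. Neither detail is confirmed by the text, and the function name is hypothetical.

```python
def gate_output(pwm, fault, arm):
    """Gate a PWM bit stream with a latched fault state.
    A fault sets the 'triggered' state; an arm request clears it.
    The gate uses the state registered in the previous cycle."""
    triggered = False
    out = []
    for p, f, a in zip(pwm, fault, arm):
        out.append(0 if triggered else p)   # output uses the registered state
        if f:
            triggered = True                # latch the fault (assumed priority)
        elif a:
            triggered = False               # return to the armed state
    return out

pwm   = [1, 1, 1, 1, 1, 1, 1, 1]
fault = [0, 1, 0, 0, 0, 0, 0, 0]
arm   = [0, 0, 0, 0, 1, 0, 0, 0]
result = gate_output(pwm, fault, arm)
```

Here the output drops one cycle after the fault, stays low through the cycle in which the arm request arrives, and resumes on the following cycle.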
3. CA-HRPWM Simulation and Implementation
3.1. Software Simulation
To verify the functionality of the CA-HRPWM module, a simulation was performed using Xsim in Vivado 2023.2. To accelerate the simulation and demonstrate more timing details, the PWM period was set to 64. As shown in Figure 7, the dead time was correctly set to 4 ns, and the output pulse width accurately reflected the input comparison values.
3.2. Hardware Implementation
For a performance evaluation, the proposed CA-HRPWM was implemented on a Zynq-7000 SoC XC7Z020CLG400-2 (Xilinx, San Jose, CA, USA) using the Vivado 2023.2 EDA tool with the default settings. The Zynq-7000 SoC integrates two parts: programmable logic (PL) and a processing system (PS). The programmable logic is equivalent to that of an Artix-7 series FPGA, and the processing system consists of dual ARM Cortex-A9 cores. The PL and PS are connected by several Advanced eXtensible Interface (AXI) ports for data transfer. The proposed CA-HRPWM was implemented in hardware on the PL, while communication with the host computer and data loading to the CA-HRPWM module were managed by bare-metal software running on the PS. The source code and the corresponding flowchart of the proposed CA-HRPWM are provided in Appendix A.
As for the timing parameters, the clock (CLK) frequency was set to 625 MHz, and the divided clock (CLKDIV) frequency was set to 156.25 MHz to achieve an optimal resolution. The PWM period and dead time were configured to 16,000 and 5, respectively, in accordance with the module parameters used in the experiments.
To evaluate the resource overhead of the proposed CA-HRPWM design, we implemented a conventional counter-based CA-PWM module with the same clock frequency, PWM period, dead time, and channel number configuration as the proposed design. The comparison module operated at 156.25 MHz, the same clock frequency as that of the CLKDIV in the HRPWM module. The corresponding resolution was 6.4 ns, which was exactly eight times the 800 ps resolution of the HRPWM module. This allowed for a direct comparison of resource usage while maintaining identical functional behavior.
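The timing figures quoted above follow from the two clock frequencies, as the short calculation below shows. It assumes that the dead time setting of 5 is expressed in 800 ps steps, which agrees with the 4 ns dead time observed in the simulation; the variable names are chosen for this illustration.

```python
from fractions import Fraction

CLK_HZ = Fraction(625_000_000)       # serial clock, from the text
CLKDIV_HZ = Fraction(156_250_000)    # divided clock, from the text
PS_PER_S = 10**12

bit_time_ps = PS_PER_S / (2 * CLK_HZ)       # DDR: two bits per CLK cycle
conventional_ps = PS_PER_S / CLKDIV_HZ      # one count per CLKDIV cycle
bits_per_word = (2 * CLK_HZ) / CLKDIV_HZ    # serialization ratio
dead_time_ps = 5 * bit_time_ps              # assumed: dead time in 800 ps steps
```

The results are 800 ps per serialized bit, 6400 ps per CLKDIV cycle, a serialization ratio of 8, and a dead time of 4000 ps.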
We also implemented the IODELAY-based CA-HRPWM design proposed in [20] for comparison. As noted in the Introduction, although its applicability is limited by the glitch-free input limitations of the IODELAY module, this design remains well suited for multichannel applications. The clock (CLK) operated at 390.625 MHz, while the reference clock (CLKREF) for delay chain calibration was set to 195.3125 MHz, ensuring compliance with device specifications and achieving an 80 ps resolution. Note that the 10-times-higher resolution achieved by the IODELAY-based method compared to our module benefits power supply applications, where switching frequencies could reach tens of megahertz. However, this level of resolution is unnecessary for PMSM control, as switching frequencies typically remain below 1 MHz.
The slope of linear regression for the usage of lookup tables (LUTs), flip-flops (FFs), and carry chains (CARRY) per channel was used as a metric for resource utilization, as it quantifies the rate at which resource consumption increases with the number of channels. This approach provided a clear indication of how efficiently the proposed CA-HRPWM design scales as more channels are added.
The resource usage for configurations ranging from 2 to 12 channels is presented in Figure 8. The linear relationship between the number of cells and channels holds for both the proposed and comparison modules, confirming that the slope is a valid metric for the resource utilization per channel.
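The slope metric can be computed with an ordinary least-squares fit, sketched below in plain Python. The LUT counts are hypothetical values generated from an assumed fixed overhead of 30 cells and are not the measured data of Figure 8.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

channels = [2, 4, 6, 8, 10, 12]
# Hypothetical data: 47 LUTs per channel plus an assumed fixed overhead of 30
luts = [47 * n + 30 for n in channels]
luts_per_channel = slope(channels, luts)
```

Because the fit separates the fixed cost of the shared up–down counter (the intercept) from the per-channel cost (the slope), the slope remains meaningful even when small configurations are dominated by shared logic.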
The average additional resource usage per channel for configurations ranging from 2 to 12 channels is shown in Table 2. The numbers of LUTs, FFs, and CARRY for each additional channel were 47, 28, and 15, respectively. These values were only 1.8 to 3.2 times as high as the resource usage of conventional CA-PWM implementations. Compared to the IODELAY-based CA-HRPWM, the values were further reduced by 15% to 36%. The XC7Z020CLG400-2 device provides 125 I/O ports in programmable logic (PL). Based on the results, we estimate that even with a maximum configuration of 125 channels, the total utilization rate of the resources would be approximately 14%. This demonstrates the high efficiency of the proposed CA-HRPWM design, which makes it well suited for multichannel applications.
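One way such an estimate could be obtained is sketched below. The device totals are assumptions recalled from the 7-series device tables (53,200 LUTs, 106,400 flip-flops, and 13,300 slices with one carry chain each) and are not stated in this paper; the fixed cost of the shared counter is ignored.

```python
# Per-channel slopes as quoted in the text
PER_CHANNEL = {"LUT": 47, "FF": 28, "CARRY": 15}
# Assumed XC7Z020 totals (not taken from this paper)
DEVICE_TOTAL = {"LUT": 53_200, "FF": 106_400, "CARRY": 13_300}
CHANNELS = 125

utilization = {
    kind: PER_CHANNEL[kind] * CHANNELS / DEVICE_TOTAL[kind]
    for kind in PER_CHANNEL
}
worst = max(utilization, key=utilization.get)
```

Under these assumptions the carry chains are the limiting resource at roughly 14%, with LUTs near 11% and flip-flops near 3%, which is consistent with the figure quoted above.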
The power estimation for configurations ranging from 2 to 12 channels is presented in Figure 9, while the average additional power consumption per channel is detailed in Table 3. For the proposed design, the extra power consumption per channel was 5.4 times that of a conventional CA-PWM and 1.8 times that of the IODELAY-based CA-HRPWM. Despite this increase, the power consumption remains acceptable for PMSM control applications, as the motor itself operates within a power range of tens to thousands of watts.
5. Conclusions
In this paper, we proposed a center-aligned high-resolution PWM (CA-HRPWM) implementation based on the OSERDES module in FPGAs. The design leverages the OSERDES module as a digital-to-time converter (DTC), achieving high-resolution pulse generation with minimal resource utilization and ideal linearity, making it suitable for multichannel applications. The experimental results demonstrate that the proposed CA-HRPWM achieved a resolution of 800 ps with low jitter and high linearity. It also significantly improved the current control performance, reducing the current ripple by up to 64% compared to a conventional PWM generator. This approach offers a promising solution for high-performance multichannel CA-PWM generation in FPGA-based power electronic systems, with potential applications in motor control, DC-DC converters, and other digital power systems.
To further enhance the performance and broaden the potential applications, future research may focus on supporting diverse PWM waveforms for various power electronic systems and integrating the proposed method with IODELAY on specific FPGAs to achieve a higher time resolution.