Multipoint Detection Technique with the Best Clock Signal Closed-Loop Feedback to Prolong FPGA Performance

The degradation of field-programmable gate arrays (FPGAs) has become a significant issue because of the high density of logic circuits inside the device. The degradation arises from the rapid technology scaling of FPGAs while their performance is sustained. One contributor to the degradation is the delay caused by hot carrier injection (HCI) and negative bias temperature instability (NBTI). This research therefore proposed a multipoint detection technique that detects the delay caused by the HCI and NBTI degradation effects. The technique also signals the aging of the FPGA that results from this delay. In addition, it was integrated with an automatic clock correction scheme that optimizes FPGA performance by providing the best clock signal to a device whose performance has degraded. Delay degradation corresponding to phase shifts from 0° to 360° in the FPGA was fed as the input into the multipoint detection technique. With its closed-loop feedback, the proposed technique offered the best clock signal to prolong FPGA performance. The results show that the technique could estimate the remaining lifetime of the FPGA and propose the best possible clock signal to prolong its performance. The validation showed that the multipoint detection technique could prolong the performance of the degraded FPGA by 13.89%. This improvement indicates that compensating for the degradation effect with the best clock signal prolongs FPGA performance.


Introduction
Field-programmable gate arrays (FPGAs) are widely used to develop the latest complementary metal oxide semiconductor (CMOS) technology because of their high volume density, scalability, and ability to cope with the highest performance demands for digital and mixed-mode analog applications. However, with the rapid downscaling of FPGAs, the devices encounter challenges that include increased noise sensitivity, manufacturing variability, and reliability concerns [1-4]. These factors accelerate the degradation of chip performance, which eventually reduces the lifespan of the product. Numerous reports elaborate on delays that occur in FPGAs and are detrimental to chip performance [5-10]. The optimization of delays has been studied using key testing parameters, such as lookup tables (LUTs) and algorithm analysis [11,12]. L. Bauer et al. studied the dependency of fault delay on application type [13]. The prediction of interconnect delays was studied by V. Manohararajah et al. [14]. It was deduced that the timing model is beneficial for the physical synthesis flow of an FPGA design [15,16].
It has been reported that the delay is an effect of negative bias temperature instability (NBTI), which is a critical factor in FPGA degradation [17-19]. S. Chakravarthi et al. [20] developed a framework that caused a delay of less than 500 ms. Previous studies primarily concentrated on NOR gate chains, which were found to reduce delay degradation [21]. The effects of delay on switching activity were examined by D. Alnajjar et al. [22], who analyzed the reduction in circuit speed. Meanwhile, M. Cho et al. [23] suggested that the delay component is a temperature-activated process in irradiated devices. S. Cha et al. [24] observed that the delay is caused by an increase in the current and a subsequent redistribution of the voltage. The nominal delay degradation for the ISCAS benchmark indicates that the delay degradation in semiconductor devices can be as low as 617 ps [25].
Seven output ranges of a phase-detection aging sensor, together with a lifetime prediction table for different FPGA manufacturers, are proposed to guide system designers and industrial players and are validated through simulation and experiments. An automatic clock correction method is then proposed and validated experimentally. It supplies the best clock signal to the entire FPGA system in order to counter the aging degradation and maintain the system's performance.
This paper is organized as follows. Section 2 describes lifetime reliability sensing in FPGAs, and Section 3 describes the methodology for analyzing aging sensors with multiple-point detection. Section 4 compares multiple types of delay frequencies, which were validated through the experimental work presented in Section 5. Finally, the conclusion is presented in Section 6.

Lifetime Reliability Sensing in FPGAs
One of the first effective aging sensors was proposed by C. Leong et al. [26] and detects delays in circuits. However, the method is unable to characterize the delay using different amounts of phase delay. The aging sensor proposed by A. Amouri et al. [27] is placed in a critical path to avoid late transitions occurring in the circuit. The aging sensor [28,29] and FPGA chips were reprogrammed to reduce the late transition effects caused by NBTI [30] and hot carrier injection (HCI). Meanwhile, M. Valdes-Pena proposed an aging sensor without determining a specific delay range [31]. Because of these limitations, an enhanced aging detection system for particular ranges of phase delays is needed to detect aging in FPGAs effectively. This work proposes a lifetime prediction method that implements multiple-phase detection.
The FPGA lifetime prediction technique proposed in this work uses the aging sensor detection circuit, which implements the multiple-phase delay capability using a clock generator. The method employs a sensor detector with seven phase delay ranges, consequently estimating the delay occurring in the chip. This information is eventually used to obtain the remaining lifetime of the FPGA while operating with a critical level indicator for replacement.
Lifetime Reliability at the Transistor Level
M. Ebrahimi et al. discovered the importance of lifetime reliability at the transistor level [32]. The importance of lifetime reliability was supported by G. Sai et al., who found that aging sensor detection is caused by multiple path delays generated at the transistor level [33]. The Eldo simulator tool was used in this work to predict the lifetime reliability of a 23-stage ring oscillator [34-36]. The simulation flow using the Eldo simulator is shown in Figure 1. The 45 nm high-power model for p-channel metal oxide semiconductor (PMOS) and n-channel metal oxide semiconductor (NMOS) transistors from the predictive technology model (PTM) [37] was used in the 23-stage ring oscillator. The ring oscillator was first simulated as a new circuit without any aging property to obtain the oscillation frequency. This simulation served as the reference point for the zero-degradation case. The behavioral characteristics of the ring oscillator circuit were verified through the simulation output. The Eldo simulator then employed an advanced reliability model, which took the HCI and NBTI of CMOS devices into account, to simulate 10 years of aging effects in the circuit. These aging simulations were conducted for each year from the first to the tenth year of circuit aging to determine the trend of the device degradation. The models used in this work were based on reaction-diffusion theory [38] and were multimode energy driven [39].
Figure 2 shows the frequency degradations for the 45 nm and 32 nm technologies on a ring oscillator at 27 °C and 100 °C for up to 10 years of aging, which were obtained from the Eldo simulator. The frequency degradation increased with aging because of the HCI and NBTI mechanisms at 27 °C and 100 °C, respectively [40]. Yoohwan Kim et al. found that the NBTI effect is a dominant factor when the temperature is high and that the effect can be suppressed at lower temperatures [41,42]. Hui Zhang et al. reported that when a PMOS transistor is stressed with a gate voltage of 0 V to 2.5 V and a high temperature of 100 °C, new interface states at the Si-SiO₂ interface cause electric potential augmentation [40]. Furthermore, the threshold voltage drifts negatively because of the fixed charge and interface states caused by positive hole capture. The reaction-diffusion model was well explained by Wenping Wang [43], including the principle of the NBTI effect, in which Si-H bonds at the Si-SiO₂ interface break and form interface traps [44,45]. These findings support this work, in which the frequency variations were caused by the HCI and NBTI degradations.
D. Sengupta and S. Sapatnekar obtained a threshold voltage shift due to NBTI [46], as given in Equation (1), where C_x is a constant that depends on the process, voltage, and temperature (PVT), and t_st is the effective stress time after the elapsed time between t_0 and t. The effective stress time for PMOS devices after NBTI stress with a LOW input signal depends on the stress probability value of the device. Therefore, Equation (…). Equation (2) can be used to calculate an unknown frequency degradation that is not plotted in Figure 2. As deduced from Equation (2), the frequency degradation against new devices after 5 years and 6 months was equal to 0.723% due to the NBTI effect.
The HCI degradation mechanism is more influential in NMOS than in PMOS transistors [48]. HCI occurs when carriers inside the channel are subjected to the adjacent electric field and gain sufficient energy and momentum to break the barriers surrounding the dielectric [46]. The carriers with adequate energy for the HCI stress result in interface-state generation at the Si-SiO₂ interface [49]. Furthermore, D. Sengupta and S. Sapatnekar produced an equation for the threshold voltage degradation due to the HCI [46], as given in Equation (3), where C_H and E_0 are process-dependent parameters, E_ox is the vertical field, q is the electronic charge, φ_it is the trap generation energy, λ is the hot electron mean free path, E_m is the lateral electrical field, and g is the effective stressing time. The effective stressing time caused by HCI aging depends on the number of switching events of the transistor. It can be calculated using (AF·F_clk·t), where AF is the activity factor of the transistor, F_clk is the clock frequency, and t is the elapsed time. D. Sengupta and S. Sapatnekar then combined the degradation effects of the NBTI and HCI into a shift in delay ∆D(t) for logic gates [46], as stated in Equation (4), where K_B = Σ_{i∈NMOS} S_{n,i} ψB_{n,i} + Σ_{i∈PMOS} S_{p,i} ψB_{p,i} and K_H = Σ_{i∈NMOS} S_{n,i} ψH_{n,i}. By using the CurveExpert® (Chattanooga, TN, USA) [47] software, the frequency degradation due to the HCI at 27 °C could be calculated using Equation (5). As deduced from Equation (5), the frequency degradation caused by the HCI after 8 years and 3 months was equal to 0.542% compared to new devices.
In summary, the results indicate that the HCI and NBTI were the main causes of the degradation effect. This is supported by the study of C.Q. Liu et al., which reveals that the delay degradation due to the HCI and NBTI led to device aging [50]. However, most studies in the open literature have not explicitly examined the 45 nm technology node. Therefore, this simulation work using the Eldo simulator mainly focused on reliability concerns due to the HCI and NBTI degradation effects in a 23-stage ring oscillator for a 45 nm technology node.
Figure 3 presents the proposed flow of aging sensor detection. The aging sensor detector has two main inputs, which are the clock (Clk) and voltage (Vi); the clock skew effect is negligible [51,52]. This detection circuit produces seven ranges of phase point detection, from 0°/360° to 300° with a 50° resolution. Phase point detection is a technique that segregates the detection of the delay signal into several types of frequency detections. The detection range from 0° to 49° represents level 0, the range from 50° to 99° represents level 50, and so forth. The total delay time for level 0 was from 0 s to 1.39 ns, and that for level 50 was from 1.39 ns to 2.78 ns. The detection was successful only if the frequency of Vi was less than the frequency of the Clk. The Clk was synchronized to the clock of the FPGA circuit using a clock generator that provided a stable clock for the entire system. The input signal originated from an external clock, and the output produced was a synchronous clock called clk_cntrl, which triggered the Clock Delay and circuit under test (CUT) modules. The Clk may originate from various FPGA manufacturers with different intellectual properties (IPs). The Xilinx FPGA used in this work has a reset input and locked outputs that are used for initialization purposes. There is also an indicator showing that the clocking wizard (CW) signal is fully stable. The Xilinx FPGA board provides an IP that requires 69 clock cycles to stabilize the clock. For the proposed aging sensor circuit, the programmer requires the CW to conduct frequency synthesis and phase alignment in order to manipulate the frequency and duty cycles of the synchronized clock. Based on the CW specification, two clock inputs served as the primary and secondary clocks, respectively, ranging from 10 MHz to 700 MHz. The maximum number of outputs produced was seven clock outputs, with frequencies ranging from 5 to 700 MHz and phases between 0° and 360°. The CW initial setup provides the desired and actual frequencies for the input and output frequency configurations [53]. This initial setup is essential for the accuracy of the design for the multiple-point phase degrees chosen. For the 40 nm Virtex-6 FPGA board, the CWs can produce up to seven different frequencies [53]. However, the CW configurations for other FPGA manufacturers have varying limitations and settings. To cater to this, the CW can be arranged in series, in parallel, or in a combination of both for better phase-detection ranges.
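The relation between the phase levels and the delay times quoted above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 10 ns clock period is an assumption inferred from the reported 1.39 ns per 50° step, and the function names are hypothetical.

```python
# Sketch of the phase-level bookkeeping described in the text.
# ASSUMPTION: a 10 ns clock period (100 MHz), inferred from the reported
# 1.39 ns span of each 50-degree level; the source does not state it directly.

ASSUMED_PERIOD_NS = 10.0
STEP_DEG = 50          # resolution of the phase point detection
MAX_LEVEL_DEG = 300    # highest of the seven detection levels


def phase_to_delay_ns(phase_deg, period_ns=ASSUMED_PERIOD_NS):
    """Convert a phase shift in degrees to a time delay in nanoseconds."""
    return period_ns * phase_deg / 360.0


def phase_level(phase_deg):
    """Map a phase shift to its detection level (0, 50, ..., 300).

    0-49 degrees falls in level 0, 50-99 degrees in level 50, and so on;
    everything from 300 degrees up to just below 360 falls in level 300.
    """
    wrapped = phase_deg % 360
    return min(int(wrapped // STEP_DEG) * STEP_DEG, MAX_LEVEL_DEG)


if __name__ == "__main__":
    for phase in (0, 49, 50, 99, 108, 220, 359):
        print(phase, phase_level(phase), round(phase_to_delay_ns(phase), 2))
```

Under the assumed period, the boundaries of level 0 and level 50 come out at 1.39 ns and 2.78 ns, which matches the values given in the text.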

The Design Process for an Aging Sensor with Multiple Points of Frequency Detection
For the proposed aging sensor, the duty cycles were set to 50%. Therefore, only a single clock with the same frequency is used for the input and output of this clock generator circuit. The clock delay module uses a separate CW to generate seven different output phase delays sequentially. It uses clk_cntrl as the input and clk_x as the output, where x represents the phase shift frequencies. The phase shift frequencies were set to 0°/360°, 50°, 100°, 150°, 200°, 250°, and 300°. All phase shifts are applied after the clk_cntrl signal. The CUT and the clock delay modules subsequently feed their respective signals into the comparator module. The comparator module then obtains the correlation between the signal it receives from the CUT module and the phase shift frequencies fed from the clock delay module. Each correlated phase shift frequency corresponds to a particular range of the remaining lifetime.
The CUT module consists of a Vi input, a clocking input (clk_cntrl), and an output (Q_d). A D flip-flop is used to load a single bit of data into the FPGA in this work. This module works as either positive gate triggered (PGT) or negative gate triggered (NGT), depending on the design requirement. The output of this CUT module is an input to the comparator module, which is the subsequent module shown in Figure 4. Within the comparator module, Q_d is fed into two distinct sets of logic circuit elements: a group of memory elements from the CUT that includes the delay, called Clk_Delay_x, and another CUT set with the original signal input Vi. Signals from these two sets are then compared to detect whether any phase difference has occurred. A phase difference implies the occurrence of a delay. For example, if a phase difference between Q_d and Clk_Delay_x is detected, Delay_x produces the output HIGH. The Delay_x output and Clk_Delay_x are fed into the last module, called the phase shift detector module, to identify the range in which the phase delay occurred.
Figure 6 shows an example of the timing diagram obtained from the phase shift detector module for the delay detection of 50° and 250°. Initially, the synchronized clock clk_cntrl was fed to the CUT and clock delay modules. Next, the output produced from the CUT module, Q_d, was compared with different phase shifts inside the comparator module. Then, a delay of 108° phase shift injected on the input signal Vi is illustrated for the detection process of the phase shift detector module. Finally, the Delay_x obtained from the comparator module was checked and tested using seven types of Clk_Delay for this injected delay signal. As a result, the sensor output was HIGH for sensor 50° and LOW for sensor 250°.

Aging Detection Process
The process for detecting the delay is illustrated in Figure 7. The clock signal frequency is initially set for the system to be operated on a PGT or NGT, depending on the system requirement. The Vi is subsequently injected with a random delay signal to test the aging detection circuit. In the Verilog coding, the injected delay signal is expressed in time units using the '#' symbol. For example, '#3' equals three time units, which is equal to a 10.8° phase difference from the original signal. This injected delay signal represents the degradation caused by the variation of the threshold voltage, carrier mobility, NBTI, and HCI, as discussed in the previous section. When the aging sensor detects adjacent injected delay signals, these signals fall in the same phase delay range and produce an identical output category. As shown in Figure 7, the Vi signal was preset with a delay of 220° for θ₁ and a Clk_Delay of 200° for θ₂. Both θ values are then compared to check whether the Vi delay exceeds the Clk_Delay phase. To obtain an aging value, θ₁ should be greater than θ₂. For example, the 220° delay (the Vi signal mentioned above) exceeded the 200° delay, which caused the delay detection to go HIGH. The output triggered a hex output signal as a reference value. The hex output signal is a data representation based on the seven detection points, as previously described in Figure 5. The results were obtained in binary format and converted to a hexadecimal format, called hex, for a more straightforward implementation in the following process. However, if the value of θ₁ is less than or equal to θ₂, the aging sensor produces a value of 0, indicating that no aging is detected. Table 1 illustrates a sample of the detection process for the aging of the circuit using a sample Vi with delays of 220° and 270°. For θ₁ = 220°, the Clk_Delay values for 50°, 100°, 150°, and 200° were HIGH, while the other frequencies were LOW, hence producing an output of 3C₁₆. Conversely, for θ₁ = 270°, the value of hex was 3E₁₆ because the Clk_Delay was HIGH for θ₂ equal to 50°, 100°, 150°, 200°, and 250°.
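The comparison of θ₁ against each Clk_Delay phase and the packing of the results into the hex output can be sketched as follows. This is an illustrative model only: it assumes that the six aging sensors (50° to 300°) are packed most significant bit first, with the 50° sensor as the top bit, and that the 0°/360° reference output is not part of the hex value. These assumptions are chosen because they reproduce the reported outputs of 3C₁₆ for 220° and 3E₁₆ for 270°; the function names are hypothetical.

```python
# Sketch of the multipoint comparison and hex encoding.
# ASSUMPTION: six aging sensors packed MSB-first (50 deg = bit 5 ... 300 deg = bit 0).

SENSOR_PHASES_DEG = (50, 100, 150, 200, 250, 300)


def sensor_flags(delay_deg):
    """Return one flag per sensor: True (HIGH) when theta_1 > theta_2."""
    return [delay_deg > phase for phase in SENSOR_PHASES_DEG]


def hex_output(delay_deg):
    """Pack the sensor flags into an integer, 50-degree sensor first."""
    value = 0
    for flag in sensor_flags(delay_deg):
        value = (value << 1) | int(flag)
    return value


if __name__ == "__main__":
    for delay in (30, 75, 108, 220, 270, 330):
        print(f"{delay:3d} deg -> {hex_output(delay):02X}")
```

With this encoding, a delay that is less than or equal to 50° yields 00₁₆, which corresponds to the case where no aging is reported.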

Proposed Automatic Clock Correction
The FPGA is continuously affected by delay degradation [54,55]. Therefore, it is crucial to correct the circuit automatically. Although the correction does not fully recover the chip, it is essential for maintaining the efficiency and stability needed to provide a correct output.
Automatic clock synchronization is a process in a centralized system that depends on the synchronized clock input. Clock synchronization is widely used in wireless sensor networks for correction purposes. Recent studies proposed several methodologies for performing clock correction, along with error correction, in circuit design. M.G. Batarseh et al. digitized an analog signal and compared it to a reference value to maintain a near-zero error signal while focusing on the duty cycle obtained from the digital clock manager [56]. T. Anwar et al. supported this finding by developing a built-in error correction for the same clock cycles in FPGA cells [57]. Two clock correction algorithms, a Kalman filter and the weighted-average-based NIST AT1, were investigated by A. Zenzinger et al. to detect failures and perform corrections from a developed clock monitoring and control unit [58]. The present work focuses on automatic clock correction for an aging circuit, as shown in Figure 8. The CUT module illustrated in Figure 3, either aged or injected with a substantial amount of delay, was used to demonstrate the correction process. If aging is detected, the user can choose whether to enable the automatic clock correction input. The input signal then passes through the automatic clock correction circuit, which acts as a feedback circuit. The output waveform needs to be validated on the timing diagram to show that the correction was made correctly. The output must indicate that no aging delay is detected after a correction has been made. To validate this automated clock correction design, the proposed circuit was tested and evaluated experimentally for various types of combinational and sequential logic circuits, such as encoders, decoders, adders, and multiplexers. A top module block diagram for the automated clock correction circuit is shown in Figure 9.
The module consists of four inputs called Vi, enable, Clk, and rst and produces an output called out_as. First, the circuit is synchronized using a global clock that is fed with a standard clock signal Clk. The output of this global clock is fed into the multiple clock generators, automatic clock selector, circuit under test, and enhanced phase shift detector modules. The comparator and phase shift detector modules from Figure 3 are integrated into this enhanced phase shift detector module. Subsequently, the multiple clock generators module generates various stable clock signals, which were 0°/360°, 50°, 100°, 150°, 200°, 250°, and 300°. This module also has a reset signal rst for safety precautions and an input signal for the initialization process. An automatic clock selector module selects the appropriate clock signal input according to the circuit specifications. By default, a phase of 0°/360° is set to trigger the circuit. This module consists of three inputs and produces a single output. The input signals originate from the previous module, the enable input, and a feedback signal from the top module output. The enable signal is used to trigger the automatic clock correction. Meanwhile, the feedback signal provides the LUT values based on the detected signals from the corresponding seven detection points.
The CUT circuit can be substituted with various combinational logic circuits, such as encoders, decoders, and multiplexers. Thus, it is necessary to ensure that the number of bits used for inputs and outputs is the same for other modules. In addition, this automatic clock correction technique is applicable for any type of circuit, whether simple or complex. The Vi input signal is an input injected with delay signals. The rst input is used for the initialization process. The enhanced phase shift detector module consists of four inputs and a single output. The inputs Vi, clk, rst, and a reference signal from the CUT produce out_as with seven detection points. The input signal of Vi is compared with the reference signal to obtain the phase difference. This comparison is evaluated in terms of the point and phase that had caused the delay. Every bit of this enhanced phase shift detection is stored inside the memory element called hex, which is used as the feedback signal for correction purposes.
The lifetime of the FPGA can be extended by implementing this automated clock correction design, which identifies the aging degradation and immediately follows it with clock correction. For example, if the remaining lifetime of the device is 12 months, it is possible to extend the lifetime by more than a year. In particular, the aging sensor detection process mitigated the aging condition of the FPGA devices during the correction mode. The lifetime and automatic clock correction can be monitored based on the signal input called on. The input signal can be enabled periodically to reduce the delay degradation of the FPGA. For monitoring purposes, the switch can be turned off momentarily to view the remaining degradation. The delay cannot be eliminated; however, the usefulness of the device can be prolonged.
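The behavior of the automatic clock selector can be sketched as a small lookup from the hex feedback to one of the seven generated clock phases. This is a hedged illustration rather than the circuit itself. It assumes that the feedback is a six-bit thermometer code (one bit per aging sensor from 50° to 300°) and that the selector picks the highest sensor phase that went HIGH, which is consistent with the 108° to 100° and 220° to 200° examples reported with the correction results. The function name and the enable argument are hypothetical.

```python
# Sketch of an LUT-style clock selector driven by the hex feedback.
# ASSUMPTIONS: the feedback is a 6-bit thermometer code, and the selected
# clock is the highest sensor phase that is asserted.

CLOCK_PHASES_DEG = (0, 50, 100, 150, 200, 250, 300)


def select_clock_phase(hex_feedback, enable=True):
    """Return the clock phase (degrees) chosen for the given feedback.

    With correction disabled, the default 0/360-degree clock is kept.
    """
    if not enable:
        return CLOCK_PHASES_DEG[0]
    asserted = bin(hex_feedback & 0x3F).count("1")  # number of HIGH sensors
    return CLOCK_PHASES_DEG[asserted]


if __name__ == "__main__":
    for feedback in (0x00, 0x20, 0x30, 0x3C, 0x3F):
        print(f"{feedback:02X} -> {select_clock_phase(feedback)} deg")
```

A table indexed by the number of asserted sensors keeps the selector independent of the exact delay value, so delays that fall in the same detection range always receive the same clock.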

Comparison of Multiple Types of Delay Frequency
The aging sensor circuit, whose schematic is presented in Figure 3, produces a single output without aging and six outputs that indicate the delay occurring at different clock frequencies and give the aging status of the FPGA. The detection phases start at 0° and end at 300°, with intervals of 50° in between. Each of the six phase delay detections goes HIGH when the CUT produces a delay that exceeds its phase. For example, if the CUT generates a delay of 108°, the indicators for 50° and 100° equal '1', while the indicators for 150°, 200°, 250°, and 300° show a value of '0', as shown in Figure 10. The design was tested using an exhaustive simulation of all possible delays from 1.0° to 359°, as shown in Table 2. The results were created in binary and translated into hexadecimal values for easier analysis. For delays between 1.0° and 50.0°, the output remained at 00₁₆. The output only started to change when the delay was between 51.0° and 359.0°. The tabulated results show that when the delay was between 51.0° and 100.0°, the output value was 20₁₆, while if the delay was between 201.0° and 250.0°, the output was 3C₁₆, and so forth.
The simulation test covered all possibilities of delay caused by the HCI and NBTI degradation effects that were monitored through the critical path delay, interconnect delay, and propagation delay [59-62]. M. Sheng and J. Rose examined all these types of delays and found that the propagation delay on the FPGA is 41.3%, while the interconnect delay is 58.7% [63]. These findings are supported by V. Manohararajah et al., who found that the interconnect delay was between 60 and 70% of the total delay [14], which leads to the lifetime degradation of the FPGA.
Based on the seven detection points of the delay degradation, the FPGA manufacturer provides a maximum lifespan for reference purposes only [64], because the percentage utilization of the configurable logic blocks (CLBs) depends on the digital system developed. According to Xilinx, the manufacturer of the Virtex-6, the lifetime for the 40 nm FPGA technology is 80 months [65]. Other manufacturers, such as Altera and Lattice Semi, reported that the lifetimes of their products Stratix IV and ICE40L are around 36 months [66] and 24 months [67], respectively. Based on this information, the lifetime prediction was tabulated, as shown in Table 3. For example, if the delay detection was between 101° and 150°, the Xilinx FPGA has 58.33% of its lifecycle remaining. In other words, the FPGA can be used for another 47 months. According to the desired lifetime specification, this calculation method helps the programmer replace the FPGA if necessary. The complete lifetime prediction calculated for each clock phase delay is shown in Table 3. The following formula was used to calculate the delay:

%d = (f_d / 360°) × 100%

where %d is the percentage of delay and f_d is the range of the phase delay. For example, if f_d is between 101° and 150°, %d is equal to 41.67% by using the maximum value within this range, which is 150°. Subsequently, the following formula was used for calculating the remaining lifetime:

T_r = L_M × (1 − %d / 100%)

where T_r is the time remaining for the FPGA and L_M is the maximum lifetime provided by the FPGA manufacturer. For instance, for the 40 nm Virtex-6 FPGA board with a %d of 41.67% and an L_M of 80 months, the calculated value of T_r is equal to 47 months.
Figure 11 shows the output for the automatic clock correction aging sensor. For this proposed correction circuit, the enable input was asserted at 8150 ns.
It is apparent from this figure that before the automatic clock correction was triggered (t < 8150 ns), the output waveform was in degradation mode. After the enable input was switched ON (t > 8150 ns), the out_as signals stabilized with no aging detected. These signals are highlighted in red. The automatic clock correction circuit was activated when the enable input was HIGH. During the activation period, the system compensates by supplying an appropriate clock signal input based on the delay that has occurred. For example, based on the asterisk symbol (*) in Table 2, if the current delay phase was 108°, the clock signal for the system automatically switches to 100°. Likewise, suppose the delay phase changes to 220° after several months or years. In this case, the clock signal automatically feeds the system with 200°, which is as close as possible to the delay phase, without intervention by the user, programmer, or designer. This process continues as long as the enable input remains HIGH and covers all possibilities across the 360° delay degradation. Figure 12 illustrates the aging sensor detection and automatic clock correction processes with valid and invalid correction methods. Initially, the signal input Vi was injected with a 108° phase shift delay with the original clock input at a 0° phase shift. Since the 50° and 100° phase shifts of the aging sensors were less than the injected 108° phase shift, both of these sensors produced HIGH output signals. On the other hand, sensors 150°, 200°, 250°, and 300° had LOW output signals. The automatic clock correction was performed by triggering the enable input, and a new clock signal with a 100° phase shift was applied to the entire system. With the new clock signal in use, all aging sensors produced LOW output signals. If the new clock signal is higher than the signal input delay (Vi), all sensors will produce HIGH output signals. This invalid correction method is illustrated in Figure 12 with a clock signal with a 150° phase shift. According to the feedback signal obtained, the system performance was stable and did not degrade further. Therefore, the proposed automatic correction method can be beneficial for critical applications, such as medical, security, and satellite applications, that require the system to be constantly stable. The design was tested for different sequential logic circuits, specifically T flip-flops, JK flip-flops, and ring counters, to verify that the design was applicable to various circuits. The testing method varied the number of bits used to cater for data, byte, and word signals. The proposed correction method could be considered as a guideline or a design consideration for system designers and industrial players to prolong the lifetimes of FPGA applications.
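The lifetime prediction used for Table 3 in this section can be reproduced with a short calculation. This sketch assumes that the percentage of delay is the upper bound of the detected phase range divided by 360°, and that the remaining lifetime is the manufacturer's maximum lifetime scaled by the undegraded fraction. Both relations are inferred from the worked example in the text (150° gives 41.67%, and 80 months gives about 47 months remaining); the function names are hypothetical.

```python
# Sketch of the lifetime prediction behind Table 3.
# ASSUMPTION: %d = f_d / 360 * 100 and T_r = L_M * (1 - %d / 100),
# inferred from the worked Virtex-6 example in the text.


def delay_percentage(phase_upper_deg):
    """Percentage of delay for the upper bound of a detected phase range."""
    return phase_upper_deg / 360.0 * 100.0


def remaining_lifetime_months(phase_upper_deg, max_lifetime_months):
    """Remaining lifetime given the manufacturer's maximum lifetime."""
    remaining_fraction = 1.0 - delay_percentage(phase_upper_deg) / 100.0
    return max_lifetime_months * remaining_fraction


if __name__ == "__main__":
    # Maximum lifetimes quoted in the text, in months.
    devices = {"Virtex-6": 80, "Stratix IV": 36, "ICE40L": 24}
    for name, lifetime in devices.items():
        months = remaining_lifetime_months(150, lifetime)
        print(f"{name}: {delay_percentage(150):.2f}% delay, "
              f"{months:.1f} months remaining")
```

For the Virtex-6 case the sketch gives 46.7 months, which rounds to the 47 months stated in the text.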

Experimental Validation
This section describes the validation process of simulation testing through experiments. After the simulation was implemented onto the FPGA board, the bit file was generated and downloaded onto the 40 nm Virtex-6 FPGA chip. The input and output configuration is shown in Table 4. First, the rst input was enabled for 1 s, followed by the Vi input signal. Subsequently, the automatic clock correction input called enable triggered the clock correction procedure for the entire FPGA system. This process was repeated for 100 random samples without a specific injected delay signal to validate the proposed design. The output observation on the 40 nm Virtex-6 ML605 FPGA board is shown in Figure 13. Figure 13a shows the initial setup for the experiment. Initially, all light-emitting diodes (LEDs) were in the OFF condition. Figure 13b shows that the rst input pin was enabled. This function was used to reset the entire FPGA system for preventive purposes. Figure 13c shows the input signal Vi as being enabled, which generated the random percentage delay aging to the multiple-point aging sensor module. For example, when the percentage delay was between 301 • and 359 • , the LEDs showed the values of 3F 16 . The LEDs were turned ON according to the aging detection, as stated earlier in Table 2. Finally, Figure 13d represents the automatic clock correction mode, whereby all LEDs were turned OFF and provided the best clock signal to the entire FPGA system. Vi enable rst out_as [6] out_as [5] out_as [4] out_as [3] out_as [2] out_as [1] out_as[0] The percentages of hex output hits for the random signal generated for Vi are shown in Figure 14. The aging simulation was done for all possibilities of aging delays ranging from 1 • to 359 • , as illustrated in Figure 11. Furthermore, the exact design of the proposed aging sensor was embedded into FPGA, as described in Figure 13. The percentages of output hits were obtained through an experimental study. 
Initially, the Vi signal was randomly generated with an aging delay between 1° and 359°. This process was then repeated and recorded until all potential aging signals had been generated. To verify the consistency of the aging sensor, the measurement of the distribution of the seven hex outputs was repeated three times. From the chart, the coefficient of variation and standard deviation were calculated to be 0.02% and 1.33%, respectively. The low standard deviation implies that the data were close to the mean value, which verifies the reliability of the proposed aging sensor. Table 5 provides the mean percentage of hits for each hex output, representing seven points of phase detection in line with the aging degradation, based on the results in Figure 14. For the example stated earlier, the value 3F₁₆ accounted for approximately 13.90% of hits during the experimental testing. This shows that the design could be implemented for different aging processes. The results covered all types of detection ranging from 0° to 360° according to the hex output. As discussed earlier with Figure 2, the lifetime degradation was proportional to the delay degradation for the ring oscillator. The fundamental ring oscillator is constructed using CMOS, as is the FPGA architecture. Therefore, Table 3 supports the idea that the percentage delay will increase as the device ages, according to the expected lifetime provided by the manufacturer. In addition, the experimental results show that the degradation in the FPGA was in line with CMOS degradation.
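The consistency check above rests on two dispersion statistics. The sketch below shows how the standard deviation and the coefficient of variation could be computed for repeated hit-percentage measurements. The helper name `dispersion` is hypothetical, the use of the sample (n - 1) standard deviation is an assumption because the text does not say which estimator was used, and the input values are placeholders rather than the measurements behind Figure 14.

```python
from statistics import mean, stdev


def dispersion(samples):
    """Return (mean, sample standard deviation, coefficient of variation in %)."""
    mu = mean(samples)
    # ASSUMPTION: sample (n - 1) standard deviation; the paper does not state
    # whether the sample or population estimator was used.
    sigma = stdev(samples)
    # Coefficient of variation: spread relative to the mean, as a percentage.
    cv_percent = sigma / mu * 100.0
    return mu, sigma, cv_percent


if __name__ == "__main__":
    # PLACEHOLDER values for one hex output over three repeated runs;
    # these are illustrative only and are not data from the paper.
    hits_percent = [13.8, 13.9, 14.0]
    mu, sigma, cv = dispersion(hits_percent)
    print(f"mean = {mu:.2f}%, std = {sigma:.2f}, CV = {cv:.2f}%")
```

A low coefficient of variation indicates that the repeated runs cluster tightly around their mean, which is the property the text uses to argue that the sensor output is repeatable.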
In summary, the evidence shows that the related degradation mechanism caused the FPGA to degrade while operating. The comparison of the aging sensor design with an automatic clock correction scheme is tabulated in Table 6. The proposed aging sensor with seven detection points could reduce the logic delay degradation on the 40 nm Virtex-6 FPGA. Thus, the experiment validated that the multiple-point aging sensor with an automatic clock correction scheme is less complex than other methods that use critical memory elements, several types of flip-flops, and programmable delay circuits.

Conclusions
Based on the simulation framework for the 40 nm technology node, built on a reliability simulator called the Eldo simulator, the effect of delay degradation on the FPGA was studied, along with aging sensor detection. The simulation results obtained from the Eldo simulator showed that, over ten years, aging followed a degradation trend of between 0.04% and 0.09% per year due to the NBTI degradation effect. Therefore, an aging sensor was implemented that detected seven points of aging degradation based on the phase shift delay. The results obtained from this aging sensor show that the degradation rate was 13.89% for each phase shift, which is in line with the NBTI degradation effect on the FPGA. For the optimization of the FPGA performance, the automatic clock correction scheme was vital to ensure that the output produced by the FPGA was consistently at the optimal level with a minimal delay effect. The experiment conducted on the 40 nm Virtex-6 FPGA also found that the coefficient of variation and standard deviation for the automatic clock correction scheme were 0.02% and 1.33%, respectively.
Therefore, the effect of the delay on the FPGA was mainly caused by the NBTI degradation effect, which can be translated into a lifetime prediction table that can help system designers and industrial users to replace the FPGA chip at the appropriate time. The data obtained were supported by several FPGA companies, such as Xilinx, Altera, and Lattice Semi. The results showed that each FPGA had a different remaining lifetime depending on the percentage utilization of the CLBs. The delay retrieved from the FPGA corresponded to a performance shrinkage of about 13.89% for each 50° phase shift difference. This performance shrinkage showed that NBTI affects the 40 nm Virtex-6 FPGA even as the technology size continues to shrink.
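One way to read the 13.89% figure, assuming it expresses a single 50° step as a fraction of a full 360° clock period (an interpretation that the text does not state explicitly), is the following worked calculation:

```latex
\[
\frac{50^{\circ}}{360^{\circ}} \times 100\% \approx 13.89\%
\]
```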