3.1. Top-Level Diagram of Temperature Correction Module
A diagram of the temperature correction module is shown in Figure 6.
Since temperature variations tend to occur very slowly, it is not necessary to continuously monitor the temperature. The timer is used for periodic measurement of the delay chain temperatures using the thermometric clock sorting module. After each measurement, the absolute value of the difference between the edge values of the thermometric cells is fed to the TC start module for analysis of the current temperatures. The analysis process is as follows. If there is no temperature change trend identified, the TDC will remain in normal operation mode. If a temperature change trend is identified, the module will check if the change is larger than the given threshold. If so, the TDC switches to temperature correction mode, i.e., temperature correction will be launched, and the time measurement process will be suspended. Otherwise, the TDC will remain in normal operation mode.
Once the temperature correction starts, the edge values in the LUT are individually corrected by the LUT correction module. This disturbs the original order of the edge values, so a LUT reorder module is introduced to reorder the edge values in the LUT after each correction. Once the reorder process is completed, the entire temperature correction process is finished, and the TDC switches back to normal operation mode.
Temperature correction is performed online and in real time.
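The decision flow described in this section can be summarized as a small function. The following is a minimal sketch for illustration only, not the FPGA implementation; the names `Mode`, `next_mode`, `trend_found`, `change_ps`, and `threshold_ps` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal operation"
    CORRECTION = "temperature correction"


def next_mode(trend_found: bool, change_ps: float, threshold_ps: float) -> Mode:
    """Decide the TDC mode after one periodic temperature measurement.

    All parameter names are illustrative assumptions.
    """
    # No temperature trend identified: time measurement continues as usual.
    if not trend_found:
        return Mode.NORMAL
    # A trend exists: correct only if the change exceeds the threshold.
    if change_ps > threshold_ps:
        return Mode.CORRECTION
    return Mode.NORMAL
```

In this sketch, time measurement would be suspended whenever `Mode.CORRECTION` is returned and resumed once the LUT has been corrected and reordered.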
3.2. Clock Sorting Operation for Temperature Measurement
This paper proposes a simple method to monitor the temperature of delay chains. Two thermometric registers are deployed at both ends of the delay chains, and a thermometric clock sorting operation is carried out periodically to obtain the edge values of the thermometric cells. The variation in edge values is calculated and fed into the TC start module to determine whether temperature correction should be performed.
The thermometric clock sorting operation is based on the code density test [13]. Two signals are used during this process: the reference signal C0 and the signal to be measured CX, which is the thermometric cell clock. The frequency of the resort signal is 6.7821 MHz, which is unrelated to the 400 MHz system clock. Thus, the probability for each cell to be sampled is the same. A large number of random samples are taken so that the resort signals are evenly distributed across the whole clock cycle. The proportion of measurements is calculated between C0 and CX for each of the four types “10”, “11”, “01”, and “00”. The results are multiplied by the clock cycle to obtain the edge values of CX.
There are always two possible relationships between C0 and CX. If the ratio between the four cases is 0.2, 0.2, 0.3, and 0.3, the rising and falling edge positions will be restricted to the two cases shown in Figure 7.
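How the four proportions translate into two candidate edge positions can be sketched as follows. This is an illustrative reading of the code density method, not the authors' implementation: it assumes that each sampled state is written as "C0 CX", that the rising edge of C0 is taken as 0 ps, and that the function name `edge_candidates` and its arguments are hypothetical.

```python
def edge_candidates(counts, period_ps):
    """Return the two candidate (rising, falling) edge pairs of CX in ps.

    counts maps each sampled state "C0 CX" ("10", "11", "01", "00") to the
    number of times it was observed. The bit order and the convention that
    the rising edge of C0 sits at 0 ps are assumptions of this sketch.
    """
    total = sum(counts.values())

    def span(*states):
        # Part of the clock cycle covered by the given states, in ps.
        return sum(counts[s] for s in states) * period_ps / total

    # Case A: CX lags C0, so one cycle reads 10 -> 11 -> 01 -> 00.
    lagging = (span("10"), span("10", "11", "01"))
    # Case B: CX leads C0, so one cycle reads 11 -> 10 -> 00 -> 01.
    leading = (span("11", "10", "00"), span("11"))
    return lagging, leading
```

With the ratio 0.2, 0.2, 0.3, 0.3 from the text and a 2500 ps clock cycle, this sketch gives (500, 1750) ps for the lagging case and (1750, 500) ps for the leading case. The counts alone cannot tell the two apart, which is why a second step is needed to distinguish them.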
For the initialization clock sorting operation, CAL_1 is used to obtain the two possible relationships between the reference clock C0 and any other clock CX. CAL_2 distinguishes between these two cases and is not required in temperature correction mode. By transmitting the results of the clock sorting operation to the host computer for analysis, the phase relationship between each thermometric cell’s signal CX and the reference signal C0 can be determined unambiguously.
It should be noted that the thermometric cells at both ends of the delay chains referred to in this paper are not necessarily the first and last cells in the absolute sense, but rather cells near the start and end positions of the delay chain. To save computing resources and time in the correction process, a cell whose number is a power of 2 is preferred as the thermometric cell. For example, the 64th cell can be selected if there are 95 cells in each delay chain.
Figure 8 shows the clock signal of the 64th cell at TDC temperatures of 10 °C and 70 °C. As the temperature increases from 10 °C to 70 °C, the rising edge value changes from 363 ps to 574 ps, i.e., an edge increment of 211 ps, which is 8.44% of the 2500 ps clock cycle.
This variation in edge values due to changes in temperature is small relative to the overall clock cycle. Therefore, in contrast with the initialization clock sorting operation, it is not necessary to count the proportions of all four cases; counting a single case is sufficient. This greatly reduces programming complexity and saves FPGA resources. For example, to calculate the variation in the rising edge of the 64th cell, only the “10” cases need to be counted. The count of that case is divided by the total number of samples and multiplied by the system clock cycle to obtain the edge value at the current temperature. After the initialization clock sorting operation is completed, the thermometric clock sorting operation starts after one timer period. For any specific temperature, the variation in edge value can be represented by Equation (1) as:
$$\Delta E_n = E_n^{T} - E_n^{0} \tag{1}$$
where $n$ is the thermometric cell number, $E_n^{T}$ is the rising/falling edge value when the temperature varies, and $E_n^{0}$ is the original rising/falling edge value.
A positive value of $\Delta E_n$ indicates that the current temperature of the delay chain is higher than the original temperature, and the temperature rise mark bit is asserted. A negative value of $\Delta E_n$ indicates that the current temperature of the delay chain is lower than the original temperature, and the temperature drop mark bit is asserted. Both $\Delta E_n$ and the mark bit are sent to the TC start module to determine the temperature trend.
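The single-case edge estimate and the variation of Equation (1) can be illustrated with a short sketch. The function names and the sample counts used in the usage note are hypothetical; only the 2500 ps clock cycle and the 363 ps and 574 ps edge values come from the text.

```python
def rising_edge_ps(count_10, total_samples, period_ps=2500.0):
    """Estimate the rising edge from the share of "10" samples alone."""
    return count_10 * period_ps / total_samples


def edge_variation(current_ps, original_ps):
    """Return (variation, rise_mark, fall_mark) for one thermometric cell.

    The variation is the current edge value minus the original one, so a
    positive value asserts the rise mark and a negative value the fall mark.
    """
    delta = current_ps - original_ps
    return delta, delta > 0, delta < 0
```

For instance, assuming 1452 "10" samples out of 10,000, `rising_edge_ps(1452, 10000)` gives 363 ps, and `edge_variation(574, 363)` returns a variation of 211 ps with the rise mark asserted, matching the 10 °C to 70 °C example of the 64th cell.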
3.3. Temperature Correction Start Module
The TC start module operates as follows. It first assesses whether the current temperature of the delay chain has increased, decreased, or is substantially unchanged compared to the previous temperature. If there is a clear and strong temperature trend, temperature correction is launched. Otherwise, the TDC remains in normal operation mode.
For accurate temperature trend measurement, the temperature measurement frequency, which is controlled by a timer, should be set to a suitable value. A measurement frequency that is too high increases the randomness of the temperature measurement, which may increase the probability of incorrectly identifying a temperature trend. Therefore, strict trend judgment criteria are required. In this paper, two counters, named rise_cnt and fall_cnt, are used to determine the temperature trend. Both counters start at zero, and each is incremented whenever its corresponding temperature mark bit is asserted. The difference and the sum of rise_cnt and fall_cnt are both calculated. If the difference between the two counters exceeds half of a pre-defined value before their sum reaches that value, a temperature trend is identified, and the larger counter indicates whether the temperature is rising or falling.
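The counter-based trend criterion can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the FPGA logic: the class name `TrendDetector` and the parameter `window` (the pre-defined value) are hypothetical, and restarting both counters after a decision or when the sum reaches the pre-defined value is an assumption, since the text does not state how the counters are reset.

```python
class TrendDetector:
    """Track rise_cnt and fall_cnt and report a temperature trend."""

    def __init__(self, window):
        self.window = window  # pre-defined value for the sum of both counters
        self.rise_cnt = 0
        self.fall_cnt = 0

    def update(self, rise_mark, fall_mark):
        """Feed one measurement; return "rising", "falling", or None."""
        if rise_mark:
            self.rise_cnt += 1
        elif fall_mark:
            self.fall_cnt += 1

        diff = abs(self.rise_cnt - self.fall_cnt)
        total = self.rise_cnt + self.fall_cnt

        trend = None
        if diff > self.window / 2:
            # The larger counter gives the direction of the trend.
            trend = "rising" if self.rise_cnt > self.fall_cnt else "falling"

        if trend is not None or total >= self.window:
            # Assumed behavior: start a fresh observation window.
            self.rise_cnt = 0
            self.fall_cnt = 0
        return trend
```

With `window = 8`, five consecutive rise marks are needed before a rising trend is reported, whereas alternating rise and fall marks never produce a trend, which is the intended filtering of random fluctuations.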
Once it is evident that there is a temperature trend, a threshold judgment is performed. If Equation (2) is true, the temperature correction process is launched, and the LUT correction module and the LUT reorder module are sequentially triggered to correct and sort edge values and store the new values in the LUT.
where $\Delta E_{n_1}$ and $\Delta E_{n_2}$ are the variations of the edge values of the thermometric cells matched with the thermometric registers, and $E_{th}$ is a pre-defined temperature correction threshold.
Note that once temperature correction is launched, the LUT used for the decoding process must be updated, so time measurement cannot be performed during the temperature correction process and must wait until it has been completed. The introduction of a threshold prevents the TDC from performing unnecessary temperature corrections for temperature fluctuations within a small range. The threshold is a parameter that should be determined by executing the clock sorting operation repeatedly while the ambient temperature varies and can be set according to the FPGA temperature adaptability and the working environment.
3.4. LUT Correction Module
Since the delay chain has a cumulative effect on the delay of the input clock signal, the effect of temperature on the delay chain is also cumulative, i.e., cells at the back of the delay chain experience a larger edge value variation than cells at the front of the delay chain. The main task of the LUT correction module is to correct the edge values according to the position of the cells in the delay chain.
For example, for the first sub-delay chain, the correction formula will be as follows:
where $N_i$ is the number of the $n$th cell of the delay chain, which is equal to $n$, $n_1$ and $n_2$ are the numbers of the thermometric cells, $E_i$ is the current edge value, and $E'_i$ is the edge value after correction. Each cell has a rising and a falling edge value, so there are twice as many edge values as cells, i.e., $i$ is equal to $2n$ or $2n - 1$. The effect of temperature on the delay chain treats each cell as a minimum element, so the same correction parameters are used to correct both types of edges for the same cell. The other three sub-delay chains have a similar correction process to the first sub-delay chain. Each cell is corrected using the same correction parameters for that cell position, e.g., the edge values of the last cell of the second sub-delay chain are corrected using the same correction parameters as the last cell of the first sub-delay chain.
Because of the cross-clock cycle phenomenon, each edge value must be checked after it has been corrected by Formula (3). If the corrected edge value is larger than the system clock cycle, the real edge value is equal to the corrected value minus the system clock cycle. If the corrected edge value is smaller than 0, the real edge value is equal to the corrected value plus the system clock cycle. The cross-clock cycle phenomenon is discussed further in Section 3.5.
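A position-dependent correction followed by the cross-clock cycle check can be sketched as follows. The sketch assumes, purely for illustration, that the correction of a cell is linearly interpolated between the edge variations measured at the two thermometric cells; this interpolation is an assumption and not necessarily the exact form of Formula (3). The function name and all parameter names are hypothetical.

```python
def correct_edge(edge_ps, n, n1, n2, delta1_ps, delta2_ps, period_ps=2500.0):
    """Correct one edge value of cell n and wrap it into one clock cycle.

    n1, n2:               numbers of the two thermometric cells
    delta1_ps, delta2_ps: edge variations measured at those two cells
    The linear interpolation below is an assumed correction rule.
    """
    # Cells further back in the chain receive a larger correction.
    shift = delta1_ps + (n - n1) * (delta2_ps - delta1_ps) / (n2 - n1)
    corrected = edge_ps + shift

    # Cross-clock cycle check described in the text.
    if corrected > period_ps:
        corrected -= period_ps
    elif corrected < 0:
        corrected += period_ps
    return corrected
```

For example, with hypothetical thermometric cells 1 and 64 and variations of 0 ps and 211 ps, an edge of the 64th cell at 2400 ps is shifted to 2611 ps and then wrapped to 111 ps. Both the rising and the falling edge of a cell would be passed through the same call with the same `n`, in line with the use of one set of correction parameters per cell.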
In addition, to avoid repeatedly triggering temperature corrections at the same temperature, the original edge value used in Formula (1) should be updated by reading the corrected LUT after each LUT correction.
3.5. LUT Reorder Module
After the initialization clock sorting operation has been executed and the edge values of the clock signal corresponding to each cell have been obtained, the edge values are sorted and stored with their cell numbers in the LUT. Once the length of the delay chain reaches a certain value, the delay applied to the input clock signal is large enough to cause a cross-clock cycle phenomenon. Therefore, there may be multiple cells with similar phases in the delay chain, as shown in Figure 9.
As shown in Figure 9a, in its original state, the rising edge value of CN is slightly smaller than that of C2. Therefore, the rising edge value of CN must be placed before the rising edge value of C2 during the sorting process, before being stored in the LUT. When the temperature rises, the edge increment of the cells at the back end of the delay chain must be larger than that of the cells at the front end due to the cumulative effect of the delay. Therefore, after edge correction, the rising edge value of CN will be larger than the rising edge value of C2. In contrast, as shown in Figure 9b, if the edge value is very close to the clock cycle, the cross-clock cycle phenomenon will occur as the temperature rises. (The situation when the temperature drops is similar and is not described here.) As a result, the order of the edge values in the original LUT is disturbed, and a LUT reorder module is required.
Sorting on an FPGA is more complicated than sorting in software, because the use of arrays consumes substantial logic resources. Additionally, the design platform may stall during the analysis and synthesis process due to the use of a large array and a loop structure.
Therefore, pairwise comparison is adopted to construct the reorder module. Two adjacent edge values are read from the LUT at a time and compared, and the larger one is retained. This value is then compared with the largest value retained so far, and the larger of the two is kept as the candidate for the biggest edge value of the current round. The candidate is updated in this way after every pairwise comparison. When all the edge values in the LUT have been compared, the candidate (the biggest edge value of the current round) is placed in the storage address of the last read data value. To prevent data loss, the data originally at this address is stored in the address previously occupied by the biggest edge value. In summary, each round of comparison identifies the biggest edge value for that round and stores that value and its cell number at the end of the LUT region being sorted, while the edge value and cell number previously stored at this position are moved to the original address of the biggest edge value.
To illustrate this process clearly, information on the first 10 edges was extracted from a real LUT so that the comparison process could be explained in detail. The edge values are given in picoseconds.
The first round of comparison is shown in Table 1. The cell number 346 and its edge value 92 were retained in the 10th memory address. The edge values stored in the first 9 memory addresses would be sorted by the second round of comparison. The updated LUT for the second round of comparison is shown in Table 2.
For the second round of comparison, the even memory address (10th) data was missing for the fifth data comparison. Therefore, the edge value of the last even memory address (8th) was read instead and compared with the odd memory address (9th). After the second round of comparison, the cell number 259 and its edge value 53, which were stored in the 4th memory address, were used to replace the cell number 291 and its edge value 36 stored in the 9th memory address. The third round of sorting is shown in Table 3, which would sort the edge values stored in the first 8 memory addresses.
With this process, a memory location that has received its final value is not compared again in later rounds. If there are m edge values, then m − 1 rounds of comparison are required. Once the reorder process is completed, the overall temperature correction process is finished, and the TDC switches back to the normal operation mode.
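The reorder procedure described in this section behaves like a selection sort in which each round moves the biggest remaining edge value to the end of the unsorted region. The following software sketch illustrates that behavior under stated assumptions; it is not the FPGA design. The function name `reorder_lut`, the zero-based addresses, and the example data are hypothetical, and an unpaired last entry is simply compared with itself rather than with the previous address.

```python
def reorder_lut(lut):
    """Sort a list of (cell_number, edge_value_ps) in place by edge value.

    Returns the number of comparison rounds that were performed.
    """
    rounds = 0
    # 'end' is the last address of the region that is still unsorted.
    for end in range(len(lut) - 1, 0, -1):
        best = 0  # address of the candidate biggest edge value
        # Read the entries two at a time, as in the pairwise comparison.
        for addr in range(0, end + 1, 2):
            partner = min(addr + 1, end)  # unpaired entry: compare with itself
            local = addr if lut[addr][1] >= lut[partner][1] else partner
            if lut[local][1] > lut[best][1]:
                best = local
        # Swap so that no entry is lost: the biggest value goes to 'end',
        # and the entry that was at 'end' takes its former address.
        lut[best], lut[end] = lut[end], lut[best]
        rounds += 1
    return rounds
```

Applied to six hypothetical entries, the sketch needs five rounds, in line with the m − 1 rounds stated above, and leaves the cell numbers attached to their edge values throughout.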