A Scaling Law for SPAD Pixel Miniaturization

The growing demand for compact and high-definition single-photon avalanche diode (SPAD) arrays has motivated researchers to explore pixel miniaturization techniques to achieve sub-10 μm pixels. The scaling of the SPAD pixel size has an impact on key performance metrics, and it is therefore critical to conduct a systematic analysis of the underlying tradeoffs in miniaturized SPADs. On the basis of the general assumptions and constraints for layout geometry, we performed an analytical formulation of the scaling laws for the key metrics, such as the fill factor (FF), photon detection probability (PDP), dark count rate (DCR), correlated noise, and power consumption. Numerical calculations for various parameter sets indicated that some of the metrics, such as the DCR and power consumption, were improved by pixel miniaturization, whereas other metrics, such as the FF and PDP, were degraded. Comparison of the theoretically estimated scaling trends with previously published experimental results suggests that the scaling law analysis is in good agreement with practical SPAD devices. Our scaling law analysis could provide a useful tool to conduct a detailed performance comparison between various process, device, and layout configurations, which is essential for pushing the limit of SPAD pixel miniaturization toward sub-2 μm-pitch SPADs.


Introduction
Single-photon avalanche diodes (SPADs) have been widely recognized for having unique features, such as single-photon sensitivity and picosecond timing resolution. In recent decades, SPAD arrays fabricated with the silicon-based complementary metal-oxide-semiconductor (CMOS) process have been extensively studied for a number of scientific and industrial applications. To explore the emerging applications of SPAD image sensors, researchers have developed large-scale SPAD pixel arrays in compact sensor formats. Continuous efforts in SPAD research and development have led to the exponential growth of the array size and dramatic shrinkage of the pixel dimension; the SPAD array size has reached a milestone of 1 megapixel [1], while a SPAD pixel with a 2.2 µm pitch was reported in test devices [2].
In addition, 3D-stacking approaches have enabled the physical isolation of pixel circuits from the SPAD array while ensuring electrical connection via pixel-level bonding, which provides a promising solution for pixel miniaturization below 10 µm [3][4][5][6]. Such aggressive miniaturization and scaling of SPAD pixels could have a major influence on the key performance of SPADs, and it is therefore critical for designers to understand the fundamental tradeoffs in miniaturized SPADs. Theoretical studies on SPAD performance have been widely performed based on both analytical methods and simulations to describe process, voltage, and temperature dependence [7][8][9]. However, few attempts have been made to systematically analyze the impact of SPAD pixel scaling on major performance metrics, such as the fill factor (FF), photon detection probability (PDP), photon detection efficiency (PDE), dark count rate (DCR), correlated noise, and power consumption.
In this paper, we present an in-depth study of scaling laws in SPADs to clarify the underlying tradeoffs in SPAD design and to give perspectives for the future design of multi-megapixel SPAD arrays. The formulation of scaling laws with regard to the pixel size is performed based on the assumption that the pixel circuit can be located outside of the SPAD array and does not impact the pixel layout. Based on the introduced equations, the scaling behavior of the SPAD performance is exemplified in the plots. The scaling law equations are then compared with previously published experimental results with various SPAD sizes. A good agreement between the theoretical fitting and experimental data validates the scaling law analysis.
The paper comprises four sections. Section 2 presents the theoretical formulation of scaling laws for various performance metrics in the SPAD pixels. Some examples of analyzing experimental data with the theoretical expressions are demonstrated in Section 3, and this is followed by discussion in Section 4.

Analysis Criteria
To proceed with the theoretical analysis of the scaling laws, some assumptions must be made. First, the SPAD pixel array configuration is assumed to be a square grid, while it is not difficult to generalize the discussion to other configurations, e.g., honeycomb structure [10]. Second, circular-shaped SPADs are assumed to simplify the discussion on the curvature change with scaling. In some prior works, rounded-corner rectangle or square SPADs are also adopted to improve the fill factor [11][12][13]. However, these designs are not always suitable for scaling with the geometric similarity preserved, where the electric field concentration at the corners can induce premature edge breakdown and also change the breakdown voltage. Third, a 3D-stacked configuration with a SPAD-only array in a single plane is assumed.
In a non-3D-stacked FSI or BSI configuration, SPAD and pixel circuits coexist in the same plane. In a given pixel pitch, the SPAD and circuit have to share the limited area, and the circuit complexity can affect the size of the SPAD active area and its performance. The main focus of this analysis is to formulate the scaling laws of SPAD performance, and, hence, the SPAD array without circuit components is desired for more systematic and quantitative analysis. Fourth, the active-to-active distance is assumed to be fixed at a certain dimension, irrespective of the scaling parameter. This is justified by the following discussion.
For analysis of the scaling laws in the SPAD pixel, it is natural to assume that the doping profile along the z-axis for each implantation layer is unchanged, and the breakdown voltage of the p-n junction in the SPAD remains in the same range. This implies that, unlike scaling in MOS transistors, where a lower supply voltage is adopted for the smaller devices, the power supply voltage for the SPAD does not scale as a function of its dimensions. Another premise in SPAD pixel design is that the guard-ring width is sufficiently large to avoid premature edge breakdown. Given the fact that the lateral diffusion length of doped ions cannot readily be controlled, the electrostatic potential distribution around the guard ring is not dependent on the active diameter. The optimum guard-ring width ensuring no edge breakdown in the operating condition is defined by the following equation [14]:
$$V_B^{gr}(W_{gr}) > V_B^{p\text{-}n} + V_{ex}^{max}$$
where $V_B^{gr}(W_{gr})$ is the breakdown voltage at the guard ring with the given guard-ring width $W_{gr}$, $V_B^{p\text{-}n}$ is the breakdown voltage at the vertical p-n junction, and $V_{ex}^{max}$ is the maximum excess bias used in the system. Based on the discussion above, none of the terms in the above equation depends on the pixel size, and the optimum $W_{gr}$ can be defined regardless of scaling. These considerations impose a constraint on the pixel scaling: the guard-ring width has to be unscaled and fixed at a certain value over all the SPAD pixel dimensions to guarantee stable Geiger-mode operation without unwanted edge breakdown. The optimum $W_{gr}$ should be comparable to the depletion width of the main SPAD p-n junction, and is typically 1 to 2 µm [15]. In addition, the optimum width of an isolation layer, typically formed with deep-well implantation, is determined by a process design rule for the minimum drawing width, and should not be scaled with the pixel dimensions.
The pixel pitch $L_p$, which will be employed as a scaling parameter in the following discussion, can be expressed as:
$$L_p = D_a + L_{a-a} = D_a + 2W_{gr} + W_{iso}$$
where the well-sharing configuration is assumed, and $D_a$ is the active diameter, $L_{a-a}$ is the active-to-active distance, and $W_{iso}$ is the isolation width. In the following discussion, $W_{gr}$ and $W_{iso}$ are both assumed to be 1 µm unless otherwise noted, and $L_p$ is assumed to be solely dependent on $D_a$. Figure 1 shows the conceptual views depicting the SPAD pixel scaling. Figure 1a is an example of a top-view layout for a 2 × 2 pixel array. As discussed above, the active-to-active distance $L_{a-a}$ is fixed when shrinking the pixel pitch $L_p$. As a result, the active diameter $D_a$ decreases linearly with $L_p$, as $D_a = L_p - L_{a-a}$. This assumption can be applied to any type of existing SPAD device structure [16,17]. For example, Figure 1b shows the cross-sectional view of a p+/NW SPAD. $D_a$ is defined as the diameter of the inner circle of the guard-ring p-well, whereas $L_{a-a}$ corresponds to the sum of the NW separation width and twice the width of the p-well guard ring. For a PW/deep-NW SPAD or p-i-n SPAD, $D_a$ equals the diameter of the p-well, and $L_{a-a}$ is the sum of the NW separation width and twice the width of the virtual p-epi guard ring. This indicates that the scaling law analysis can be performed with only three key dimensional parameters, $L_p$, $D_a$, and $L_{a-a}$, without losing generality.
In summary, the main assumptions for the analysis of scaling laws are:
• a uniform square grid,
• a circular shape for the active area and inner/outer borders of the guard ring,
• a 3D-stacked configuration with full separation of the SPAD and pixel circuit into different wafers,
• an active-to-active distance unscaled with the SPAD pixel dimension, and
• the pixel pitch $L_p$ employed as a scaling parameter.

Fill Factor
The FF in the SPAD pixel, defined as the ratio between the drawn active area and the pixel area, is one of the fundamental parameters determining the single-photon sensitivity. FF is a purely geometric parameter, and is straightforward to formulate as a function of the pixel pitch $L_p$:
$$\mathrm{FF} = \frac{\pi (D_a/2)^2}{L_p^2} \times 100 = \frac{\pi}{4}\left(1 - \frac{L_{a-a}}{L_p}\right)^2 \times 100\ [\%]$$
It is clear from the above equation that FF goes down to zero when $L_p = L_{a-a}$ and cannot be defined for $L_p < L_{a-a}$. For sufficiently large $L_p$, FF converges to π/4 × 100 = 78.5%. Figure 2 shows the calculated FF as a function of the pixel pitch for several different active-to-active distances. The FF curves increase monotonically with the pixel pitch $L_p$. A relatively steep increase of FF is observed at smaller $L_p$, whereas FF saturates at larger $L_p$. Slower saturation for larger $L_{a-a}$ indicates that, if the active-to-active distance is large, a larger pixel pitch is required to obtain a higher FF, e.g., above 50%. In the actual sensor design, the effective FF can be enhanced by employing on-chip microlenses [18,19], although designers should bear in mind that microlenses are less effective for smaller f-numbers of the main objective lens.
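The geometric relation in this section can be evaluated with a few lines of Python. The sketch below assumes a circular active area of diameter $D_a = L_p - L_{a-a}$ centered in a square pixel; the function names and the default $L_{a-a}$ = 3 µm are illustrative choices, not part of the original analysis.

```python
import math

def fill_factor_percent(pitch_um, active_to_active_um=3.0):
    """Geometric fill factor (%) of a circular active area in a square pixel."""
    if pitch_um < active_to_active_um:
        raise ValueError("fill factor is undefined for L_p < L_a-a")
    active_diameter_um = pitch_um - active_to_active_um
    return math.pi / 4.0 * (active_diameter_um / pitch_um) ** 2 * 100.0

def pitch_for_fill_factor_um(target_percent, active_to_active_um=3.0):
    """Invert the scaling law: pitch at which the fill factor hits the target."""
    limit_percent = math.pi / 4.0 * 100.0  # asymptotic fill factor, about 78.5%
    if not 0.0 <= target_percent < limit_percent:
        raise ValueError("target must be non-negative and below pi/4 * 100")
    ratio = math.sqrt(target_percent / limit_percent)  # equals 1 - L_a-a / L_p
    return active_to_active_um / (1.0 - ratio)

if __name__ == "__main__":
    for pitch in (3.0, 6.0, 10.0, 25.0):
        print(f"L_p = {pitch:5.1f} um -> FF = {fill_factor_percent(pitch):5.1f} %")
    print(f"Pitch for FF = 50 %: {pitch_for_fill_factor_um(50.0):.1f} um")
```

Under these assumptions, an active-to-active distance of 3 µm requires a pitch of roughly 15 µm to reach a 50% fill factor, which illustrates how quickly FF degrades in the sub-10 µm regime.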

PDP and PDE
The PDP in SPAD pixels is defined by the following equation [20]:
$$\mathrm{PDP} = \mathrm{QE} \times P_{ava}$$
where QE is the quantum efficiency and $P_{ava}$ is the avalanche triggering probability. In ideal SPAD devices, PDP represents the single-photon sensitivity normalized by the active area, and it does not scale with the active diameter and the pixel pitch. In practice, a discrepancy between the "drawn" active area and "effective" active area leads to a considerable dependence of PDP on the scaling parameter $L_p$ [21]. The discrepancy between the designed and actual active size stems from two possible sources: nonideality in the fabrication process and nonideality in the device design. One example of the process nonideality is the lateral diffusion of doped ions [22]. The lateral diffusion length is determined by the type of dopant ions, implantation energy, and thermal annealing conditions and is typically on the order of 0.1 to 1 µm for deep well implantation. This lateral diffusion decreases the doping concentration at the edge of the active area. The electric field at the edge of the active area can be locally reduced with respect to the electric field at the center of the active area, thus lowering the sensitivity at the border of the active area.
On the other hand, the device design nonideality is caused by a lateral electric field near the guard ring. Photocharges generated in the neutral region of the SPAD randomly move around due to thermal diffusion until they reach the nearby depletion region and are drifted to an electrode. If the photocharges reach the main p-n junction with a high electric field, they induce avalanche multiplication, thereby generating a photon detection signal. However, photocharges close to the border of the active area can reach the depletion region toward the guard ring before reaching the main junction. In such a case, the carriers do not cause avalanche multiplication, and no photon detection signal is observed. This so-called "border effect" [23,24] causes the photon detection loss at the edge of active area, which becomes more significant in the smaller pixels.
For both process- and device-originated nonidealities, PDP correction can be performed by introducing an inactive radius $r_{in}$, representing the effective width of the photon-insensitive region at the edge of the active region [25]. The corrected equation for the scaling law of PDP is given by:
$$\mathrm{PDP} = \mathrm{PDP}_{max} \times \frac{(D_a - 2r_{in})^2}{D_a^2} = \mathrm{PDP}_{max} \times \left(\frac{L_p - L_{a-a} - 2r_{in}}{L_p - L_{a-a}}\right)^2$$
where $\mathrm{PDP}_{max}$ is the virtual maximum PDP with a sufficiently large active size. Figure 3 shows the calculated PDP as a function of the pixel pitch for different $r_{in}$. The curve with $r_{in}$ = 0 µm, corresponding to the ideal case with no border effect, shows no dependence on $L_p$. For finite $r_{in}$, PDP starts from zero at $L_p = L_{a-a} + 2r_{in}$ and grows and saturates to $\mathrm{PDP}_{max}$ with increasing $L_p$. Similar to the scaling law for the FF, a slower increase is observed for the larger $r_{in}$.
Figure 3. The calculated PDP as a function of the SPAD pixel pitch $L_p$ for $\mathrm{PDP}_{max}$ = 50%, active-to-active distance $L_{a-a}$ = 3 µm, and inactive radius $r_{in}$ = 0, 0.25, 0.5, and 1 µm.
PDE is another indicator of single-photon sensitivity. Unlike PDP, where the sensitivity is normalized by the active area, PDE is defined as the single-photon sensitivity normalized to the pixel area. The following equation holds [20]:
$$\mathrm{PDE} = \mathrm{PDP} \times \mathrm{FF}$$
Based on the previous equations, PDE can be explicitly formulated as:
$$\mathrm{PDE} = \mathrm{PDP}_{max} \times \frac{\pi}{4}\left(\frac{L_p - L_{a-a} - 2r_{in}}{L_p}\right)^2$$
Figure 4 shows the calculated PDE as a function of $L_p$ for different $r_{in}$. Similar to FF and PDP, the curves start from zero at smaller $L_p$ and saturate at larger $L_p$. The maximum PDE is given by $\mathrm{PDP}_{max}$ × 78.5% = 39.3%, assuming $\mathrm{PDP}_{max}$ = 50%. Again, introducing on-chip microlenses will potentially increase the overall PDE.
Figure 4. The calculated PDE as a function of the SPAD pixel pitch $L_p$ for $\mathrm{PDP}_{max}$ = 50%, active-to-active distance $L_{a-a}$ = 3 µm, and inactive radius $r_{in}$ = 0, 0.25, 0.5, and 1 µm.
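As a rough numerical illustration of the border-effect correction, the following sketch evaluates PDP and PDE versus pitch. It assumes the effective active diameter is the drawn diameter minus $2r_{in}$ and that PDE equals PDP times the geometric fill factor expressed as a fraction; the function names and default parameters (taken from the figure captions) are illustrative only.

```python
import math

def pdp_percent(pitch_um, pdp_max_percent=50.0,
                active_to_active_um=3.0, inactive_radius_um=0.5):
    """PDP (%) referred to the drawn active area, including the border effect."""
    drawn_diameter_um = pitch_um - active_to_active_um
    effective_diameter_um = drawn_diameter_um - 2.0 * inactive_radius_um
    if effective_diameter_um <= 0.0:
        return 0.0  # no photon-sensitive area is left at or below the cutoff pitch
    return pdp_max_percent * (effective_diameter_um / drawn_diameter_um) ** 2

def pde_percent(pitch_um, pdp_max_percent=50.0,
                active_to_active_um=3.0, inactive_radius_um=0.5):
    """PDE (%) referred to the full pixel area: PDP times fill factor."""
    effective_diameter_um = (pitch_um - active_to_active_um
                             - 2.0 * inactive_radius_um)
    if effective_diameter_um <= 0.0:
        return 0.0
    return (pdp_max_percent * math.pi / 4.0
            * (effective_diameter_um / pitch_um) ** 2)

if __name__ == "__main__":
    for pitch in (4.0, 6.0, 10.0, 25.0):
        print(f"L_p = {pitch:5.1f} um -> PDP = {pdp_percent(pitch):5.1f} %, "
              f"PDE = {pde_percent(pitch):5.1f} %")
```

Both functions return zero at the cutoff pitch $L_p = L_{a-a} + 2r_{in}$ and approach their asymptotes (50% and about 39.3% for the default parameters) as the pitch grows.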

DCR
DCR has several different causes, such as band-to-band tunneling, trap-assisted tunneling, trap-assisted thermal generation, and the diffusion current [26,27]. Experimentally, the source of the DCR can be classified based on an Arrhenius plot [28][29][30]. In silicon SPADs, the activation energies $E_a$ for band-to-band tunneling, trap-assisted tunneling, trap-assisted thermal generation, and the diffusion current are known to be approximately 0, 0-0.55, 0.55, and 1.1 eV, respectively. In practice, the measured $E_a$ can have intermediate values, e.g., 0.8 eV, indicating a mixture of multiple DCR components.
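The Arrhenius-based classification can be sketched as follows. This example assumes a simple Arrhenius law, DCR ∝ exp(−$E_a$/kT), with a temperature-independent prefactor, which is a simplification not stated in the text; the function name and the input values in the usage section are hypothetical and are not measured data.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def activation_energy_ev(dcr_low_cps, temp_low_k, dcr_high_cps, temp_high_k):
    """Estimate E_a (eV) from DCR values measured at two temperatures.

    Assumes DCR = const * exp(-E_a / (k * T)), so that
    ln(DCR_high / DCR_low) = (E_a / k) * (1 / T_low - 1 / T_high).
    """
    if dcr_low_cps <= 0.0 or dcr_high_cps <= 0.0:
        raise ValueError("DCR values must be positive")
    if temp_low_k <= 0.0 or temp_high_k <= 0.0 or temp_low_k == temp_high_k:
        raise ValueError("temperatures must be positive and distinct")
    slope = math.log(dcr_high_cps / dcr_low_cps)
    return BOLTZMANN_EV_PER_K * slope / (1.0 / temp_low_k - 1.0 / temp_high_k)

if __name__ == "__main__":
    # Hypothetical readings: DCR rising from 100 cps at 293 K to 1000 cps at 323 K.
    e_a = activation_energy_ev(100.0, 293.0, 1000.0, 323.0)
    print(f"Estimated activation energy: {e_a:.2f} eV")
```

In practice, a fit over many temperature points is preferable to a two-point estimate, and an intermediate result would point to a mixture of DCR mechanisms as described above.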
Based on the assumption that premature edge breakdown is suppressed, the tunneling components at the edge of the active region can be neglected. Contributions of the thermal generation and diffusion current are also negligible in the depletion region to the guard ring due to an insufficient electric field for avalanche triggering by the generated carriers. Therefore, the contribution from the main p-n junction of the SPAD dominates over that from the edge of the active region. Interestingly, all the aforementioned DCR components are proportional to the "effective" active area.
The tunneling current, regardless of being band-to-band or trap-assisted, is proportional to the total volume of the region with a highly concentrated electric field, which is clearly proportional to the active area. Thermal generation and diffusion carriers are detected only when those carriers are generated in the vicinity of the active region. Assuming that thermal generation and the diffusion current are spatially uniform around the active region, those components are also naturally assumed to be proportional to the active area. The scaling law for DCR can be formulated as follows:
$$\mathrm{DCR} = R_0 \times \frac{\pi}{4}\left(L_p - L_{a-a} - 2r_{in}\right)^2$$
where $R_0$ is the DCR per unit of active area. Figure 5 shows the calculated DCR as a function of $L_p$ for different values of $R_0$. Starting from 0 cps at $L_p = L_{a-a} + 2r_{in}$, the DCR shows a parabolic increase with $L_p$. The DCR is highly dependent on $R_0$, which is a function of the excess bias, temperature, and process quality, such as the trap and impurity densities. In contrast to the FF, PDP, and PDE, a smaller pixel pitch is desirable to improve DCR performance. The designer should weigh the tradeoff between PDE and DCR to find the optimum $L_p$ that provides a reasonable S/N ratio. The DCR density $R$ is defined as the DCR normalized by the drawn active area and is often used for comparison of the SPAD process quality between devices fabricated in different processes [31]. As with PDP, nonideality, such as the border effect, leads to the dependence of $R$ on $L_p$ as follows:
$$R = R_0 \times \left(\frac{L_p - L_{a-a} - 2r_{in}}{L_p - L_{a-a}}\right)^2$$
At larger $L_p$, the DCR density saturates to $R_0$. Figure 6 shows the $L_p$ dependence of the DCR density for various $R_0$. As can be seen from the similarity to the equation for PDP, the DCR density starts from zero at $L_p = L_{a-a} + 2r_{in}$ and rapidly increases and saturates for larger $L_p$. This implies that, in the actual measurement, the DCR density can be underestimated at the smaller pixel pitch due to the existence of the photon-insensitive region at the edge of the active region.
Figure 6. The calculated DCR density as a function of the SPAD pixel pitch $L_p$ for $R_0$ = 0.2, 0.5, 1, and 2 cps/µm², active-to-active distance $L_{a-a}$ = 3 µm, and inactive radius $r_{in}$ = 0.5 µm.
Note that the above discussion is based on the assumption that the guard-ring width is optimized to avoid edge breakdown for the entire range of the pixel pitch. In the actual device design, sometimes an abrupt increase of DCR and DCR density is observed at a smaller pixel pitch even with fixed active-to-active distance. To the best of our knowledge, no systematic analysis has been conducted for this phenomenon. One possible reason is the enhanced curvature at the edge of the active region inducing a high electric field near the guard ring. Analogously to antennas, the electric field tends to increase in regions of high curvature, which may induce premature edge breakdown when scaling down the pixel. Another possible explanation is the nonideality in the photoresist formation process.
In most SPAD devices, the diffusion regions for the p-n junction, guard ring, or isolation are formed by well doping where high energy doping is employed. In such a process, a thicker photoresist is desired to avoid penetration of the accelerated ions through the resist. The opening size of the photoresist for such a thick resist (typically 3 to 10 µm) requires careful calibration to match the actual shape and size to the designed layout. The layout for well doping is usually supported only for 0 or 90 degree lines, whereas a SPAD layout often involves a circular or ring shape with arbitrary angles. This could cause the deviation of the actual resist opening size from the design especially in the smaller pixel dimension, leading to unwanted edge breakdown.
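The DCR scaling discussed in this section, for the case where edge breakdown is fully suppressed, can be sketched numerically as below. The code assumes that the DCR is proportional to the effective active area (drawn diameter reduced by $2r_{in}$) and that the DCR density is referred to the drawn area; function names and default values are illustrative.

```python
import math

def dcr_cps(pitch_um, r0_cps_per_um2=1.0,
            active_to_active_um=3.0, inactive_radius_um=0.5):
    """Dark count rate (cps), proportional to the effective active area."""
    effective_diameter_um = max(
        pitch_um - active_to_active_um - 2.0 * inactive_radius_um, 0.0)
    return r0_cps_per_um2 * math.pi / 4.0 * effective_diameter_um ** 2

def dcr_density_cps_per_um2(pitch_um, r0_cps_per_um2=1.0,
                            active_to_active_um=3.0, inactive_radius_um=0.5):
    """DCR normalized by the drawn active area (cps/um^2)."""
    drawn_diameter_um = pitch_um - active_to_active_um
    if drawn_diameter_um <= 0.0:
        raise ValueError("drawn active area vanishes for L_p <= L_a-a")
    drawn_area_um2 = math.pi / 4.0 * drawn_diameter_um ** 2
    return dcr_cps(pitch_um, r0_cps_per_um2, active_to_active_um,
                   inactive_radius_um) / drawn_area_um2

if __name__ == "__main__":
    for pitch in (4.0, 6.0, 10.0, 25.0):
        print(f"L_p = {pitch:5.1f} um -> DCR = {dcr_cps(pitch):7.1f} cps, "
              f"density = {dcr_density_cps_per_um2(pitch):.3f} cps/um^2")
```

The printed density stays below $R_0$ at small pitch, which reproduces the underestimation of the DCR density noted above; the sketch does not model the abrupt DCR increase attributed to curvature or photoresist effects.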

Afterpulsing Probability
Correlated noise, such as afterpulsing and crosstalk, is critical for certain applications where the temporal and spatial correlations of photon detection signals play key roles [32][33][34]. Afterpulsing is caused by an avalanche-generated carrier captured at a deep trap state near the multiplication region, which is released by thermal activation or tunneling after a nanosecond to microsecond trapping time, thus inducing another avalanche multiplication event. This mechanism implies that the afterpulsing probability $P_a$ is dependent on the trap density $D_{trap}$ and the total number of avalanche-generated carriers $N_{ava}$. A higher trap density and more avalanche carriers result in a higher $P_a$. If $P_a$ is not overly large, e.g., smaller than 10%, a linear relation between $P_a$ and $D_{trap} \times N_{ava}$ can be assumed to a first-order approximation [35].
Assuming the spatially uniform distribution of the deep trap states, $D_{trap}$ is independent of the scaling parameter. $N_{ava}$, on the other hand, can be dependent on the scaling parameter. $N_{ava}$ is calculated based on the following:
$$N_{ava} = \frac{C_{par} V_{ex}}{e} = \frac{(C_{p-n} + C_0)\, V_{ex}}{e}$$
where $e$ is the elementary charge; $V_{ex}$ is the excess bias; $C_{par}$ is the total parasitic capacitance at the SPAD output node, either cathode or anode, which is connected to the quenching resistor; $C_{p-n}$ is the p-n junction capacitance at the active region; and $C_0$ is the sum of the other parasitic capacitance contributions from connected metal wires, diffusion regions, gates, etc. $C_{p-n}$ is proportional to the active area, whereas $C_0$ does not scale with the pixel size or the active size. In summary, the scaling law of $P_a$ is given by:
$$P_a = A \times \left(\frac{\varepsilon \pi (L_p - L_{a-a})^2}{4 W_{eff}} + C_0\right)$$
where $A$ is the temperature-, bias-, and process-dependent coefficient; $\varepsilon$ is the permittivity; and $W_{eff}$ is the effective depletion region width determined by the p-n junction doping profile. Figure 7 shows the $L_p$ dependence of the afterpulsing probability for various $W_{eff}$ and $C_0$ (dashed lines for $C_0$ = 5 fF, and solid lines for $C_0$ = 30 fF). For all parameter combinations, $P_a$ increases parabolically with an offset corresponding to $A \times C_0$. A larger $W_{eff}$ shows a weaker dependence of $P_a$ on $L_p$, indicating less contribution of the p-n junction capacitance to the total parasitic capacitance $C_{par}$. In any case, scaling down of the pixel has a positive impact on the afterpulsing probability due to the reduced parasitic capacitance.
Note that the dead time is assumed to be constant for all $L_p$ in this analysis. In a real device design, a fixed quenching resistance makes the dead time dependent on $L_p$. This secondary effect makes $P_a$ less sensitive to $L_p$ compared to the case where a constant dead time is assumed. If the dependence of $P_a$ on the dead time is strong enough to compensate for the trend shown in Figure 7, the increase of $P_a$ for larger $L_p$ may flatten or even reverse.
Figure 7. The calculated afterpulsing probability as a function of the SPAD pixel pitch $L_p$ for $C_0$ = 5 and 30 fF and $W_{eff}$ = 0.5, 1, and 2 µm.
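To give a feel for the magnitudes involved, the sketch below estimates the junction capacitance with a parallel-plate approximation, $C_{p-n} = \varepsilon \pi D_a^2 / (4 W_{eff})$, and reports the total capacitance that the afterpulsing probability is proportional to. The relative permittivity of silicon (11.7) and the function names are assumptions introduced here, and the coefficient $A$ is left out because it is process dependent.

```python
import math

VACUUM_PERMITTIVITY_F_PER_M = 8.8541878128e-12
SILICON_RELATIVE_PERMITTIVITY = 11.7  # assumed value for bulk silicon

def junction_capacitance_ff(pitch_um, active_to_active_um=3.0,
                            depletion_width_um=1.0):
    """Parallel-plate estimate of the p-n junction capacitance in fF."""
    active_diameter_m = max(pitch_um - active_to_active_um, 0.0) * 1e-6
    area_m2 = math.pi / 4.0 * active_diameter_m ** 2
    permittivity = VACUUM_PERMITTIVITY_F_PER_M * SILICON_RELATIVE_PERMITTIVITY
    capacitance_f = permittivity * area_m2 / (depletion_width_um * 1e-6)
    return capacitance_f * 1e15

def relative_afterpulsing_ff(pitch_um, parasitic_ff=5.0,
                             active_to_active_um=3.0, depletion_width_um=1.0):
    """P_a divided by the coefficient A, i.e. C_p-n + C_0, in fF."""
    return parasitic_ff + junction_capacitance_ff(
        pitch_um, active_to_active_um, depletion_width_um)

if __name__ == "__main__":
    for pitch in (3.0, 6.0, 10.0, 25.0):
        print(f"L_p = {pitch:5.1f} um -> "
              f"C_p-n = {junction_capacitance_ff(pitch):6.2f} fF, "
              f"P_a / A = {relative_afterpulsing_ff(pitch):6.2f} fF")
```

With these assumed values, a 10 µm pitch gives a junction capacitance of about 4 fF for $W_{eff}$ = 1 µm, comparable to the smaller $C_0$ case, so the fixed parasitic term dominates for miniaturized pixels.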

Crosstalk Probability
Crosstalk is another type of correlated noise in SPAD pixels. Unlike afterpulsing, where only a single pixel is involved, crosstalk involves two or more pixels. When avalanche multiplication is triggered in a pixel, thousands to millions of electrons and holes are generated. When those carriers are recombined with counterpart charges, either photons or phonons can be emitted to preserve the energy conservation law. Silicon is a material with an indirect bandgap, and hence the probability to emit photons is very low. For photon energy higher than the silicon bandgap, only several to tens of photons are emitted out of the one million avalanche-generated carriers [36,37]. However, those photons can move toward a neighboring pixel and be detected.
Similar to afterpulsing, the crosstalk probability $P_c$ is dependent on the number of avalanche-generated carriers $N_{ava}$. A larger number of carriers leads to a higher $P_c$. Again, to a first-order approximation, $P_c$ is considered to be proportional to $N_{ava}$. In addition, the distance between pixels is another important factor for scaling. Given that the emitted secondary photons decay exponentially with the travel length, a shorter pixel-to-pixel distance could result in higher crosstalk. The emitter-to-receiver distance dependence of crosstalk can be approximated by [38]: where $B$ is a coefficient that will be explained later, $r$ is the distance from one SPAD of interest to the other, and $\alpha$ is the effective decay constant (inverse decay length) of the emitted light. Regarding the crosstalk between two nearest-neighbor SPAD pixels, $r$ in the above equation corresponds to the pixel pitch $L_p$. Note that this equation implicitly assumes that the light emission occurs at the center of the active region of the emitting SPAD, and the average photon intensity reaching the active region of the receiver is approximated by the photon intensity at the center of the active region of the receiving SPAD. In reality, the finite size of the active region for both the emitter and receiver may cause a slight deviation of the measured crosstalk from the above model. For simplicity, the following analysis will be based on the above model, where the effect of the finite active size is neglected. The coefficient $B$ is dependent on both the emitter and receiver characteristics. Considering the emitter, $B$ should depend on the total number of emitted photons, which is proportional to $N_{ava}$. On the other hand, $B$ should also be correlated with the sensitivity of the receiver. The probability of detecting an emitted photon is proportional to the product of the PDP and the active area, which equals the PDE times $L_p^2$ by definition.
Thus, the crosstalk probability between two nearest-neighbor SPAD pixels can be expressed as: where B is an excess-bias dependent coefficient. Figure 8 shows the $L_p$ dependence of the calculated crosstalk probability for various $\alpha$ and $C_0$. All curves show increasing trends for $L_p$ close to $L_{a-a} + 2r_{in}$. For larger $L_p$, either increasing or decreasing trends are observed, depending on the parameter set. The curves with $\alpha$ = 0.2 and 0.1 µm⁻¹ and $C_0$ = 30 fF decrease toward zero, whereas the curve with $\alpha$ = 0.05 µm⁻¹ and $C_0$ = 5 fF shows a monotonic increase. Note that $\alpha$ = 0.2 µm⁻¹ and 0.05 µm⁻¹ correspond to the cases with effective light emission wavelengths of 700 and 850 nm, respectively. In contrast to afterpulsing, the crosstalk probability does not necessarily show a monotonic dependence on $L_p$; the impact of pixel miniaturization is highly dependent on the combination of model parameters.
To suppress crosstalk, several countermeasures can be considered. First, lowering $V_{ex}$ helps to suppress the crosstalk probability at the expense of the PDP and PDE. $V_{ex}$ affects both $N_{ava}$ in the emitter and the sensitivity of the receiver, and hence the crosstalk probability follows a square law with respect to $V_{ex}$. Second, the formation of opaque deep trench isolation (DTI) could suppress the crosstalk. Trench materials with a lower refractive index can reflect the emitted photons and eventually confine the photons in the emitter. This could lead to an order of magnitude improvement of the crosstalk probability.

Power Consumption
The avalanche-originated power consumption in large-scale SPAD arrays is a key parameter as it grows proportionally to the number of pixels. The total power consumption in a SPAD array depends on the incident photon flux. For a systematic comparison, the following discussion focuses on the energy consumption per single avalanche event, $E_{ava}$, in a single SPAD pixel. The power consumption at the readout circuits is not taken into account here. $E_{ava}$ is a product of $eN_{ava}$ and $(V_{ex} + V_B)$, expressed as follows:
$$E_{ava} = e N_{ava} (V_{ex} + V_B) = D \times \left(\frac{\varepsilon \pi (L_p - L_{a-a})^2}{4 W_{eff}} + C_0\right)$$
where $D = V_{ex} \times (V_{ex} + V_B)$ is the bias-dependent coefficient and $V_B$ is the breakdown voltage of the SPAD. Apart from the details of the coefficient, the equation has the same structure as that of the afterpulsing probability. Naturally, the calculated trend of the single-event energy consumption $E_{ava}$ as a function of $L_p$ in Figure 9 shows similarity to Figure 7.
Figure 9. The calculated single-event energy consumption $E_{ava}$ as a function of the SPAD pixel pitch $L_p$ for $W_{eff}$ = 0.5, 1, and 2 µm.
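The per-event energy can be checked with simple arithmetic. The sketch below assumes that the avalanche discharges the total node capacitance $C_{par}$ by the excess bias, so that $N_{ava} = C_{par} V_{ex} / e$ and $E_{ava} = C_{par} V_{ex} (V_{ex} + V_B)$; the bias values and capacitance in the usage section are hypothetical and chosen only for illustration.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19

def avalanche_carriers(total_capacitance_ff, excess_bias_v):
    """Number of carriers per avalanche, N_ava = C_par * V_ex / e."""
    return total_capacitance_ff * 1e-15 * excess_bias_v / ELEMENTARY_CHARGE_C

def avalanche_energy_pj(total_capacitance_ff, excess_bias_v, breakdown_v):
    """Energy per avalanche in pJ, E_ava = e * N_ava * (V_ex + V_B)."""
    charge_c = total_capacitance_ff * 1e-15 * excess_bias_v
    return charge_c * (excess_bias_v + breakdown_v) * 1e12

if __name__ == "__main__":
    # Hypothetical operating point: C_par = 10 fF, V_ex = 3 V, V_B = 20 V.
    carriers = avalanche_carriers(10.0, 3.0)
    energy = avalanche_energy_pj(10.0, 3.0, 20.0)
    print(f"N_ava = {carriers:.3g} carriers, E_ava = {energy:.2f} pJ")
```

For this hypothetical operating point the avalanche involves on the order of 10^5 carriers and dissipates a fraction of a picojoule, which is consistent with the "thousands to millions" of carriers mentioned for crosstalk.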

Timing Jitter
The timing jitter in the SPAD is determined by multiple factors, such as the device configuration, doping profile, detection threshold, excess bias, and temperature, and it is not straightforward to formulate the scaling law for this. Qualitatively, a larger pixel pitch produces a higher timing jitter for several reasons: first, the spatial expansion of the avalanche multiplication process takes more time for the larger $L_p$ due to the finite lateral avalanche propagation velocity [39]. Second, a larger $L_p$ leads to a slower rise of the output voltage due to the larger parasitic capacitance, resulting in enhanced statistical variability. Further systematic analysis should be conducted for a deeper understanding of timing jitter scaling.

Summary of Scaling Law Analysis
In the above sections, the scaling laws of the key SPAD characteristics with pixel dimensions were investigated. Miniaturization of the SPAD pixel improves the DCR, afterpulsing, power consumption, and timing jitter, whereas it has an adverse effect on the fill factor, PDP, and PDE. The equations for the scaling laws are summarized in Table 1. In particular, the degradation of the single-photon sensitivity is inevitable in the conventional SPAD pixel when its pitch becomes smaller than 10 µm. Further technological breakthroughs are required for SPAD pixel miniaturization toward multi-megapixel arrays.
Table 1. Summary of the scaling laws in the SPAD pixels with the pixel pitch $L_p$ as a scaling parameter. The coefficients are omitted in the equations.

Extraction of Model Parameters
To demonstrate the applicability of the scaling law analysis to practical situations, we performed a theoretical fitting with experimental results from the literature. Figure 10 shows experimental data from the literature [40] representing the pixel size dependence of the maximum PDP (shown as dots). Here, $L_{a-a}$ is assumed to be 8 µm. The experimental data were fitted using the scaling law equation for PDP (shown as a dashed line). The fitted curve shows a good agreement with the measurement when the fitting parameters are $\mathrm{PDP}_{max}$ = 22.8% and $r_{in}$ = 1.06 µm.
The extracted fitting parameters indicate that the maximum PDP reaches 22.8% for larger pixels with this device configuration and bias condition, whereas the effective photon-insensitive region with a width of 1.06 µm reduces the maximum PDP for smaller pixels. The fitting result implies that the PDP will go down to zero at $L_p = L_{a-a} + 2 \times r_{in}$ = 10.12 µm, and thus the pixel pitch with this SPAD device configuration cannot go below 10 µm unless the process conditions and design rules are modified. Note that $r_{in}$ is determined by the spatial distributions of both the electrostatic potential and photon absorption rate. Given that the latter distribution is a function of the wavelength of incident photons, $r_{in}$ can potentially be dependent on the wavelength of interest. In Figure 11, a similar analysis is performed for the measured DCR from the literature. Again, the fitting result shows good agreement with the measurement. The corresponding fitting parameters are extracted as $R_0$ = 8.50 cps/µm² and $r_{in}$ = 0.030 µm. An interesting implication is that the extracted $r_{in}$ for the DCR is different from that for the maximum PDP. This can be interpreted similarly to the previous remark on the wavelength dependence of $r_{in}$; the spatial distribution of the photon absorption rate for PDP can be different from that of the thermal generation rate for DCR, thereby resulting in a different inactive radius $r_{in}$. The extracted $r_{in}$ for the DCR could provide useful information to estimate the major source of the DCR.
Figure 11. The measured DCR trend from the literature fitted by the theoretical equation [40]. The measured and fitted data are shown as dots and a dashed line, respectively.
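The extrapolation from the fitted PDP parameters can be reproduced with a short calculation. The sketch below assumes the border-effect PDP model with the quoted values ($L_{a-a}$ = 8 µm, $r_{in}$ = 1.06 µm); the function names are illustrative, and the inversion for a target fraction of $\mathrm{PDP}_{max}$ is an additional step that is not part of the original analysis.

```python
import math

def cutoff_pitch_um(active_to_active_um, inactive_radius_um):
    """Pitch at which the effective active area, and hence the PDP, vanishes."""
    return active_to_active_um + 2.0 * inactive_radius_um

def pitch_for_pdp_fraction_um(fraction, active_to_active_um, inactive_radius_um):
    """Pitch at which PDP reaches a given fraction of PDP_max.

    Inverts PDP / PDP_max = ((D_a - 2 r_in) / D_a)^2 with D_a = L_p - L_a-a.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must satisfy 0 <= fraction < 1")
    drawn_diameter_um = 2.0 * inactive_radius_um / (1.0 - math.sqrt(fraction))
    return active_to_active_um + drawn_diameter_um

if __name__ == "__main__":
    l_aa_um, r_in_um = 8.0, 1.06  # values quoted for the PDP fit
    print(f"Cutoff pitch: {cutoff_pitch_um(l_aa_um, r_in_um):.2f} um")
    for fraction in (0.5, 0.9):
        pitch = pitch_for_pdp_fraction_um(fraction, l_aa_um, r_in_um)
        print(f"Pitch for {fraction:.0%} of PDP_max: {pitch:.1f} um")
```

Under this model, the fitted device would need a pitch of roughly 49 µm to reach 90% of its maximum PDP, which shows how strongly a 1 µm-scale inactive radius penalizes small pixels.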

Discussions
We investigated the theoretical expressions of the scaling laws for the major performance metrics in SPAD pixels. The analysis showed that SPAD pixel miniaturization improved the DCR, afterpulsing, power consumption, and timing jitter, whereas it had an adverse effect on the FF, PDP, and PDE. The scaling law equations for PDP and DCR were then applied to the experimental data in the literature, showing good agreement with the measured trends. The extracted fitting parameters were used to extrapolate the expected pixel size dependence of PDP and DCR, which implied that a pixel size smaller than 10 µm cannot be achieved without modification of the current process conditions and design rules.
Our scaling law analysis has three potential applications: the prediction of SPAD performance based on existing measurement data, extraction of model parameters to quantify the pixel-size-independent metrics, and systematic comparison of SPAD performance tradeoffs for different process, device, and layout configurations. The first approach can be useful for designers to understand the underlying tradeoffs and decide the optimal pixel size for applications of interest. The second approach can be critical to understanding the limiting factors of SPAD performance.
In-depth study of the extracted model parameters provides rich information of SPAD pixels, such as the inactive radius, parasitic capacitance, effective depletion width, and effective decay length of the avalanche-induced photons, which cannot be directly measured with existing measurement techniques. The third approach can be employed for clarifying the pros and cons of one SPAD device configuration to the other, which is essential for the correct choice of process conditions and device structure. Combining these approaches will provide a promising tool for further pushing the limit of SPAD pixel miniaturization toward sub-2 µm-pitch SPADs.
The extracted models are focused on pixel size dependence. Further generalization of the models to fully account for the voltage and temperature dependence of the metrics remains to be verified in future work.

Abbreviations
The following abbreviations are used in this manuscript: