Bayesian Methods for Inferring Missing Data in the BATSE Catalog of Short Gamma-Ray Bursts

: The knowledge of the redshifts of Short-duration Gamma-Ray Bursts (SGRBs) is essential for constraining their cosmic rates and thereby the rates of related astrophysical phenomena, particularly Gravitational Wave Radiation (GWR) events. Many of the events detected by gamma-ray observatories (e.g., BATSE, Fermi, and Swift) lack experimentally measured redshifts. To remedy this, we present and discuss a generic data-driven probabilistic modeling framework to infer the unknown redshifts of SGRBs in the BATSE catalog. We further explain how the proposed probabilistic modeling technique can be applied to newer catalogs of SGRBs and other astronomical surveys to infer the missing data in the catalogs.


Introduction
The discovery of the first Gravitational Wave Radiation (GWR) event in 2015 [1,2] and the first joint detection of gravitational and electromagnetic radiation from a binary neutron star merger in 2017 [3] have revolutionized multimessenger astronomy and resulted in the 2017 Physics Nobel Prize. These observations have given credibility to the hypothesis of binary neutron star (BNS) mergers as the primary progenitors of Short Gamma-Ray Bursts (SGRBs). Other channels of SGRB formation include neutron stars-black hole or binary black hole mergers. Unlike Long Gamma-Ray Bursts (LGRBs), which are believed to be due to the collapse of supermassive stars and are frequently associated with core-collapse supernovae [4], the light curves of SGRBs are comprised of short-hard intense gamma-ray pulses that generally last a few milliseconds to seconds.
A question of great interest in the field of GWR astronomy concerns the cosmic rates of GWRs and the fraction of detectable events with the current and planned GWR detectors [5].
LGRBs and SGRBs are among the major sources of GWR events. Therefore, knowledge of the cosmic rates of GRBs can yield tight constraints on the cosmic rates of GWR events. The challenge, however, lies in quantifying the population properties and the cosmic rates of GRBs. Unraveling the physics and the intrinsic properties of GRBs requires knowledge of their distances from Earth represented by the quantity known as the cosmological redshift (z). Despite significant progress made over the past two decades, most events detected by the existing gamma-ray observatories lack measured redshifts. For example, less than 4% of the entire catalog of >3000 GRBs detected by the Gamma-ray Burst Monitor (GBM) onboard the Fermi Gamma-ray Space Telescope have measured redshifts, and even fewer of those with measured redshifts belong to the class of SGRBs, a primary GWR formation channel. While there seems to exist a consensus on the range of the redshift distributions of LGRBs and SGRBs [6][7][8][9], the shapes of the distributions and the knowledge of the redshifts of individual events remain hotly debated [9] and elusive. Previous studies have already utilized the phenomenologically discovered prompt gamma-ray correlations to derive pseudo-redshifts for a certain fraction of events in the second-largest catalog of GRBs available to this date, homogeneously detected by the BATSE Large Area Detectors onboard the now-defunct Compton Gamma-Ray Observatory [10][11][12][13]. More recent studies have applied similar concepts and ideas to the problem of estimating the unknown redshifts of GRBs in the Fermi-GBM catalog [14,15].
These methods, however, can lead to highly biased estimates of the unknown redshifts of GRBs where the discovered high-energy correlations result from a small calibration sample. The calibration samples are typically the brightest detected GRBs whose redshift measurements have been feasible. Such small samples are often collected from multiple heterogeneous surveys and potentially neither represent the unobserved intrinsic cosmic population of GRBs nor the complete observed sample (with or without measured redshifts). More importantly, the potential effects of the detector threshold and sample incompleteness on the proposed phenomenological correlations are poorly understood. These unknown effects and systematic biases manifest themselves in predicted redshift values that are highly inconsistent with the estimates obtained from other independent methods; examples of which are well studied in the literature [16][17][18][19].
In this article, we propose and lay out the details of a data-driven approach to reconstructing the missing redshift and other information in GRB catalogs. The proposed methodology is generic, is applicable to any astronomical or other type of datasets, and opens the venue toward further quantification of the answers to some of the major questions in GRB research, including but not limited to: What are the cosmic rates of Short-and Long-duration GRBs? Can the observational properties of GRBs be used as cosmological standard candles? Is there a reliable indirect method of inferring the redshifts of GRBs in real time? Are there any deviations in the cosmic rates of LGRBs from the Star Formation Rate? To explain the proposed probabilistic framework, we particularly focus on the BATSE catalog of 565 SGRBs due to the simplicity of the BATSE triggering mechanism. Despite the extremely large uncertainties in the BATSE SGRBs' observational data, we show the feasibility of setting reasonable constraints on the individual redshifts of most BATSE SGRBs. Although we frequently refer to the Fermi-GBM and Swift-BAT catalogs of GRBs in discussions, we leave a complete treatment of these catalogs to future studies. This study is a continuation of a previous similar study of the BATSE LGRBs catalog [1].

Methodology
Our proposed approach to reconstructing the missing data in GRB catalogs comprises three major steps: Firstly, describe the probabilistic foundations of our proposed Bayesian approach to inferring missing data from the available observational catalog(s) combined with any prior knowledge from independent resources. Such techniques should naturally incorporate all sources of uncertainty into the analysis.
Secondly, we develop a detailed minimally biased model of the detection process of GRBs. Such careful mathematical modeling of gamma-ray photon detectors is crucial for an accurate study of the population properties of GRBs and the estimation of their cosmic rates.
Lastly, we introduce a probabilistic framework for calibrating, validating, and selecting the most plausible model for the population properties of the prompt gamma-ray emission of GRBs, which can be subsequently used to infer the unknown missing data in GRB catalogs, most importantly, the unknown redshifts of the individual GRBs. A more detailed explanation of this process is given in Sections 2.1-2.3.

Development of the Statistical Techniques
The essence of the proposed approach to estimating the missing data, specifically the unknown redshifts of GRBs, is summarized in Figure 1. For illustration, consider a toy model where the observed properties of one GRB class (e.g., SGRBs) are collectively represented by the single blue-colored lines in this figure. Each blue line represents one GRB event. If we knew the redshifts of these individual events, represented by the black lines in the middle subplot, we would be able to determine, with high precision, the intrinsic properties of all detected GRBs, represented by the red lines in the top subplot. The background shaded areas in the top and middle subplots represent the generating distributions of the corresponding lines in the foreground. Two approaches to inferring the cosmic rates of GRBs and individual redshifts can be taken, depending on the state of knowledge of individual GRB redshifts in a given GRB catalog.  In the presence of some GRBs with redshifts, as is the case with the Fermi Gamma-ray Burst Monitor (GBM) [20] and Swift catalogs [21], we can use this partial knowledge of individual redshifts to infer the overall cosmic rates of GRBs (the gray distribution in Figure 1), the intrinsic population distributions of GRBs (the red distribution), and the posterior predictive distributions of the unknown redshifts of individual catalog GRBs (the black lines). This approach is particularly feasible for LGRB catalogs with a nonnegligible number of events with measured redshifts. For example, there are currently >120 LGRBs with redshifts and >2350 LGRBs without redshifts in the Fermi catalog. Previous modeling attempts [8] based on only 120 Swift LGRBs with known redshifts and 205

Intrinsic GRB properties
LGRBs without redshifts have successfully shed more light on the intrinsic cosmic rates of LGRBs. Therefore, one expects the order-of-magnitude-larger sample size of the Fermi catalog to only lead to significantly more robust LGRB rate inferences and more stringent comparison with the existing models of Star Formation Rate (SFR).
LGRBs with missing redshifts can be readily incorporated into such analysis via Bayesian marginalization [22].

Modeling Scenario 2
In the absence of any redshift knowledge, as is the case with the BATSE catalog [1,23] and most SGRB catalogs, we can still adopt a multilevel Empirical Bayesian methodology to estimate both the redshifts and the intrinsic properties of individual GRBs probabilistically. This novel approach may resemble magic as it enables us to infer two unknowns (the red distribution and the black lines in the subplots of Figure 1) from a single known quantity (the blue-colored lines). However, the power of the method stems from our ability to include an independent prior knowledge of the overall redshift distribution of GRBs in the analysis. This prior knowledge is represented by the gray distribution in the middle subplot of the figure. It can be chosen to be any plausible intrinsic GRB cosmic rate scenario. For LGRBs, the prior could be set to the most recent SFR models in the literature [24,25], or the hotly debated LGRB cosmic rate models that deviate from the SFR at low redshifts [9,[26][27][28] or at high redshifts [8,29]. In the case of the Fermi catalog, the available 120 spectroscopically and photometrically measured redshifts can be compared with the corresponding predicted redshifts using this Empirical Bayesian approach. Finally, quantifying and tabulating the results of this comparison via simple linear correlation measures for different prior models enables us to identify the most plausible cosmic rate model for GRBs.

Modeling the Population and the Cosmic Rates of SGRBs
Except for a handful, most SGRBs in the BATSE, Fermi, Swift, and other SGRB catalogs lack redshifts. Therefore, the modeling approach of scenario 1 will lead to degenerate solutions for the rates of SGRBs due to data scarcity. The modeling approach of scenario 2 is nevertheless identically applicable to SGRBs. There is now strong observational and theoretical evidence for binary neutron star mergers as the primary progenitors of SGRBs. It is, therefore, reasonable to assume that the cosmic rate of SGRBs follows that of LGRBs or SFRs, convolved with appropriate merger delay time distributions. Population synthesis simulations [30][31][32][33][34] already provide theoretical estimates of merger delay time distributions independently of observational data. This is the only additional complexity in modeling the rates and the population properties of SGRBs compared to LGRBs.

Unbiasedness and Consistency Checks
It is reasonable to question the unbiasedness of the observed redshift distributions of various GRB catalogs used in scenario 1 compared to the underlying intrinsic GRB redshift distribution. While quantifying the potential unknown biases in the spectroscopic and photometric redshift measurements of GRBs is highly challenging, a simple consistency check can be employed to ensure the absence of bias in the observed redshift distribution of GRBs. This can be accomplished by performing the modeling of scenario 1 twice, first only for the sample of GRBs with redshift and the second time by including the entire catalog of GRBs with and without redshift. The absence of any significant difference in the results for the two datasets will be strong evidence against the presence of any significant biases in the observed redshift distribution of GRBs. Notably, independent studies of this matter based on the Swift catalog LGRBs have not led to any noticeable differences between the redshift distributions of LGRBs with and without redshifts [8,35]. Furthermore, the sound rules of probability theory require a quantitative consistency of the results obtained from the two independent modeling scenarios 1 and 2. Any significant deviations between the results from the two scenarios either imply potential parameter degeneracy of the model of scenario 1 or wrong choices of redshift priors in scenario 2. This purely probabilistic framework provides a robust self-consistent approach to model verification and validation in this data-driven framework. Although the handful of SGRBs with redshifts in GRB catalogs does not help infer their intrinsic cosmic rates, they provide a valuable benchmark against which the calibrated SGRB models could be validated.

Modeling of Data and the Gamma-Ray Detector Thresholds
The practical implementation of the proposed mathematical methodology in Section 2.1 can be qualitatively understood by introducing the four main attributes with which the prompt gamma-ray emissions of GRBs are characterized. These spectral and temporal prompt emission attributes are illustrated as an example GRB light-curve in Figure 2: (1) the bolometric peak energy flux (P bol ); (2) the bolometric fluence (i.e., the total observed energy, S bol ); (3) the observed time-integrated spectral peak energy (E p ); (4) the observed prompt gamma-ray duration (e.g., as defined by the 90% of the prompt-emission interval, T 90 ). These four observer-frame attributes are readily available for all BATSE and Fermi-GBM, and many of the Swift-BAT GRBs and can be mapped to the corresponding GRB rest-frame properties via bolometric isotropic peak luminosity: L iso = P bol × 4π × d L (z) 2 , total bolometric isotropic emission: E iso = S bol × 4π × d L (z) 2 /(z + 1), intrinsic time-integrated spectral peak energy: E pz = E p × (z + 1), intrinsic prompt-emission duration: where d L (z) represents the cosmological luminosity distance as a function of redshift z, which can be readily computed for a given choice of cosmology (e.g., ΛCDM). Taking the logarithm of both sides of all equations, we obtain a set of linear mappings from the rest-frame to the observer-frame properties of GRBs in the logarithmic space. Therefore, the observed distributions of the four GRB properties result from the convolution of the distributions of the corresponding rest-frame GRB properties with the distributions of the logarithms of the mathematical terms that are exactly determined by z (i.e., d L (z) and z + 1). The larger the variances of the redshift-related terms in these equations are (relative to the variances of the distributions of intrinsic GRB attributes), the more the observed properties of GRBs will be affected, more so by the redshift as opposed to the intrinsic properties. The joint 4-dimensional distribution of the observed properties of SGRBs and LGRBs in both BATSE and Fermi catalogs strongly resembles multivariate log-normal distributions severely censored by the gamma-ray detection thresholds [7,36,37]. Such an orderly distribution shape in the observer frame implies an even more orderly log-normal shape for the joint distribution of the intrinsic attributes of SGRBs and LGRBs in the rest frame. Therefore, we will consider the multivariate log-normal as our primary statistical model describing the intrinsic population distribution of the prompt-emission properties in both GRB classes. Nevertheless, it is a common practice in astronomy to model the extensive properties (e.g., energetics and luminosity) of celestial objects with power-law distributions [9]. Therefore, we could also consider scenarios where the distributions of GRB energetics (L iso and E iso ) are jointly modeled as power laws combined with log-normal distributions for the intensive GRB properties (E pz and T 90z ). This would be particularly feasible for modeling the population distribution of LGRBs for which ample data and redshift information are available. Notably, we use the isotropically computed energetics of GRBs in this model. In reality, these isotropic values must be corrected by the beaming factor measured for each GRB. This observational information is seldom available. Nevertheless, we do not expect the lack of beaming factor correction to have any significant effects on the accuracy of the final results since the available theoretical and observational evidence points to narrow widths for the distributions of the beaming angles of the jets within each class of GRB [38][39][40][41][42]. In other words, such beaming factor corrections to the GRB energetics, even if possible, would only effectively shift the entire GRB data linearly in the logarithmic space of L iso and E iso . That is, the resulting variations due to the beaming corrections would be orders of magnitude smaller than the intrinsic variability in the energetics of LGRBs and SGRBs. GRB detector threshold model. An accurate estimation of the cosmic rates of GRBs also requires incorporating a detailed minimally biased model of the detection threshold of GRB detectors [37,43,44]. By design, some GRB detectors are significantly more difficult to simulate than others. For example, the Swift Burst Alert Telescope is well-known for its immensely complex triggering algorithm. It comprises at least three separate detection mechanisms [45] that complement each other: 1.
The first type of trigger is for short time scales (4 ms to 64 ms). These are traditional triggers (single-background), for which about 25,000 combinations of time-energyfocal plane subregions are checked per second.

2.
The second type of trigger is similar to that of HETE detectors [46]: fits to multiple background regions are made to remove trends for time scales between 64 ms and 64 s. About 500 combinations for these triggering mechanisms are checked per second. For these rate triggers, false triggers and variable non-GRB sources are also rejected by requiring a new source to be present in an image.

3.
The third type of trigger works on longer time scales (minutes) and is based on routine images that are made of the field of view.
By contrast, the Fermi-GBM detection mechanism is relatively similar to that of its predecessor, BATSE LADs. The GBM can trigger upon detecting GRBs at several independent timescales from 16 ms to 8.192 s. A naive approach to modeling the Fermi-GBM triggering mechanism would be to use the sensitivity measurements of the detector determined in laboratory settings by the Fermi team. The modeling of BATSE and Swift catalogs, however, provides evidence against such an approach [7,8,37,44]. This is because the operational sensitivities of gamma-ray photon-counters tend to differ from the sensitivities measured in isolated laboratory environments. Additionally, GRB catalogs do not form a complete sample with respect to the detection thresholds of the relevant gamma-ray detectors. Such simple methods of detector threshold modeling can easily lead to biases in GRB rate estimates that show surprising deviations from the expected GRB rates [1,47].
As an alternative, the effects of the gamma-ray detectors can be accounted for by modeling the detection threshold as part of the modeling of GRB properties [7,37]. For example, to remove the strong dependence of the detection threshold of Fermi-GBM on the timescale used for the definition of a GRB's peak flux, we can define an effective timescale-free peak flux for all BATSE and Fermi GRBs. The existence of such an effective peak flux is noticeable in the plot of the ratio of GRB peak fluxes at different timescales as a function of their observed durations. Figure 3 illustrates the strong dependence of the peak flux ratios at 64 ms and 1024 ms timescales on the durations of GRBs measured by T 90 in the BATSE and Fermi catalogs. The two GRB classes are segregated via fuzzy classification methods [37,48,49] applied to E p and T 90 of the events in both catalogs [50]. We have already shown [7,37] the utility of this relationship in removing the effects of varying-timescale detector thresholds from the BATSE catalog. A similar procedure can be adopted for modeling the detection thresholds of Fermi-GBM and Swift-BAT gamma-ray detector thresholds. In the case of Fermi-GBM, the difference is minimal and amounts to only a wider range of triggering timescales (0.016 s to 8.192 s) compared to that of BATSE (64 ms to 1.024 s). Furthermore, a realistic modeling of the detection threshold requires modeling the inherent fluctuations in the background gamma-ray photon counts as a Poisson process [22,37], leading to a fuzzy detection threshold at any given triggering timescale. This approach properly includes all GRBs in any catalog, down to the faintest events.

Model Calibration, Validation, and Selection
Once we implement the mathematical modeling approach described in Section 2.1 and the detection threshold model as laid out in Section 2.2, we can combine them to obtain probabilistic models for the observed rates of LGRBs and SGRBs in the given catalog of interest. The two LGRB and SGRB world models collectively yield the probability of observing the entire GRB catalog for a given set of input parameters. The most plausible parameters can be obtained by maximizing the likelihood of observing the catalog.
The last challenge lies in the maximization of the multivariate likelihood function resulting from the probabilistic models. This is due to the complex dependencies of the photon-counting detection mechanism of gamma-ray detectors in a limited energy window at different timescales on the intrinsic peak luminosity (L iso ), hardness (E pz ), duration (T 90z ), and redshift (z) of each GRB. Notably, the BATSE LADs and Fermi-GBM have the nominal triggering energy window of 50-300 keV and trigger only on particular timescales, as noted previously. Furthermore, the intrinsic fuzziness of the detection threshold (due to background fluctuations) creates a nontrivial 5-dimensional fuzzy cut through the constructed probabilistic GRB world models. Thus, computing the probability of detection of GRBs for each input parameter set requires recomputing the probabilistic model's normalization factor. In other words, calculating the likelihood of each parameter set of the models requires solving a 5-dimensional integration in the space of GRB properties and the gamma-ray detector characteristics.
Previous similar studies with the BATSE catalog data indicate that each computation of this multidimensional integral takes on the order of 100-1000 milliseconds [1,7,22,37]. The incorporation of data uncertainties in catalogs that are larger than BATSE catalog (such as Fermi-GBM) into this analysis via the Bayesian methods that we have detailed above will likely increase this computational cost by 1-2 orders of magnitude. As such, sampling the posterior probability density of the parameters of the GRB world models requires parallel Monte Carlo sampling algorithms. Existing software, such as ParaDRAM and ParaNest algorithms of the ParaMonte library [51][52][53][54] or MultiNest [55], are capable of distributing multiple simultaneous calculations of objective functions across a few processors in parallel.
In the presence of multiple competing GRB world models, the Bayesian probability theory offers a natural method of comparing and ranking multiple competing probabilistic world models for GRBs. This can be achieved by computing the plausibility of each model according to the Bayes rule. Consider, for example, the Bayesian problem of selecting the best model from a set of m competing models M = M 1 , ..., M m capable of describing all available data D. For each model M i in the set, the posterior distribution of the parameters can be written as where π(·) denotes the probability density function and θ represents the set of unknown parameters of the model M i . One can integrate the Bayes rule (Equation (1)) over the entire parameter space Θ of the model and rearrange the equation to obtain Equation (2) provides a method of computing the denominator of the Bayes rule in Equation (1), which, by definition, is the likelihood of observing data D in a given model M i . It is sometimes called marginal likelihood since it is calculated by marginalizing the likelihood function over the entire parameter space of the model. However, more frequently, it is known as the Bayesian evidence or model evidence or simply the evidence. The utility of evidence goes beyond just serving as a normalizing constant in the Bayes rule, as it can be used for the calculation of Bayesian plausibility in yet another rewriting of the Bayes theorem, this time for the set of models M: Equation (3) gives the posterior probability density of the ith model M i in the set of all rival models M, given the prior probability knowledge π(M i ) about model M i being correct. It is called the Bayesian plausibility, since it provides a measure of the plausibility of model assumptions in the light of available data and prior knowledge about the model. In the case of complete prior ignorance about all competing models, the Jaynes principle of maximum entropy [56] dictates the assignment of uniform equal prior probabilities to each of the competing models.
Despite the mathematical simplicity of Equation (3), its computation is frequently a challenging task. Nevertheless, the computation of Bayesian plausibility can be greatly simplified via analytical approximations to Equation (3) that are valid only under certain assumptions and asymptotic behaviors, such as in the limit of large datasets or when the posterior distribution of the parameters of the models can be well-approximated by a multivariate normal distribution. In such cases, approximate methods such as the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) offer simple elegant solutions to model comparison [50,57]. In the special case of modeling GRB catalogs, however, the full numerical computation of the Bayesian plausibility for the competing models might be necessary. The aforementioned Monte Carlo sampling and integration tools can handle such intricate, computationally expensive numerical integrations.

Application to the BATSE Catalog of SGRBs
We now present an implementation of the Bayesian probabilistic approach to reconstructing the missing data (e.g., redshifts) of BATSE SGRBs.

Observational Data
We follow the same fuzzy clustering procedure applied to the BATSE catalog data as described in Shahmoradi and Nemiroff [7] to segregate SGRBs from LGRBs in the BATSE catalog. The resulting 565 SGRB sample that we obtain and use in this study is identical to that of Shahmoradi and Nemiroff [7]. Following Shahmoradi [37], Shahmoradi and Nemiroff [7], and Osborne et al. [1], we use the four observational prompt gamma-ray emission properties of SGRBs available in the BATSE catalog to constrain the population properties and the unknown redshifts of individual BATSE SGRBs (Figure 2).

Model Construction
The crucial step in modeling the population properties of BATSE SGRBs is to realize that one can use the existing prior knowledge about the overall cosmic redshift distribution of SGRBs to integrate over all possible redshifts for each observed SGRB in the BATSE catalog to infer a range of plausible values for the intrinsic properties of the corresponding SGRBs. These individually computed probability density functions (PDFs) of the intrinsic properties can be then used to infer the unknown parameters of the joint population distribution of the intrinsic properties of SGRBs.
Once the SGRB world model parameters are constrained, we can use the inferred population distribution of the intrinsic SGRB properties together with the observed properties to estimate the redshifts of individual BATSE SGRBs, independently of each other. The estimated redshifts can be again used to further constrain the intrinsic properties of SGRBs, which will then result in even tighter estimates for the individual redshifts of BATSE SGRBs. This recursive modeling can theoretically continue until convergence to a set of fixed individual redshift estimates occurs, although practical considerations frequently limit the procedure to only one cycle.
The lack of knowledge of the cosmic rate of SGRBs proves to be the largest source of uncertainty in SGRB population studies. At first glance, the above simple semi-Bayesian mathematical approach may sound like magic and perhaps, too good to be true. Sometimes it is. However, as explained in the previous sections, it can also lead to reasonably accurate results if some conditions regarding the problem and the observational dataset are satisfied.
Let D lc obs,i represent the ith SGRB event in the BATSE catalog, with the four main SGRB prompt emission properties, Here, the superscript lc is used to indicate that data are extracted exclusively from the light-curves of the events. These are essentially the values reported both in the BATSE catalog and in Shahmoradi and Nemiroff [58].
The entire BATSE dataset of 565 SGRB events attributes can then be represented as the collection of these events, The peak brightness, P bol , is included in our GRB world model as it, along with E p , determines the peak photon flux, P ph , in the 50-300 keV range (the BATSE nominal detection energy window).
Given the available observed BATSE dataset, D lc obs , the primary goal now is to constrain the probability density functions of the redshifts of individual BATSE SGRBs. To do this, the process of SGRB observation is modeled as a nonhomogeneous Poisson process whose mean rate parameter is the 'observed' cosmic SGRB rate, R obs .
Each SGRB can be described as having the intrinsic properties in the 5-dimensional attributes space, Ω(D int ), of the 1024 ms isotropic peak luminosity (L iso ), the total isotropic emission (E iso ), the intrinsic spectral peak energy (E pz ), and the intrinsic duration (T 90z ), as a function of the parameters, θ obs , of the observed SGRB rate model, R obs . The probability of these SGRBs occurring with the given properties is then given by π D int,i |R obs , θ obs ∝ R obs D int,i , θ obs (8) where the term R obs represents the BATSE-censored rate of SGRB occurrence in the universe. This can also be rewritten in terms of the intrinsic cosmic SGRB rate, R int , along with the BATSE detection efficiency function, η eff , as for a given set of input intrinsic SGRB attributes, D int , with θ obs = {θ eff , θ int } as the set of the parameters of our models for the BATSE detection efficiency and the intrinsic cosmic SGRB rate, respectively. Assuming that there is no systematic evolution of SGRB characteristics with the redshift, the intrinsic SGRB rate itself can be written as int is a statistical model, with θ lc int denoting its parameters, that describes the population distribution of SGRBs in the 4-dimensional attributes space of D lc int = [L iso , E iso , E pz , T 90z ], and the termζ(z, θ z ) represents the comoving rate density model of SGRBs with the set of parameters θ z , while the factor (1 + z) in the denominator accounts for the cosmological time dilation. The comoving volume element per unit redshift, dV dz , is given by [59,60] dV with Ω M and Ω Λ representing the dark matter and dark energy densities, respectively, and d L standing for the luminosity distance given by where C represents the speed of light and H 0 represents the Hubble constant. The cosmological parameters in Equation (12) are set to h = 0.70, Ω M = 0.27, and Ω Λ = 0.73 [61]. If the three rate models, (ζ, η eff , R lc int ), and their parameters were known a priori, one could readily compute the PDFs of the set of unknown redshifts of all BATSE SGRBs, as π Z |D lc obs , R obs , θ obs ∝ R obs Z, D lc obs , θ obs .
For a range of possible parameter values, the redshift probabilities can be computed by marginalizing over the entire parameter space, Ω(θ obs ), of the model: π Z | D lc obs , R obs = Ω(θ obs ) π Z |D lc obs , R obs , θ obs × π θ obs |D lc obs , R obs dθ obs .
The problem, however, is that neither the rate models nor their parameters are known a priori. Even more problematic is the circular dependency of the posterior PDFs of Z and θ obs on each other: To break this circular dependency, we can adopt the empirical Bayes approach described in Equation (2) to estimate the redshifts of BATSE SGRBs. First, we propose models for (ζ, η eff , R lc int ), whose parameters have yet to be constrained by observational data. Given the three rate models, we can then proceed to constrain the free parameters of the observed cosmic SGRB rate, R obs , based on BATSE SGRB data.
The most appropriate fitting approach should take into account the observational uncertainties and any prior knowledge from independent sources. This can be achieved via the multilevel Bayesian methodology [22] by constructing the likelihood function and the posterior PDF of the parameters of the model while taking into account the uncertainties in observational data (see Equation (61) in [22]): π θ obs |D lc obs , R obs where Equation (17) holds under the assumptions of independent and identical distribution (i.e., the i.i.d. property) of BATSE SGRBs and there is no measurement uncertainty in the observational data, except redshift (z), which is completely unknown for BATSE SGRBs.
Once the posterior PDF of the model parameters is obtained, it can be plugged into Equation (15) to constrain the redshift PDF of individual BATSE SGRBs at the second level of modeling.

The SGRB Redshift Prior Knowledge
The main assumption in this work is that SGRBs are due to the coalescence of binary neutron stars or the merger of a neutron star and a black hole. It is widely believed that binary mergers require significant cosmological time to occur after the deaths of the parent stars and the formations of the neutron stars. In this scenario, the cosmic rate of SGRBs follows the Star Formation Rate (SFR) convolved with a distribution of the delay time between the formation of a binary system and its coalescence due to gravitational radiation.
There is currently no consensus on the statistical moments and shape of the distribution of the delay time between the deaths of supermassive stars and their subsequent coalescence to form SGRBs, solely based on observations of individual events and their host galaxies. The median delays vary widely in the range of ∼0.1-7 billion years, depending on the assumptions involved in estimation methods or in the dominant binary formation channels considered. Recent results from population synthesis simulations, however, favor very short delay times of a few hundred million years with long, negligible tails towards several billion years [7].
The extreme computational expenses imposed on this work by the complex mathematical models strongly limit the number of possible scenarios that could be considered for the cosmic rate of short GRBs. Thus, in order to approximate the comoving rate densitẏ ζ(z) of SGRBs, we adopt the Star Formation Rate (SFR) model,ζ, described in [7] in the form of a piecewise power-law function, with parameters This SFR model is then convolved with a log-normal model of the delay time distribution [7,62] with parameters [µ, σ] = [log(0.1), 1.12] in units of billion years (Gyrs) adopted from [7], such that the comoving rate density of SGRBs is calculated aṡ with the universe's age t(z) at redshift z given by 3.4. The SGRB Properties Rate Model: R lc int As for the choice of the statistical model for the joint distribution of the four main intrinsic properties of SGRBs, D lc int , a multivariate log-normal distribution, R lc int ≡ LN , is assumed in this work, whose parameters (i.e., the mean vector and the covariance matrix), θ lc int = {µ, Σ}, will have to be constrained by data. The justification for the choice of a multivariate log-normal as the underlying intrinsic population distribution of SGRBs is multifolded. First, the observed joint distribution of BATSE SGRB properties highly resembles a log-normal shape [36] that is censored close to the detection threshold of BATSE. Second, unlike the power-law distribution which has traditionally been the default choice of model for the luminosity function of SGRBs, log-normal models provide natural upper and lower bounds on the total energy budget and luminosity of SGRBs, eliminating the need for setting artificial sharp bounds on the distributions to properly normalize them. Third, the log-normal and Gaussian distributions are among the most naturally occurring statistical distributions in nature, whose generalizations to multidimensions are also well-studied and understood. This is a highly desired property especially for our work, given the overall mathematical and computational complexity of the model proposed and developed here.

The BATSE Detection Threshold: η eff
Compared to Fermi-GBM [63] and Swift-BAT [64], BATSE has a relatively simple triggering algorithm. The BATSE detection efficiency and algorithm have been already extensively studied by the BATSE team as well as independent authors [7,37,44,58]. However, the simple implementation and usage of the known BATSE trigger threshold for modeling the BATSE catalog's sample incompleteness can lead to systematic biases in the inferred quantities of interest. Out of BATSE triggered on 2702 GRBs, only 2145, or approximately 79%, have been consistently analyzed and reported in the current BATSE catalog, with the remaining 21% either having a low accumulation of count rates or missing full spectral/temporal coverage [7]. Thus, the extent of sample incompleteness in the BATSE catalog is likely not fully and accurately represented by the BATSE triggering algorithm alone. BATSE LADs generally triggered on a GRB if the number of photons per 64, 256, or 1024 ms arriving at the detectors in the 50-300 keV energy window, P ph , reach a certain threshold in units of the background photon count fluctuations, σ. This threshold was typically set to 5.5 σ during much of BATSE's operational lifetime. However, the naturally occurring fluctuations in the average background photon counts effectively lead to a monotonically increasing BATSE detection efficiency as a function of P ph , instead of a sharp cutoff on the observed P ph distribution of SGRBs.
Although the detection efficiency of most gamma-ray detectors depends solely on the observed peak photon flux in a limited energy window, the quantity of interest that is most often modeled and studied is the bolometric peak 'energy' flux (P bol ). This variable depends on the observed peak photon flux and the spectral peak energy (E p ) for the class of LGRBs [37] and also on the observed duration (e.g., T 90 ) of the burst for the class of SGRBs [7]. The effects of GRB duration on the peak flux measurement are very well illustrated in the left plot of Figure 3, where it is shown that for BATSE GRBs with T 90 1024 ms, the timescale used for the definition of the peak flux does indeed matter. This is particularly important in modeling the triggering algorithm of BATSE Large Area Detectors when a short burst can be potentially detected on any of the three different peak flux timescales used in the triggering algorithm: 64 ms, 256 ms, and 1024 ms. Therefore, the detection modeling approach of Shahmoradi and Nemiroff [7] is adopted to construct a minimally biased model of BATSE trigger efficiency for the population study of shorthard bursts.

Results
Now, with a statistical model at hand for the observed rate of short GRBs, we proceed by first fitting the proposed censored cosmic SGRB rate model R obs to 565 BATSE SGRB data under the redshift distribution scenario prescribed in the previous section. The posterior PDF of parameters of the cosmic rate model of SGRBs is explored by the Parallel Delayed-Rejection Adaptive Metropolis-Hastings Markov Chain Monte Carlo algorithm (the ParaDRAM algorithm) that is part of the larger Monte Carlo simulation package ParaMonte available in C, C++, Fortran, MATLAB, Python, and other programming languages available online (as of 31 March 2022) at https://github.com/cdslaborg/paramonte [52,53,65,66].
However, due to the complex truncation imposed on SGRB data and the world model by the BATSE detection threshold, the maximization of the posterior distribution of the parameters of the cosmic rate model of SGRBs is not only analytically intractable but also computationally extremely complex. Calculation of the posterior distribution as given by Equation (17) requires a multivariate integral over the four-dimensional space of SGRB variables at any given redshift. In addition, due to lack of redshift (z) information for BATSE SGRBs, the probability for the observation of each SGRB given the model parameters must be marginalized over all possible redshifts, adding another layer of integration to the four-dimensional integration. These numerical integrations make sampling from the posterior distribution of the parameters of the SGRB cosmic rate model an extremely difficult task. Therefore, the inclusion of the measurement uncertainties, which would make the computations far more complex, is not considered in this work.
The joint posterior distribution of the model parameters is then obtained by iterative sampling using a variant of Markov Chain Monte Carlo (MCMC) techniques known as Adaptive Metropolis-Hastings [67]. To further the efficiency of MCMC sampling, we implement all algorithms in Fortran [68,69] and approximate the numerical integration in the definition of the luminosity distance of Equation (12) by the analytical expressions of Wickramasinghe and Ukwatta [70]. This integration is encountered on the order of one billion times during the MCMC sampling of the posterior distribution.
The computations were performed on 96 processors in parallel on two Skylake compute nodes of the Stampede 2 supercomputer at Texas Advanced Computing Center. We performed extensive tests to ensure a high level of accuracy of the high-dimensional numerical integrations involved in the derivation of the posterior distribution of the parameters of the censored cosmic rate model for SGRBs as given in Equation (17). The resulting best-fit parameters of the cosmic SGRB rate model are summarized in Table 1, and the marginal distributions of their parameters are compared with each other in Figure 4.
Once the parameters of the censored cosmic rate model Equation (9) are constrained, we use the calibrated model at the second level of the analysis to further constrain the PDFs of the unknown redshifts of individual BATSE SGRBs according to Equation (15). This iterative process can continue until convergence to a specific set of redshift PDFs occurs. However, given the computational complexity and the expense of each iteration, the iterative refinement process is stopped after obtaining the first round of estimates. Table 1. Mean best-fit parameters of SGRB world model compared to LGRB world model of Shahmoradi [37].

Parameter SGRB World Model LGRB World Model
Redshift Parameters (Equation (18)  The mean redshifts together with 50% and 90% prediction intervals for the three rate density scenarios are also reported in Table 2. On average, the redshifts of individual BATSE SGRBs can be constrained to within a 50% uncertainty range of 0.51. At a 90% confidence level, the prediction intervals expand to wider a uncertainty range of 1.31. Figure 5 shows the derived probability density functions (PDFs) of a subset of 565 BATSE SGRBs. As illustrated, the redshifts of BATSE SGRBs are generally better constrained at lower redshifts.

Discussion and Concluding Remarks
In this work, a semi-Bayesian data-driven methodology was proposed to infer the unknown redshifts of 565 BATSE catalog SGRBs. Towards this, first, the two populations of BATSE LGRBs and SGRBs were segregated using the fuzzy C-means classification method based on the observed durations and spectral peak energies of 1966 BATSE GRBs with available spectral and temporal information. Then, the process of SGRB detection was modeled as a nonhomogeneous spatiotemporal Poisson process whose rate parameter was modeled by a multivariate log-normal distribution as a function of the four main SGRB intrinsic attributes: the 1024 ms isotropic peak luminosity (L iso ), the total isotropic emission (E iso ), the intrinsic spectral peak energy (E pz ), and the intrinsic duration (T 90z ). To calibrate the parameters of the rate model, a fundamental assumption was made: SGRBs trace the Cosmic Star Formation Rate (SFR) convolved with a model for the binary neutron star merger delay distribution. Then, the resulting posterior probability densities of the model parameters were used to compute the probability density functions of the redshifts of individual BATSE SGRBs.
Although sample incompleteness may strongly affect an observational dataset, the proposed semi-Bayesian modeling framework enables us to overcome the limitations of the observational samples and missing data via reasonable prior distributions and appropriate modeling of the potential biases [71] present in observational data. The proposed methodology is different from and offers a parametric alternative to the existing nonparametric methods for quantifying the impact of missing data [72,73]. While generic and applicable to a wide range of research problems and datasets, this parametric probabilistic method requires certain assumptions to be met regarding the observational data to yield reasonably accurate unbiased constraints on the missing data. Most importantly, the effectiveness of the method correlates strongly and positively with the quality of data and the impact of the unknown components of data (e.g., redshift) on the known (observed) data. These two factors can together explain the significant difference between the tight constraints that Osborne et al. [1] obtain on the individual redshifts of BATSE LGRBs and the inferred redshifts of BATSE SGRBs in this work, as illustrated in Figure 5. We expect the Swift-BAT and particularly the Fermi-GBM catalogs to yield significantly tighter constraints on the unknown individual redshifts of Swift and Fermi LGRBs and SGRBs, due to the higher quality of data and the availability of redshift information for a significant number of events in these catalogs. This will ultimately lead to better independent estimates of the cosmic rates of LGRBs and SGRBs and their improved utilities in constraining the rate of GWR events. This will also enable the implementation of our proposed probabilistic framework for validating the inferred redshifts in these catalogs, an issue that remains untouched in present work due to the lack of any measured redshifts in the BATSE (SGRB) catalog. Nevertheless, Osborne et al. [1] show that our proposed methodology is capable of constraining the redshifts of GRBs in the presence of sufficient high-quality data.