Blinkverse: A Database of Fast Radio Bursts

Research on fast radio burst (FRB) observations has grown dramatically. To facilitate systematic analysis of the FRB population, we established a database platform, Blinkverse (https://blinkverse.alkaidos.cn), as a central inventory of FRBs from various observatories together with their published properties, particularly dynamic spectra from FAST, CHIME, GBT, Arecibo, etc. Blinkverse thus not only forms a superset of FRBCAT, TNS, and CHIME/FRB, but also provides convenient access to thousands of FRB dynamic spectra from FAST, some of which were not available before. Blinkverse is regularly maintained, and in the future external users will be able to update it. FRB data entries can be retrieved through searches on parameters such as location and fluence, and on logical combinations of these parameters. Interactive visualization is built into the platform. Using data downloaded from Blinkverse, we analyzed the energy distribution, periodicity, and classification of FRBs. The energy distributions of repeaters and non-repeaters are found to be distinct from one another.


Introduction
A fast radio burst (FRB) is a bright single pulse at radio frequencies that releases enormous energy within a duration of milliseconds. Of the more than 700 FRBs reported since the first discovery in 2007 [1], the majority have been one-off events, while the number of known repeating sources has reached 63. With the rapid increase in FRB discoveries in recent years, remarkable progress has been made in FRB research on topics such as repeaters [2][3][4][5], burst characteristics [6,7], ambient environments [8][9][10][11][12][13], and host galaxies [14][15][16][17].
The rapid growth in FRB discoveries places higher demands on data collation and analysis. Several databases are currently available for a range of FRB properties. The Transient Name Server (TNS) is the official IAU mechanism for reporting new astronomical transients (ATs), including FRBs. The Fast Radio Burst Catalogue (FRBCAT) is a dedicated repository of FRB properties but has not been actively updated since July 2020 [18]; its contents have been migrated to the TNS. FRBSTATS provides a platform for recording FRB bursts and a visualization interface for plotting parameter distributions [19]. In addition, a clustering method, density-based spatial clustering of applications with noise (DBSCAN), has been applied in the FRBSTATS platform to distinguish repeaters from non-repeaters automatically. Compared with these databases, Blinkverse holds the most comprehensive information on published FRBs and offers a dynamic visualization platform for statistical results. Researchers can obtain target data by constraining one or more FRB parameters, which gives Blinkverse a stronger search capability than previous databases.
The remainder of this paper is organized as follows. Section 2 introduces the architecture of the database platform, including data description and data availability; the advantages of Blinkverse over other databases are summarized in Table 2. Section 3 provides several examples of data analysis using data downloaded from our database, such as energy distribution, period analysis, and classification of FRBs. Concluding remarks are provided in Section 4.

Platform Architecture
MongoDB, a multi-cloud database service, offers a suitable NoSQL database to serve as the catalogue infrastructure for the platform. The platform is divided into three modules (Figure 1): overview, data description, and data availability. Statistical charts on the homepage give an overview of the data in Blinkverse. The schema lists and parameter descriptions are available in the data description module. Data are displayed in two forms, FRB source information and pulse information; within the pulse information, bursts from repeaters are listed separately. The database is under development and will be improved over the next 2-3 years, following further investigation, to make it more useful.
Figure 2 displays the homepage with a statistical overview of the observed events. A celestial map is displayed in the middle of the page, on which all recorded FRBs are marked, with white dots for non-repeaters and red dots for repeaters. Users can click any FRB on the map to obtain its information. An individual visualization page (Figure 3) offers the same way of selecting an event of interest.

Overview
The number of FRB discoveries is displayed below the celestial map. The 735 FRBs in total comprise 63 repeaters and 672 non-repeaters. Over 500 FRBs have been discovered at 400-800 MHz since 2018 by the Canadian Hydrogen Intensity Mapping Experiment (CHIME), which has a large collecting area and a wide field of view [20], whereas most FRB discoveries before 2018 were made with the Parkes radio telescope [21].
A pie chart in Figure 2 displays the number of pulse detections of FRBs by telescope. The total number of FRB bursts is about 5600, most of which come from the repeaters FRB20121102A and FRB20201124A, detected with the high-sensitivity Five-hundred-meter Aperture Spherical radio Telescope (FAST) [5,22].

Data Description and Data Availability
We reviewed multiple observational papers and relevant database websites to identify the parameters commonly used to characterize FRB properties [2][3][4][5][22,23]. All the data in the database were obtained from published studies and datasets, and links to the references for each burst are provided on Blinkverse. Based on this review, we proposed new schema lists (see Table 1) with improved descriptions of various aspects of FRBs. Two types of schema list were created, one recording burst properties and one recording the positions of FRB sources. Parameters may be added or modified in the future if necessary. Figure 4 shows the generic search options and a portion of the FRB properties. The generic search for FRB sources covers telescope, observation date, FRB name, and position. In addition, an advanced search supports logical relationship statements, so that users can search on a specified parameter or a combination of parameters. From the FRB sources returned by a search, users can then select the parameters they wish to download. Obtaining data from the database is simple and flexible: users choose from the parameters provided (see Table 1) and click the download button on the website to obtain the data in CSV format. We also provide an online plotting service, in which users choose the parameters for the x-axis and y-axis to draw curves or scatterplots online.
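The logical combination of search conditions described above can also be reproduced offline on a downloaded CSV file. The following is a minimal sketch using only the Python standard library; the column names (`name`, `telescope`, `dm`, `fluence`), the placeholder source names, and all values are assumptions made for illustration and are not the actual Blinkverse schema or data.

```python
import csv
import io

# Hypothetical excerpt of a CSV download. Column names and values are
# placeholders for illustration, not the actual Blinkverse schema or data.
CSV_TEXT = """name,telescope,dm,fluence
FRB_A,FAST,560.0,0.12
FRB_B,CHIME,410.5,3.40
FRB_C,FAST,1205.3,0.85
FRB_D,Parkes,790.0,2.10
"""


def load_rows(text):
    """Parse CSV text into dicts, converting the numeric columns to float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["dm"] = float(row["dm"])
        row["fluence"] = float(row["fluence"])
        rows.append(row)
    return rows


def matches(row):
    """Condition: (telescope is FAST AND fluence > 0.5) OR (dm < 500)."""
    fast_and_bright = row["telescope"] == "FAST" and row["fluence"] > 0.5
    return fast_and_bright or row["dm"] < 500


rows = load_rows(CSV_TEXT)
selected = [row["name"] for row in rows if matches(row)]
print(selected)
```

Writing the condition as an ordinary Boolean function keeps the AND/OR structure explicit, which makes it easy to compare against the statement entered in the advanced search.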
The burst properties and positions of FRB sources are stored in the database and linked by the name of the FRB (for example, "FRB20121102A"). Each FRB name is labeled "REPEAT" or "NON_REPEAT" in the database to distinguish repeaters from non-repeaters. Because of their significance in research, repeaters have a separate page: users can click on a repeater and see all of its individual bursts and dynamic spectra.
Reference - URL of the burst discovery paper where the event was first reported.
Notes to Table 1:
1 MJD is corrected to the solar system barycenter and referenced to infinite frequency. Because the arrival time of a pulse is affected by the motion of the Earth, arrival times are transformed to the solar system barycenter using the software pintbary [28].
2 The unit of energy is 10^37 erg.
3 [27]. We record the data from the references without any modification. The DM value is left empty when these parameters are absent from the literature.
Table 2 compares our Blinkverse platform with other main data websites. The Blinkverse database is a comprehensive platform that includes information from multiple observing facilities and multiple bands, FRB host galaxies, the corresponding dynamic spectra, diverse visualization, and a simplified interface. An explanation of Table 2 is provided below:

Comparison with Other Data Websites
Telescope: The databases in Table 2, except CHIME/FRB, contain a large amount of data obtained by various telescopes. CHIME/FRB is a dedicated database that only holds FRB data obtained by the CHIME telescope.
Host galaxy: All the databases record "ra" and "dec" to describe the position of an FRB. FRBCAT calculates and records "redshift" in addition.
Dynamic spectra: We provide an interactive interface to show the dynamic spectra of FRB bursts. For a specific FRB source, the burst spectrum from every different epoch can be readily queried and presented. This offers users a much more readable data visualization platform, and as a consequence, the user will be able to identify each spectrum easily and conduct more efficient data analysis. This feature surpasses other databases or studies in the literature where all the spectra are usually only presented on a single collective figure.
Search: We provide a generic search for FRB sources by telescope, observation date, FRB name, or position, and an advanced search that supports logical relationship statements. In contrast, only FRB names can be searched in CHIME/FRB and FRBCAT.
Visualization: TNS, FRBCAT, FRBSTATS, and CHIME/FRB only provide lists of FRB bursts without any visualization. Blinkverse has an interactive visualization interface. The positions of FRBs are marked on the celestial sphere.
Update: In our experience, the other databases are not updated at regular intervals. In contrast, Blinkverse is updated every week.
Download: The download formats supported by each database.

Examples of Data Mining with Blinkverse
Users can access data from the Blinkverse database via a REST API (an API that conforms to the design principles of REST), for example with the Python requests module. The well-defined data structure enables straightforward data analysis. Users can either download the data from the website and read them into a pandas DataFrame, or obtain the data directly as a DataFrame by calling the API with the provided sample code. Here, we present several simple examples of data analysis.
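As a minimal sketch of the download-and-read route, the snippet below loads two CSV tables into pandas DataFrames and joins them on the FRB name. The API endpoint is not specified in the text, so the sketch starts from CSV text instead of issuing a request; the column names, the placeholder source names, and all values are assumptions for illustration, while the "REPEAT" and "NON_REPEAT" labels follow the database description.

```python
import io

import pandas as pd

# Placeholder CSV text standing in for two downloads (source table and burst
# table). Column names and values are assumptions for illustration only.
SOURCES_CSV = """name,ra_deg,dec_deg,label
SRC_1,10.0,20.0,REPEAT
SRC_2,150.5,-30.2,NON_REPEAT
"""
BURSTS_CSV = """name,mjd,fluence
SRC_1,59000.1,0.2
SRC_1,59003.7,0.5
SRC_2,59100.0,4.1
"""

sources = pd.read_csv(io.StringIO(SOURCES_CSV))
bursts = pd.read_csv(io.StringIO(BURSTS_CSV))

# Attach the source-level columns to every burst, using the FRB name as key.
merged = bursts.merge(sources, on="name", how="left")

# Number of bursts carrying each label.
counts = merged.groupby("label").size().to_dict()
print(merged)
print(counts)
```

With a real download, `io.StringIO(...)` would simply be replaced by the path of the saved CSV file.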
Upon obtaining the data, the first step is to check their distribution. Taking energy as an example, we reproduced the energy distribution shown in [5] for FRB 20121102A using seaborn.displot. Selecting bursts from the source FRB 20121102A with MJD > 58724 reveals the bimodal energy distribution shown in Figure 5.
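The selection step can be sketched as follows. The burst table here is synthetic: the two log-normal components and their parameters are arbitrary assumptions chosen only to produce a two-peaked sample, not measured values. The histogram is computed with NumPy; passing the same selected values to seaborn.displot would be the plotting counterpart.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a burst table (not real measurements): source name,
# MJD, and energy in erg drawn from two arbitrary log-normal components.
n = 1000
names = np.where(rng.random(n) < 0.8, "FRB20121102A", "OTHER")
mjd = rng.uniform(58700.0, 58780.0, n)
log_e = np.where(rng.random(n) < 0.5,
                 rng.normal(37.6, 0.2, n),
                 rng.normal(38.5, 0.3, n))
energy = 10.0 ** log_e

# Selection described in the text: a single source and MJD > 58724.
mask = (names == "FRB20121102A") & (mjd > 58724)
selected_log_e = np.log10(energy[mask])

# Histogram of log10(E) for the selected bursts.
counts, edges = np.histogram(selected_log_e, bins=30)
print(int(mask.sum()), "bursts selected")
```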
Furthermore, the Blinkverse database contains relatively complete and long-term burst information, which makes searching for long FRB periods straightforward. We again used FRB 20121102A as an example. We extracted the MJD column from the data filtered by the source FRB 20121102A and used scipy.signal.lombscargle to calculate the periodogram power for periods in the range of 2-365 days. In Figure 6, we reproduce the 157-day period of FRB 20121102A [29,30].
The Blinkverse database records various properties for bursts, making multi-parameter analysis or FRB classification possible. We selected bursts having DM, Flux, Fluence, Width, and Freq, and attempted to classify FRBs using these five parameters.
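A period search of this kind can be sketched with scipy.signal.lombscargle as below. The input is a synthetic monitoring campaign, not real data, and the way the signal is built (a 0/1 detection flag per observing day, active during half of each cycle) is an assumption made for this sketch; the text does not specify how the MJD column was turned into a time series. Note that lombscargle expects angular frequencies.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(1)

# Synthetic monitoring campaign (not real data): 400 observing days spread
# over 2000 days; a burst is "detected" only in the active half of each cycle.
true_period = 157.0
days = np.sort(rng.choice(np.arange(2000), size=400, replace=False)).astype(float)
mjd = 58000.0 + days
detected = ((days % true_period) / true_period < 0.5).astype(float)

# Trial periods of 2-365 days, converted to angular frequencies.
periods = np.linspace(2.0, 365.0, 4000)
ang_freqs = 2.0 * np.pi / periods

# Subtract the mean so that the constant offset does not dominate the power.
power = lombscargle(mjd, detected - detected.mean(), ang_freqs)

best_period = periods[np.argmax(power)]
print(f"periodogram peak at {best_period:.1f} days")
```

The peak of the periodogram falls close to the period injected into the synthetic data; with real data, the significance of a peak would still need to be assessed separately.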
Here, we used two methods, decision trees and random forests, to illustrate the classification of FRBs. A decision tree is a supervised learning method that uses a tree-like structure of decision rules to solve classification problems [31]. A random forest is an ensemble learning algorithm composed of multiple decision trees. Its basic idea is to construct different decision trees by randomly selecting samples and features, and then to vote on or average the classification results of the trees to obtain the final prediction [32]. Random forests generally achieve high accuracy and good generalization performance.
The confusion matrix is a table used to evaluate the performance of a classification model, showing the number of correct and incorrect predictions of the model for each class. The confusion matrix after fitting the data with random forests is shown in Figure 7, indicating that only a small number of bursts are misclassified. The majority of bursts can be correctly predicted by the model. In addition, by examining the importance of parameters in the random forest model, it can be seen that Bandwidth and Fluence contribute the most to FRB classification. This is consistent with previous research, indicating that non-repeating FRBs typically are brighter and have wider bandwidth than repeating FRBs [33][34][35][36][37].
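The workflow of fitting a random forest, building the confusion matrix, and reading off the parameter importances can be sketched with scikit-learn as follows. The data are synthetic and only three illustrative features are used; the bandwidth difference between the classes is built in by construction, so the resulting importances illustrate the procedure and say nothing about real FRBs. The hyperparameters are arbitrary choices for this sketch.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 300  # synthetic bursts per class

# Synthetic features (not real measurements). Only the bandwidth differs
# between the classes; fluence and width share a single distribution.
features = ["Bandwidth", "Fluence", "Width"]
X_rep = np.column_stack([rng.normal(150.0, 30.0, n),
                         rng.lognormal(0.0, 1.0, n),
                         rng.lognormal(1.0, 0.5, n)])
X_non = np.column_stack([rng.normal(350.0, 30.0, n),
                         rng.lognormal(0.0, 1.0, n),
                         rng.lognormal(1.0, 0.5, n)])
X = np.vstack([X_rep, X_non])
y = np.array([1] * n + [0] * n)  # 1 = repeater, 0 = non-repeater

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_test, forest.predict(X_test), labels=[0, 1])
importances = dict(zip(features, forest.feature_importances_))
print(cm)
print(importances)
```

The diagonal of `cm` counts correctly classified bursts and the off-diagonal entries count misclassifications, which is the information summarized in a confusion-matrix figure.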

As decision trees are prone to overfitting, we used only a two-level decision tree to classify FRBs (Figure 8). We similarly found that Bandwidth was the most important parameter for distinguishing between repeating and non-repeating FRBs.
We also calculated burst energies to compare repeaters and non-repeaters. The isotropic equivalent burst energy is calculated from the redshift z, the luminosity distance D, the specific fluence F, and the observed frequency ν of each pulse. If the redshift is measured from the emission lines detected in the high-S/N LRIS spectrum, this z-value is used to calculate the luminosity distance D, adopting the standard Planck cosmological model [38]. If the redshift is not measured, the distance and redshift can be calculated using the YMW16 electron density model [27]. Here F = S_ν W_eq is the specific fluence and S_ν is the peak specific flux. We calculated the energy distributions of repeated and non-repeated bursts separately, as shown in Figure 9. A Kolmogorov-Smirnov (K-S) test gives a p-value of 0.0097, which is less than 0.05, indicating that the distributions of the two groups are different. Consistent with CHIME observations, repeater bursts have a longer duration and a narrower bandwidth than non-repeater bursts [39]. The differences between the two groups can therefore be verified with several parameters.
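The energy calculation can be sketched numerically as below. The expression used, E = 4π D² F ν / (1 + z), is the commonly used isotropic-equivalent form built from the symbols defined above; treating it as the expression intended here is an assumption. The function name and the example inputs are hypothetical and are not values taken from the database.

```python
import math

JY_MS = 1e-26        # 1 Jy ms expressed in erg cm^-2 Hz^-1
MPC_CM = 3.0857e24   # 1 Mpc expressed in cm


def isotropic_energy_erg(fluence_jy_ms, freq_ghz, dist_mpc, z):
    """Return E = 4*pi*D^2 * F * nu / (1 + z) in erg (assumed form)."""
    d_cm = dist_mpc * MPC_CM
    fluence_cgs = fluence_jy_ms * JY_MS
    freq_hz = freq_ghz * 1e9
    return 4.0 * math.pi * d_cm ** 2 * fluence_cgs * freq_hz / (1.0 + z)


# Illustrative inputs only (not taken from the database).
e = isotropic_energy_erg(fluence_jy_ms=0.1, freq_ghz=1.25, dist_mpc=949.0, z=0.193)
print(f"E = {e:.3e} erg = {e / 1e37:.2f} x 10^37 erg")
```

Dividing by 10^37 erg expresses the result in the energy unit noted for Table 1.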

Conclusions
We have developed a comprehensive open-access FRB database named Blinkverse. The main characteristics of Blinkverse include the following:
(1) Blinkverse records 30 parameters, including fluence, frequency, energy, and polarization (see Table 1), a more comprehensive set than that of FRBCAT.
(2) Blinkverse has an interactive visualization interface that TNS, CHIME/FRB, and FRBSTATS do not have. The positions of FRBs are marked on the celestial sphere. Users can click on the map to obtain sources and their parameters.
(3) FRB sources can be retrieved through Blinkverse based on parameter searches and their logical combinations, making it more versatile and accessible than TNS.
(4) Blinkverse is updated weekly.
(5) Blinkverse facilitates the systematic analysis of the FRB population and its multiparameter characteristics. As an example, we utilized Blinkverse to find that the energy distributions of repeaters and single events are distinct from each other.

Data Availability Statement:
The data presented in this study are openly available in major databases, such as https://www.chime-frb.ca/ for CHIME/FRB and https://www.wis-tns.org/ for reported FRBs, and in multiple observational papers. The relevant links to the references of each burst are provided on Blinkverse.

Conflicts of Interest:
The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript: