Exploring New Redshift Indicators for Radio-Powerful AGN

Active Galactic Nuclei (AGN) are relevant sources of radiation that might have helped reionising the Universe during its early epochs. The super-massive black holes (SMBHs) they host helped accreting material and emitting large amounts of energy into the medium. Recent studies have shown that, for epochs earlier than $z~{\sim}~5$, the number density of SMBHs is on the order of few hundreds per square degree. Latest observations place this value below $300$ SMBHs at $z~{\gtrsim}~6$ for the full sky. To overcome this gap, it is necessary to detect large numbers of sources at the earliest epochs. Given the large areas needed to detect such quantities, using traditional redshift determination techniques -- spectroscopic and photometric redshift -- is no longer an efficient task. Machine Learning (ML) might help obtaining precise redshift for large samples in a fraction of the time used by other methods. We have developed and implemented an ML model which can predict redshift values for WISE-detected AGN in the HETDEX Spring Field. We obtained a median prediction error of $\sigma_{z}^{N} = 1.48 \times (z_{\mathrm{Predicted}} - z_{{\mathrm{True}}}) / (1 + z_{\mathrm{True}}) = 0.1162$ and an outlier fraction of $\eta = 11.58 \%$ at $(z_{\mathrm{Predicted}} - z_{{\mathrm{True}}}) / (1 + z_{\mathrm{True}})>0.15$, in line with previous applications of ML to AGN. We also applied the model to data from the Stripe 82 area obtaining a prediction error of $\sigma_{z}^{N} = 0.2501$.


Introduction
Super-Massive Black Holes (SMBHs) might be ubiquitous to all galaxies above a certain mass. Understanding their true role in the shaping of galaxies will require a more precise census of the nature, growth, and evolution of SMBHs-in the so-called Active Galactic Nuclei (AGN) phases-, as well as a more detailed characterisation of the internal (secular) and external (environment) processes at work within the host [1].
Radio selection has been traditionally a prime wavelength for the detection of AGN activity. Between 10-20% of AGN have strong radio emission, in many cases in the form of jets, that can overshadow the radio emission associated to star-forming regions, mostly due to super-novae [2]. Radio selection efficiency though seems to decrease towards the Epoch of Reionisation (EoR. z > 6, e.g., Reference [3][4][5][6]). Simulations (e.g., Reference [7][8][9]) predict that the distribution of AGN and Radio Galaxies (RG) along redshift can lead to the detection of a few hundreds of objects per square degree at the EoR as the with deep observations planned arXiv:2111.00778v1 [astro-ph.GA] 1 Nov 2021 for future observatories, e.g., SKA, µJy sensitivity levels [10]. These expectations collide with the most recent compilations (see, for instance, Reference [11,12]), which show that only ∼300 sources have been confirmed to exist at z > 6 over most of the sky. Environmental (CMB) and intrinsic (QSO versus radio mode accretion) conditions might be responsible for the lower rate of radio powerful sources at z > 5 but, selection criteria, might also be playing a role [13]. Nevertheless, current radio instruments and recently completed surveys [14][15][16][17] have allowed detection of larger numbers of RG (e.g., Reference [18][19][20]) that could be used to better understand the origin and evolution of radio emission in AGN.
To place radio AGN in the proper cosmological context and derive their intrinsic properties, and given the time constrains imposed for the compilation of significant spectroscopic samples, alternative estimates for redshift need to be used. Template-based photometric redshifts have proven to be an efficient alternative by trading precision for sky coverage. The sizes of the new catalogues though, with tens to hundred of millions of sources, imply a significant-and ever increasing-investment in computational time. These issues raise the need for additional approaches which might be able to obtain the redshift information for a large number of astrophysical sources with enough precision and within a reasonable amount of time.
The tremendous increase of computing power over the last decades has allowed the application of evolved statistical methods in the analysis of large and complex datasets. Using previously-fed data, it is possible to predict, with relevant confidence, the behaviour new data will have. This is what has been called Machine Learning (ML). In Astrophysics, ML has been used in a wide range of subjects (in AGN and other types of sources), such as redshift determination (e.g., Reference [21,22]), morphological classification (e.g., Reference [23][24][25][26][27]), source selection and classification (e.g., Reference [28][29][30][31]), image and spectral reconstruction (e.g., Reference [32]), and more [33,34]. Despite its range of applications, ML has received some criticism related to the interpretability of the derived models, e.g., most ML models cannot provide coefficients that allow to create an analytical expression for example [35]. This implies that it may not be straightforward to understand the exact role that the measured properties have into the prediction a model might make.
Recent work has been done to improve interpretability. Feature importance [36] can be derived, mostly for Tree-Based models, i.e., models that use decision trees to classify or predict properties. In this scenario, a feature with a high importance will be, in general, in the higher levels of the decision trees used for the modelling. A different method for assessing the impact of features is that of Shapley Values [37]. Opposite to feature importance, Shapley values, which have been defined in the context of Game Theory to determine the contribution of a player in a cooperative game, can help in understanding how the features impact each individual prediction. A more thorough description on how Shapley values work can be seen in Molnar [38].
Astronomical data is very heterogeneous in its current form, with small areas of the sky covered extensively at all wavelengths and with high sensitivity but also larger areas with sparser multiwavelength coverage. Therefore, the homogeneous and deep multiwavelength coverage required for the most accurate models can only be achieved over a few to tens of degrees. The derived models in these fields could then be applied first to present surveys with less extensive and deep multiwavelength coverage (e.g., LoTSS, Stripe82, RACS, MIGHTEE, etc.) and then also to the upcoming all-sky surveys, e.g., SKA, LSST, eROSITA, etc., delivering observations with comparable depth and multiwavelength coverage as current small fields.
In this work, we describe an ML model aiming to predict the redshift for AGN based on the multiwavelength properties of the HETDEX Spring Field with the minimum amount of data preparation possible. The model will be then tested in data from the SDSS Stripe 82 Field where multiwavelength coverage and depth vary with respect to the HETDEX Spring Field. This approach would test the validity of the derived ML model based on a field into other fields with slightly different spectral coverage.
The structure of this article is as follows. In Section 2, we present the used data, its preparation for ML training and describe the selection of models and the metrics used for assess their results. In Section 3, the results of model training and validation are shown, as well as the predictions over the Stripe 82 field. We present the discussion of our results in Section 4. Finally, in Section 5, we summarise our work.

Data
We have selected all the detections listed on the CatWISE2020 catalogue (CW, [39]) that are located in the HETDEX Spring Field and that have been covered by the LOFAR DR1 measurements [17] (see Figure 1). The CatWISE2020 catalogue has measurements in the WISE bands W1 and W2 (at 3.4 µm and 4.6 µm, respectively), with a 90% completeness depth at W1 = 17.7 mag and W2 =17.5 mag. LOFAR observations cover an area of 424 deg 2 , with a median sensitivity of 71µJy/beam and a 6 resolution. In that way, we have obtained 6,729,647 detected sources. The sources have been then cross-matched with other catalogues in different wavelengths using a search radius of 5 . We have selected surveys with large sky coverages, such as: VLASS (3 GHz) [16], LoTSS-DR1 (150 MHz) [17], Pan-STARRS DR1 [41,41], GALEX AIS GR6+7 [42], GMRT 150 MHz all-sky [43], 4XMM-DR9 [44], 2MASS All-Sky [45,46], and AllWISE (AW [47]). The 20 selected photometric bands are listed in Table 1. To homogenise photometric measurements, we converted all fluxes and magnitudes to AB magnitudes. We then selected the sources that could be linked to the emission of an AGN. Thus, we cross-matched our catalogue with the Million Quasar Catalog 1 (MQC, v7.2, [48]). It lists published type-I QSOs/AGN, quasar candidates, type-II object and blazars along with the best available redshift values for each of them, i.e., spectroscopic or photometric redshifts. For the HETDEX Spring Field, 32,365 objects have been identified, in different surveys, as AGN.
That means that 0.48% of the detected CatWISE2020 sources have been identified as AGN, and close to 8% (2674) of them are considered as QSO candidates. From the identified AGN in our sample, 26,520 sources are listed in the Sloan Sky Digital Survey Quasar Catalogue DR16 (SDSS-Q DR16 [49]), implying that the mean properties of the objects studied in this work are driven by the behaviour of SDSS QSOs. In Figure 2, we show the distribution of the sources in the WISE colour-colour diagram and the histogram of available redshifts.  Redshifts for all radio-detected AGN are presented by the blue, vertically-hatched histogram. Confirmed AGN without high host influence (see main text) that show a measurement on the surveys used in this work, are presented in purple, clean histogram.
One important feature to note in Figure 2b is the distribution of radio-detected sources (i.e., sources which show a counterpart on either LOFAR, GMRT, or VLASS). It does not follow the same trends as non-radio AGN, i.e., there is not a peak around z = 2 and the number of sources increases towards z = 0. This behaviour appears from the inclusion of all AGN listed in the MQC. Its documentation 2 states that some sources with a strong influence from their host galaxies and QSO candidates are included along with confirmed core-dominated AGN. These objects exhibit, mostly, galaxy-like properties (with low radio emission levels, which might only be detected at redshift values close to z = 0), shifting their distributions to unusual shapes for AGN. In addition to that, and given that only around a 3% of the SDSS sources in our sample were detected in the radio bands when catalogued [49], and SDSS do not show this misshapen distribution, the distortion affects mostly radio-detected sources.
Even though this deviation from the expected distribution might affect, in some way, part of our results, we will keep the sources that produce it in our calculations. As mentioned in Section 1, part of the aims of this work is test whether an ML model can deliver reasonable results without discarding, or modifying, a large fraction of the intital dataset.
We also calculated colours for some of the bands. We computed g -r, r -i, i -z, z -y, g -i, W1 -W2, W2 -W3, W3 -W4, J -H, H -K, and FUV -NUV. In addition, following the results from Nakoneczny et al. [21], D'Isanto et al. [50], who studied different combinations of features and their positive impact on the prediction of redshifts, we have constructed ratios of magnitudes. The created quantities are r/z, i/y, W1/W3, W1/W4, W2/W4, J/K, and FUV/K. Finally, we included two indicators, in the form of a boolean flag, showing whether a source has a measurement on any radio band (LOFAR or TGSS) or on X-ray (Full band in XMM-Newton).

Data Preparation
Redshift values have a logarithmic behaviour when compared to the time passed-and distance travelled-between two values. A unit difference at low redshift has not the same significance as a unit difference at high redshift (i.e., early epochs). Given that, ultimately, redshifts can be used to determine distances, and times, from the observer to a given source, it is useful to make this quantity comparable to linear measurements. Thus, to overcome this non-linearity and, at the same time, establish a procedure to contrast predictions and real values, all comparisons will be normalised by the real redshift as follows: Using these two quantities, ∆z and ∆z N , it is possible to define a set of metrics to assess the quality of the prediction the developed models can achieve. First, we can define the standard deviation between the true, original redshift and the predicted value.
In the same way, the value ∆z N can be used instead of ∆z, giving rise to the normalised standard deviation, σ N ∆z . Alternatively, the redshift deviations can be used directly to create the median absolute deviation (MAD), or the normalised MAD (NMAD) with the weighted redshift deviations, ∆z N . Another quantity used to evaluate the predictions is the fraction of outliers, η. It represents the number of predictions that are too far away from the true value over the total number of prediction. There are several ways to define this value [51][52][53][54]. We will make use of the interpretation by Hildebrandt et al. [51], which considers all predictions that fulfil the following condition to be outliers: Using both the standard and normalised differences between redshift values can allow us to analyse the results of our predictions from two points of view: from a purely statistical standpoint (using the standard difference), and a physically-motivated perspective, with the use of the normalised redshift difference. Both approaches can be useful to reach a better understanding of the behaviour of the used models.
For this work, we have analysed our data using the Regression module of the Python package PyCaret 3 [55]. It can create a full pipeline for the use of our dataset and has enough options to change its parameters as needed.
The first step of data preparation is imputation. A large fraction of ML models cannot be used with missing data. For this reason, several methods have been devised to impute missing values (for a review on data imputation, see, for instance, Reference [56]). In our dataset, several features have a large fraction of empty spaces. A distribution of empty entries, prepared with the software missingno [57], can be seen in Figure 3. It is possible to see that radio and X-ray features have the largest number of empty values.
We imputed each magnitude with its detection limit and propagated those values for colours and ratios, assuming that empty entries are faint enough to be detected by each instrument. Thereafter, and within the PyCaret frame, we further removed features based on their influence over the prediction. We applied the Boruta method [58], discarding a feature if it behaves better than an aleatory version of itself. The final list of used features is seen in Table 5. The remaining features are re-scaled to have a mean value of µ = 0 and a standard deviation of σ = 1 and, afterwards, power-transformed to resemble a Gaussian distribution using the Yeo-Johnson method [59]. The use of re-scaling steps helped our models to improve their results over training with the original features. No further modifications were applied to the data. Thus, no corrections are applied for obscuration, AGN variability, host galaxy morphology, or other properties.
For the validation of our ML model, we have set aside a 10% of the full dataset. From the remaining 90%, 70% was used for training, and 30% for model testing. The same distribution of sources, which was created randomly, was used throughout the full study. Following the conventions used by PyCaret, the validation sub-set is the only fraction of the data which is not used for the training stage.

Model Selection and Stacking
With the help of PyCaret, we run simple realisations of a list of known ML model and selected, as meta-learner, the model with the best score (σ N z , see Section 2.2.1). After these tests, we stacked the four models with the following best metrics. Model stacking takes the results (predictions) from several models and adds them as new features for the meta-learner. In this way, the meta-learner can use the properties and advantages of the remaining models as a guidance for its own training and improve the prediction results. Furthermore, stacking can help improving the overall scores of the predictions. The stacked model was trained using 10-fold Cross Validation. The metrics of the training of the base and meta learners, along with those from the stacked model are presented in Table 2.

Redshift Prediction
For the model stacking, we have chosen, as base models, Extra Trees (Extremely Randomised Trees, [60]), CatBoost [61,62], LightGBM [63], and XGBoost [64]. A Random Forest regression model [65] was used as meta-learner. From the 10-fold Cross Validation training, we have obtained a value of σ N MAD = 0.0971 ± 0.0027 (see Table 2, where the uncertainty value corresponds to the standard deviation of the Cross Validation instances). This is in the order of a one hundredth of a scaled redshift unit, improving upon the results from the individual models. In addition, when including the test set in the training of the model, the normalised standard deviation is σ N MAD = 0.1000. Figure 4a shows the prediction values for the validation set. The density of the plotted points, with higher values shown as a darker colours, shows that a large fraction of the predictions are close to the y = x line. Additionally, the outlier fraction (Equation (4)) for the HETDEX Spring Field validation sample is η = 21.87%. The results of the prediction over the test and validation sets are summarised in Table 3.

Prediction in Stripe 82 Field
To avoid possible biases derived from predicting on the same type of data as that used in training, and to test the prediction capabilities of our model, we applied it in data from a different area of the sky. In this case, we selected the SDSS Stripe 82 Field. We gathered the same data as described in Section 2.1. The main difference is that this field is not covered by the LoTSS-DR1 Survey. Thus, the selected area is defined by the coverage of the VLA SDSS Stripe 82 Survey [67]. This is to mimic the use, as with the HETDEX Spring Field, of an area covered by a radio survey. The VLA-Stripe 82 Survey covers an area over 92 deg 2 with a median rms noise of 52 µJ/beam and an angular resolution of 1.8 . We have selected this field because of the high-quality measurements it hosts, and thorough studies on AGN over its area. The sample we have produced has 369,093 detected sources and 2941 of them have been labelled as AGN by the MQC. Additionally, 111 sources have been defined as QSO candidates.
In Figure 4b, the results of the redshift predictions, along with the original values, for Stripe 82 are presented. Results from Stripe 82, shown in Table 3, resemble those of the HETDEX Spring Field, hinting the possibility of, as long as the needed wavebands are available, using the trained models in areas of the sky which are not related to the training sample.
In addition, and even though all metric results are better in the initial HETDEX Spring sample, differences with Stripe 82 are on the range of 7-8%. These deviations are small enough to be caused by statistical variations among both fields. In the case of the outlier fraction, it is around 30%, and 8 percentage points higher than with the primary sample.

Previous Results
As a way to assess our results, it is possible to compare them to previous redshift determinations. This is the case of Ananna et al. [66]. They used multiwavelength data from 5961 X-ray-detected AGN in the Stripe 82 Field with z ≤ 3.0 and, from fitting SED models, they computed photometric redshifts. From their Table 7, a value of σ N MAD = 0.0602 is quoted for their full sample, which is in line with our prediction in the Stripe 82 Field (σ N MAD = 0.1197). In addition, an outlier fraction of 13.69% is achieved, less than half of what is obtained using our stacked model in the same area. It is possible to select, from our sample, the sources with a counterpart in the Ananna et al. [66] sample and apply our model to them. Using a matching radius of 2 , 221 sources are selected, reaching values of σ N MAD = 0.1122 and η = 0.2429. If we do the same exercise, selecting the results from the SED-fitting redshift determination, their values are σ N MAD = 0.0648 and η = 0.2048. Full results for this sub-sample are shown in Table 4.
To contrast our results with previous ML implementations, we can take the work from Curran et al. [68], who compared the results of applying deep learning, decision trees, and k-nearest neighbours regression to predict redshift values for 100, 000 SDSS DR12 QSO with accurate spectroscopic redshifts. Results are presented in Table 4. · · · · · · · · · When comparing our results (Table 3) with the outputs from Curran et al. [68], we note that the metrics for our Validation set are 20-40% higher and those from the Stripe 82 Field, 40-60% higher than theirs. This is a consequence of our decision of not cleaning our training set, mimicking the conditions a large dataset might present. They, in contrast, have trained their models with sources that have full coverage on the bands they selected, avoiding the use of imputation. Moreover, since they have used large SDSS sample, the properties of QSO among them are more homogeneous than that of the present work, leading to improved prediction results.
Comparison with previous works, using traditional template-based and ML photometric redshift determination methods, highlights the prospective scenarios to apply our model. Rather than selecting a very small area with the right conditions, we can use the model here presented on large regions with incomplete coverage, rising the likelihood of obtaining objects with specific resdhift values.

Feature Importances
Feature importances from our model are listed in Table 5. The values are provided by the model itself, and they have been calculated as the mean decrease on impurity for the ensemble of trees. We can see that the features with the highest importances are those coming from the CatWISE catalogue. After them, quantities are derived from Pan-STARRS. In addition, finally, those obtained from AllWISE, and GALEX observations suggest a very low impact in the model training and the predictions derived from it. Entries from CatWISE have the largest amount of relevant, non-repetitive information from all features. Despite the different nature of the used features, i.e., magnitudes, colours, ratios, there is no clear preference of one kind over the others. The main factor to have high importance is the fraction of sources with a measurement in the studied feature. This distribution also reinforces the results from Reference [50], who established that is possible to use combinations of magnitudes other than colours and train, successfully, ML models. Table 5 also gives information on the features that can be discarded from the model training without having a high influence on the predicted values (features with data from 2MASS and GALEX). Finally, it is important to stress that, in this work, we have not discarded data based upon the feature importances.

Shapley Explanations
Shapley values were obtained using the Tree-based module of the Python package SHAP 4 [69,70]. In Figure 5, features are sorted by decreasing median Shapley values.
The quantity with the highest Shapley value is related to the base observations. However, from the distribution of values in the horizontal axis for the W1 magnitude, it is possible to see that its large dispersion implies that its influence on predictions can drive the final redshift either to low or high values. This is in contrast with, for instance, the g -i colour. Its Shapley values might be close to zero, indicating that it does not have impact on the redshfit prediction. The values can be higher than zero, as well, driving the predictions to high redshift values.  Most of the remaining features show Shapley values clustered around 0.0, and a small sub-sample deviates from this and has a noteworthy influence on predictions.
The feature with the second highest median Shapley values is the NUV magnitude from GALEX. From Figure 3, it is possible to see that this feature exhibits a very high fraction of empty entries. That implies that most of sources have an imputed NUV magnitude. This distribution is present in Figure 5. Therefore, all imputed magnitudes make the redshift prediction go up, and all measured magnitudes make it go down. Although this behaviour might seem anomalous, it has its roots on the fact that very few high-redshift sources are detectable by GALEX.
Being able to retrieve these interpretations is one of the advantages of using Shapley values from a prediction model. It is possible to understand whether certain range of values of a feature can make a prediction go up or down. This differs from feature importances, which allow an average view of the impact of a feature over the complete trained model. Despite their differences, feature importances and Shapley values can help understand the impact that measurements in different wavelengths can have over the understanding and prediction of redshift values of AGN. In particular, and given the relevance and highquality observations that future radio surveys and observatories will deliver, adding direct measurements (e.g., Reference [71]) or features derived from them might be highly beneficial when focusing the search on high-redshift objects. The latter might be the case with alreadyknown quantities, such as radio loudness or radio spectral indices. These properties can provide indications on the radio emission [72] and its relation with other wavelengths [73,74].

Conclusions
In this work, we trained several Machine Learning models to predict, from a sample of infrared-detected AGN-and their multiwavelength counterparts-their redshift value.
Sources were obtained from CatWISE2020 catalogue and counterpart measurements were obtained from AllWISE, Pan-STARRS, LOFAR, GMRT, VLASS, GALEX, 2MASS, and XMM-NEWTON observations and surveys. All of the sources are located at the HETDEX Spring Field.
Using of the PyCaret Python package as a framework, we stacked four different models with a meta-learner. The application of model to the validation set lead a median redshift error on the prediction of σ N z = 0.1986 and an outlier fraction of η = 21.87%. This goes in line with previous results, taking into account that no major cleaning procedure was performed into the dataset.
To further test the power of our model, we applied it to a separate catalogue of AGN located in the Stripe 82 Field, and the median redshift error was σ N z = 0.2501 and an outlier fraction of η = 29.72%.
To understand the influence of the different features included in the model, Shapley values were calculated for the training sub-set. The features from WISE and from Pan-STARRS show the highest median Shapley values, mirroring the fact that these features have the lowest number of imputed entries.
The results presented in this work stress the benefits of using ML as an initial approach to derive redshift predictions for AGN. Using a fraction of the time a template-based photometric redshift determination tool might take, ML can give redshift predictions with a high confidence level which can lead to further studies of selected sources. This advantage might become critical to the use of current and future large-area surveys-with radio surveys being a major example, which need to extract information from several millions of sources within an appropriate amount of time.
Even though some of the results obtained in this work do not show a considerable improvement from previous studies, it is relevant to emphasise that our work was aimed to extract predictions using datasets without large amounts of preparation, i.e., feature engineering. This implies that it is possible to use a very heterogeneous group of datasets (with different sensitivities, resolutions, etc.) and obtain useful predictions from them without the need of cleaning and reducing the number of used sources in each catalogue.
Our model can be further improved using future surveys which will cover large areas with very deep observations. One such survey is Data Release 2 of the LoTSS survey, which will be released in the near future. It will cover 5720 deg 2 in the northern sky with similar sensitivities as DR1 [75]. If assuming the same AGN density as in LoTSS DR1 (see Section 2.1, with 32,365 AGN in 424 deg 2 ), DR2 is expected to deliver 436,622 AGN from its area. This will allow us training a redshift prediction model with a number of sources one order of magnitude larger, improving its accuracy dramatically by capturing the properties from a larger parameter space. This improvement can be also analysed in terms of cosmic variance. Following the results by Reference [76], DR1 from the LoTSS survey will be subject to a cosmic variance between 10 and 20%. In addition, extrapolating the curve from their Figure 6, DR2 will make this value go below 10%. Only from this improvement, we might expect to achieve a better training for a prediction model. AGN might present variability on their observations with different timescales [77,78], which might impact the observed properties of the used datasets. These variations can increase the fraction of outliers in different ranges [79].
Additional sources of improvement in the results are related to the treatment of the missing values in our catalogue. Devising more advanced imputation methods, which can take into account the distribution of measured values in one feature and their relation to the rest of features, might refine our results. Related to this, some features have a low fraction of measured values, adding little information to the models. Discarding these features also might reduce the fraction of outliers. Apart from the data treatment, further improvements might be achieved if the intrinsic time variability of AGN is taken into account.
The used model might arrive to better results creating several instances of data sub-sets. Using different combinations of sources for training, test, and validation might have an impact on how the model arrives to separate predictions.
With all these advantages, the model described in this article can be used as part of a full pipeline which might be able to predict the presence of AGN in a large-area field. In addition, for the predicted AGN, we predict their redshift values, among other properties, e.g., radio detectability. This might allow the creation of catalogues with high-redshift Radio Galaxies from datasets covering large areas. Data Availability Statement: Not applicable and data storage facilities in several countries, which are owned by various parties (each with their own funding sources), and which are collectively operated by the ILT foundation under a joint scientific policy. The efforts of the LSKSP have benefited from funding from the European Research Council, NOVA, NWO, CNRS-INSU, the SURF Co-operative, the UK Science and Technology Funding Council, and the Jülich Supercomputing Centre. The Pan-STARRS1 Surveys (PS1) and the PS1 public science archive have been made possible through contributions by the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max-Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg and the Max Planck Institute for Extraterrestrial Physics, Garching, The Johns Hopkins University, Durham University, the University of Edinburgh, the Queen's University Belfast, the Harvard-Smithsonian Center for Astrophysics, the Las Cumbres Observatory Global Telescope Network Incorporated, the National Central University of Taiwan, the Space Telescope Science Institute, the National Aeronautics and Space Administration under Grant No. NNX08AR22G issued through the Planetary Science Division of the NASA Science Mission Directorate, the National Science Foundation Grant No. AST-1238877, the University of Maryland, Eötvös Loránd University (ELTE), the Los Alamos National Laboratory, and the Gordon and Betty Moore Foundation. This research has made use of data obtained from the 4XMM XMM-Newton serendipitous source catalogue compiled by the 10 institutes of the XMM-Newton Survey Science Centre selected by ESA. This publication makes use of data products from the Two Micron All Sky Survey, which is a joint project of the University of Massachusetts and the Infrared Processing and Analysis Center/California Institute of Technology, funded by the National Aeronautics and Space Administration and the National Science Foundation. This research has made use of the VizieR catalogue access tool, CDS, Strasbourg, France (DOI : 10.26093/cds/vizier). The original description of the VizieR service was published in Reference [81]. This research made use of Astropy 6 , a community-developed core Python package for Astronomy [82,83] and TOPCAT 7 [84]. This research has made use of NASA's Astrophysics Data System.

Conflicts of Interest:
The authors declare no conflict of interest.