Estimating the mass of galactic components using machine learning algorithms

The estimation of the bulge and disk masses, the main baryonic components of a galaxy, can be performed using various approaches, but their implementation tends to be challenging, as they often rely on strong assumptions about either the baryon dynamics or the dark matter model. In this work, we present an alternative method for predicting the masses of galactic components, including the disk, bulge, stellar and total mass, using a set of machine learning algorithms: K-nearest neighbours (KNN), Linear Regression (LR), Random Forest (RF) and Neural Network (NN). The rest-frame absolute magnitudes in the ugriz-photometric system were selected as input features, and the training was performed using a sample of spiral galaxies hosting a bulge from Guo's mock catalogue (Guo et al., 2011) derived from the Millennium simulation. In general, all the algorithms provide good predictions for the galaxy's mass components ranging from $10^9\,M_\odot$ to $10^{11}\,M_\odot$, corresponding to the central region of the training mass domain; however, the NN gives rise to the most precise predictions in comparison with the other methods. Additionally, to test the performance of the NN architecture, we used a sample of observed galaxies from the SDSS survey whose mass components are known. We found that the NN can predict the luminous masses of disk-dominant galaxies within the same range of magnitudes as for the synthetic sample up to a $99\%$ confidence level, while the mass components of galaxies hosting larger bulges are well predicted up to a $95\%$ confidence level. The NN algorithm can also bring up scaling relations between the masses of different components and the magnitudes.


Introduction
The bulge-disk decomposition of galactic systems is useful for understanding the evolutionary processes of galaxies and their dynamics. Specifically, the disk and bulge masses can be inferred, given that their stellar populations have different dynamic or even chemical features. There are plenty of schemes for classifying galaxies; one of the most popular corresponds to the morphological classification proposed by Edwin Hubble (Hubble, 1929), consisting of four types of galaxies: elliptical, spiral, barred spiral, and irregular. Another method involves the isophotal radius measurement (Holmberg, 1958), determining the size attributed to a galaxy component according to a particular surface brightness level. A way to characterize the light distribution independently of the light profile is through the concentration measure, defined by the ratio of two geometrical regions, each containing a fixed fraction of the total galaxy luminosity (Kent, 1985).
Another approach for reconstructing the visible mass of galactic components involves using standardized fitting functions. Ideally, these functions should be derived from the fundamental principles governing galactic evolution. However, due to the intricate nature of the physics involved, models based on these principles often become complex, with a substantial number of parameters. Hence, commonly used functions are empirically derived. For instance, the disk components are well fitted by an exponential law, while for elliptical galaxies and the bulges of spiral ones, the relations typically considered are King's model (King, 1966) and de Vaucouleurs's law (de Vaucouleurs, 1948). Sometimes, the bulges associated with late-type galaxies are best fitted by exponential laws (Andredakis et al., 1995; Freeman, 1970). However, implementing these methods demands high-quality observational data to obtain reliable results.
Furthermore, it is possible to single out the galactic components through the light distribution of a galaxy. This decomposition is derived by fitting the light profile to a power law, adhering to a specific empirical or analytical model. In the conventional photometric technique, the one-dimensional case is considered. On the other hand, when multiple wavelengths in the spectrum are taken, spectroscopic methods come into play (Johnston et al., 2012).
In addition, numerical simulations play an important role in exploring predictions of galaxy evolution within the standard ΛCDM prescription (Croton et al., 2006; Guo et al., 2010; De Lucia et al., 2004). Semi-analytical models have gained popularity for identifying the structural components of galactic systems. These models employ a simplified representation of baryonic physics, coupled with Markov-Chain-Monte-Carlo methods for reconstructing merger trees (Lucia, 2019).
For dark matter-only simulations, a common technique to infer information about the baryonic components is to assume halo-abundance matching (HAM), which relates the halo potential well to the star formation rate in such a way that more luminous galaxies are associated with more massive halos. During the evolution of both components, material exchange occurs between the baryonic elements through various processes. For example, a galactic bulge may form as the result of major or minor mergers (Hopkins et al., 2010). In these processes, pre-existing and newly formed stars play a crucial role; after a merger, all stars from the progenitors contribute to the bulge component of the resulting galaxy. The gas within the progenitors becomes part of the resulting galaxy disk, and the specific angular momentum of this component equals that of the halo in which it is embedded (Guo et al., 2011; Bower et al., 2006; De Lucia & Blaizot, 2007).
As can be seen, various approaches exist for describing galactic components, including pure morphological observations, photometric and/or spectroscopic techniques, or synthetic ones through mock catalogues. Conversely, obtaining information about the total mass often involves making strong assumptions, whether about a specific dark matter model or the overall kinematics of the system. In this work, we propose an artificial intelligence (AI)-based method to isolate the bulge and disk components of both the baryonic and total galaxy mass. This is accomplished using the information on luminosity and features inferred from stellar dynamics encoded in Guo's synthetic catalogue (Guo et al., 2011). Our goal is to perform the decomposition of the galactic components without including additional information about baryons in the training stage beyond the patterns the AI methods can infer from the catalogue. This method can be useful for obtaining the components of observed galaxies whose baryonic dynamics cannot be obtained easily.
Since the mass values span several orders of magnitude, it is well suited to predict the logarithm base 10 of the stellar mass M⋆, the disk mass M_disk and the total mass M_tot. Here, the bulge mass M_bulge is not within the prediction set, since we compute it from the stellar mass M⋆ and the disk mass M_disk through (Guo et al., 2011; Lucia, 2019)

M_bulge = M⋆ − M_disk.    (1)

It is worth stressing that with our method, we aim to predict mass components in spirals following some theoretical assumptions about the features of dark matter. Specifically, by using a mock catalogue as the training set of our AI algorithms, we set dark matter in a CDM prescription and make inferences about mass components holding such a hypothesis. Our algorithms were tested using observational data reported in the SDSS database (Abdurro'uf et al., 2022) to assess how well the predictions match observations, keeping the feature set as simple as possible. We are interested in knowing the advantages and disadvantages of the algorithms when predicting galaxies with different features. This paper is structured as follows: Section 2 presents and analyses the content of Guo's galaxy catalogue to determine the correlation between input features and output predictions, emphasizing their importance during the training stage. Following this, Section 3 introduces the machine learning algorithms considered in this work and explores their dependency on variations in different parameters. Subsequently, in Section 4, we analyze the performance of each algorithm. Then, in Section 5, we apply the trained methods to predict the masses of components in observed galaxies from the Sloan Digital Sky Survey (Abdurro'uf et al., 2022) database. This allows us to derive various scaling relations commonly studied in the literature. Finally, we draw some conclusions in Section 6.
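As a minimal sketch of Eq. (1), note that since the algorithms predict base-10 logarithms of the masses, the subtraction must be carried out in linear space rather than on the logarithms themselves. The function name below is our own choice for illustration:

```python
import numpy as np

def log_bulge_mass(log_m_star, log_m_disk):
    """Bulge mass from Eq. (1), M_bulge = M_star - M_disk.

    Inputs and output are log10 masses (the quantities the
    algorithms predict), so the subtraction is done in linear
    space, not on the logarithms themselves.
    """
    m_bulge = 10.0 ** np.asarray(log_m_star) - 10.0 ** np.asarray(log_m_disk)
    return np.log10(m_bulge)

# e.g. log M_star = 10.5, log M_disk = 10.0 gives log M_bulge ~ 10.33
print(log_bulge_mass(10.5, 10.0))
```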

The Data
To train our machine learning algorithms, we have used Guo's galaxy catalogue (Guo et al., 2011) derived from the Millennium simulation, selecting only galaxies with nonzero bulge or disc components, leading to a set of 833,491 galaxies. This dataset was split randomly, following the common choice of assigning 75% of the total data to training and 25% to evaluating the performance of the algorithms. The Millennium simulation is a dark matter-only simulation, carried out under the ΛCDM prescription (Springel et al., 2006) using a customized version of the Gadget 2 code (Springel et al., 2005) with 2160^3 particles within a box of L = 500 Mpc/h. This catalogue provides information about the merger history of each halo and the baryon content, which has been split into five components: the stellar bulge, the stellar disk, a gas disk, a halo and an ejecta reservoir (Guo et al., 2010).
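The 75%/25% split described above can be sketched with scikit-learn; the arrays below are synthetic stand-ins for the catalogue columns (features such as the ugriz magnitudes and the log-mass labels), not the actual data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Toy stand-in for the catalogue: rows = galaxies, 5 columns = ugriz magnitudes,
# y = a log10 mass label. Sizes here are illustrative only.
X = rng.normal(size=(1000, 5))
y = rng.normal(size=1000)

# 75% / 25% random split, as used for Guo's catalogue in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print(X_train.shape, X_test.shape)  # (750, 5) (250, 5)
```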
The analytical model implemented in Guo's catalogue considers that galaxies form within the central region of dark matter halos. The fitting function, which describes the average baryon fraction of a halo, can be written in terms of its mass and redshift (Gnedin, 2000) as

f_b(z, M) = f_b^cos [1 + (2^(α/3) − 1)(M/M_c(z))^(−α)]^(−3/α),

where the universal baryon fraction is usually taken as f_b^cos ≈ 0.17. Here, M_c represents the characteristic mass of objects which can retain 50% of the gas components to form stars. The reionization and cooling depend on the baryon fraction in a given halo and on its mass and redshift. The disk and bulge formation are correlated with star formation and supernova feedback processes, as well as with black hole growth and AGN feedback. Additionally, mergers between the central and satellite galaxies are described through simulations and play an important role in the disk and bulge evolution. This catalogue accurately reproduces the population and clustering mechanisms observed at z ∼ 0. However, it exhibits inconsistencies for high-redshift populations.
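A minimal sketch of the Gnedin (2000)-style baryon fraction described above. The parameter values (α = 2 and f_b^cos = 0.17) are common illustrative choices, not necessarily the exact calibration used in Guo's catalogue; by construction, a halo of mass M_c retains exactly half of the cosmic fraction:

```python
import numpy as np

def baryon_fraction(m_halo, m_c, f_b_cosmic=0.17, alpha=2.0):
    """Gnedin (2000)-style baryon fraction of a halo of mass m_halo.

    f_b = f_b_cosmic * [1 + (2^(alpha/3) - 1) * (m_halo/m_c)^(-alpha)]^(-3/alpha)

    m_c is the characteristic (filtering) mass at the halo's redshift.
    Parameter defaults are illustrative, not the catalogue's exact values.
    """
    x = (np.asarray(m_halo) / m_c) ** (-alpha)
    return f_b_cosmic * (1.0 + (2.0 ** (alpha / 3.0) - 1.0) * x) ** (-3.0 / alpha)

# At m_halo = m_c the fraction is half of f_b_cosmic:
print(baryon_fraction(1e10, 1e10))  # 0.085
```

For halos much more massive than M_c, the fraction approaches the universal value f_b^cos, while it is strongly suppressed for low-mass halos.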
In this work, we consider galaxies at z = 0. Our goal was to investigate spiral galaxies hosting both bulges and disks; hence, we imposed this strong filter when selecting our sample from the mock catalogue. It is crucial to note that our selection encompasses diverse galaxy types without accounting for age or metallicity. The purpose is to explore the capabilities of the algorithms to get information about the systems by exclusively using photometric information.
The resolution of the simulation delimits the range of masses for each component. Once the selection of bulge-disk galaxies has been performed, the range of the total mass lies between 10^10 M⊙/h and 10^13 M⊙/h. Notably, the selected total mass range excludes galaxies of both low and high masses. Indeed, massive galaxies tend to exhibit an elliptical morphology rather than a spiral one (De Lucia et al., 2006).

Features importance
It is well known that the physical and photometric properties of the stellar population of a galaxy are closely related to its dynamics and the spatial mass distribution of the different components within the system. Specifically, this behaviour is reflected in the colour-magnitude relation. For instance, it has been shown that bulge-dominant galaxies have a colour-magnitude diagram mainly described by red galaxies (Hogg et al., 2004). Besides, in Barsanti et al. (2021), it has been shown that the bulge is redder than the disk in galaxies within a cluster. A similar conclusion was reported in Dimauro et al. (2018).
In machine learning, the training data is defined as the set of feature vectors X and their corresponding labels (outputs) y, drawn from an unknown distribution P(X, y), as follows:

D = {(X_i, y_i)}_{i=1}^{n} ⊂ R^d × I,

where R^d denotes the d-dimensional feature space, I the label space and n is the sample size. In this work, we consider two sets of features, the first corresponding to the u, g, r, i, and z absolute magnitudes, which we dub hereafter Set I. Such magnitudes are also available in the SDSS dataset; therefore, there is an observational counterpart. Within a second set (Set II), the same features as in Set I are considered, in addition to the maximum rotational velocity of the halo, Vmax, to include information about the kinematics. In both cases, the predictions (labels) are M_disk, M⋆, and M_tot, as displayed in Table 1. An exploration of the data was conducted using Pearson's correlation coefficient,

r_{X,Y} = Σ_i (X_i − X̄)(Y_i − Ȳ) / (n S_X S_Y),

where barred symbols represent the mean values and S_X, S_Y are the standard deviations. When this quotient is r_{X,Y} = ±1, we have a perfect positive (negative) correlation, whereas for r_{X,Y} = 0, the parameters are not correlated at all. In Figure 1, the correlation matrix illustrates how the features contribute to the algorithm's predictions. The matrix displays the absolute values of Pearson's correlation coefficient, focusing solely on the strength of the correlation. As anticipated, M⋆ exhibits a high correlation with the magnitudes, particularly with the z and i bands, corresponding to the infrared and near-infrared regions of the spectrum, respectively. Observationally, the determination of the luminous mass is significantly influenced by dust, with emissions in the optical band experiencing reddening. Conversely, this effect is negligible in the near-infrared (Tully et al., 1998). On the other hand, M_disk shows less correlation with the magnitudes compared to M⋆ and exhibits a weak relation with the remaining quantities. Furthermore, M_tot exhibits a strong correlation with M⋆ due to the HAM relation implemented in
the mock catalogues. The correlation between M_tot and Vmax, which encapsulates information about the dynamics of all components, surpasses that of the other masses. Notably, the correlations with M_disk are weak, suggesting that the dark matter component influences the stellar component as a whole rather than each component individually.

Table 1: Input and output features considered for the ML algorithms in this work. Set I corresponds to the photometric information derived from Guo's catalogue using semi-analytical models. Set II includes information about the dynamics of all components.

Set     | Input features        | Outputs
Set I   | u, g, r, i, z         | M_disk, M⋆, M_tot
Set II  | u, g, r, i, z, Vmax   | M_disk, M⋆, M_tot

Figure 1: Heat map of the absolute value of the Pearson correlation coefficient between the galaxy parameters considered in this work. The redder the cell, the higher the correlation. As expected, the strongest correlations occur between the stellar mass and the magnitudes in the different bands. However, a relation also exists between the total mass and the magnitudes, although to a lesser extent.
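The correlation analysis behind Figure 1 can be reproduced with pandas; the columns below are hypothetical stand-ins for the catalogue features, constructed so that the z-band magnitude and the stellar mass are tightly (anti-)correlated, as the text describes:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 5000
# Toy stand-ins for catalogue columns: two nearby bands and a log stellar mass
# roughly following log M_star ~ -0.4 * magnitude (a mass-to-light scaling).
z_mag = rng.normal(-20, 1, n)
df = pd.DataFrame({
    "z": z_mag,
    "i": z_mag + rng.normal(0, 0.2, n),            # bands are tightly related
    "log_Mstar": -0.4 * z_mag + rng.normal(0, 0.1, n),
})

# Absolute Pearson correlation matrix, as displayed in the Figure 1 heat map.
corr = df.corr(method="pearson").abs()
print(corr.loc["z", "log_Mstar"])  # close to 1: strong mass-magnitude correlation
```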

Implementation of the machine learning algorithms
We employed a set of widely used supervised algorithms known for their effective predictions, listed as follows. These methods were implemented using the scikit-learn library (Pedregosa et al., 2011; Buitinck et al., 2013) and the Keras API (Chollet et al., 2015).
• K-Nearest Neighbours (KNN). This algorithm relies on the idea that the set of k nearest data points C_x ⊂ D, where |C_x| = k, have similar values. The neighbours are defined such that

C_x = { X_i ∈ D : d(x, X_i) ≤ d(x, X_(k)) },

where X_(k) denotes the k-th closest point to x. This distance d is defined in the hyperspace of features using the Euclidean metric, and the final value is the average of the neighbours' outputs. In this case, the number of neighbours is a free parameter, and we found that the highest accuracy is achieved when the number of neighbours is close to 18; the error starts to increase beyond that value.
• Linear Regression (LR). The traditional linear regression minimizes the sum of the squared differences between the predicted and actual values. We consider this method in order to compare it with more sophisticated techniques.
• Random Forest (RF). This algorithm is subject to the number of trees and their depth. Each tree contains decision nodes N_m that split the data (X_node, y_node) (in the parent node) into smaller (left and right) subsets in new child nodes C_m^L and C_m^R until the branch finds a homogeneous group according to the set of hyperparameters. Splitting each node in regression follows the minimization of the residual R_m, defined as

R_m = Σ_{i ∈ C_m^L} (y_i − ȳ_m^L)² + Σ_{i ∈ C_m^R} (y_i − ȳ_m^R)²,

where ȳ_m^L and ȳ_m^R are the means of the target values in the child nodes. The split is performed if the minimum R_m is below a defined threshold. Because of how the trees are built, it is easy to overfit; therefore, it is strongly recommended to use an ensemble of trees instead. We used nearly 150 trees for the training.
• Neural Network (NN). A NN is an interconnected group of nodes arranged in layers and connected to other nodes in the network by unidirectional connections of different weights. Patterns learned in a layer are transferred to the next activated nodes. We implement the early-stopping method as a regularization technique to avoid overfitting, stopping the training once the performance no longer improves. The performance is measured by the loss function, which quantifies the discrepancy between the predicted and true values. For a regression, it can be taken as the squared loss function

L(h) = (1/n) Σ_{i=1}^{n} (h(X_i) − y_i)²,    (7)

where h(X) is the function that minimizes the loss associated with the target values, h = argmin L(h). A common assumption is to take h(X_i) = B^T X_i + b, where B are the weight coefficients and b a constant. In this case, we also considered the Lasso regularization method. This technique penalises the model's coefficients, shrinking them or setting them directly to zero, giving rise to a sparse model. Then, Eq. (7) is transformed into

L_lasso(h) = (1/n) Σ_{i=1}^{n} (h(X_i) − y_i)² + λ Σ_{j=1}^{p} |B_j|,

where the last term is subject to Σ_{j=1}^{p} |B_j| < c. The best NN architecture was also obtained by varying the model's hyperparameters: hidden layers between 1 and 3, the number of neurons between 32 and 512 per layer, and the learning rate across the values 10^−2, 10^−3, and 10^−4. The best configuration has three hidden layers with 256, 224, and 352 neurons, respectively, and a learning rate of 10^−4.
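The four regressors, with the hyperparameters quoted above (k = 18 neighbours, ~150 trees, a 256/224/352 network with learning rate 10^−4 and early stopping), can be sketched entirely in scikit-learn. Note this is only a sketch: the paper's NN is built with Keras, so the MLPRegressor here is a stand-in, and the data are synthetic:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))               # stand-in for ugriz magnitudes
y = X @ np.array([0.1, 0.2, 0.3, 0.2, 0.2]) + rng.normal(0, 0.05, 2000)

models = {
    "KNN": KNeighborsRegressor(n_neighbors=18),
    "LR": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=150, random_state=0),
    "NN": MLPRegressor(hidden_layer_sizes=(256, 224, 352),
                       learning_rate_init=1e-4,
                       early_stopping=True,   # stop when validation loss stalls
                       max_iter=200, random_state=0),
}
for name, model in models.items():
    model.fit(X[:1500], y[:1500])            # 75% train / 25% test split
    print(name, round(model.score(X[1500:], y[1500:]), 3))
```

Since the toy target is linear in the features, LR scores best here, mirroring the paper's finding that LR excels precisely when an explicit linear scaling relation exists.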
Testing the algorithms' performance

Relative percentage difference
In Figure 2, we present the relative percentage difference between the logarithm of the actual mass, M_actual, in the mock catalogue and the logarithm of the mass predicted by each algorithm, M_pred. The algorithm dispersion is estimated by using the parameter Δ (Calderon & Berlind, 2019), which can be computed as follows:

Δ = 100 × (log M_pred − log M_actual) / log M_actual.

The results were plotted into bins, for which the mean value is shown as dashed lines, whereas the standard deviation corresponds to the width of the shaded regions around the mean value, µ ± σ. The left panel shows the result when the training was carried out using Set I, while the right side corresponds to Set II.
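The Δ parameter above can be sketched directly; this follows our reading of it as the percentage difference between the predicted and actual log-masses:

```python
import numpy as np

def relative_percentage_difference(log_m_pred, log_m_actual):
    """Delta parameter (Calderon & Berlind, 2019) as used in Figure 2:
    percentage difference between predicted and actual log10 masses."""
    log_m_pred = np.asarray(log_m_pred, dtype=float)
    log_m_actual = np.asarray(log_m_actual, dtype=float)
    return 100.0 * (log_m_pred - log_m_actual) / log_m_actual

# A 0.1 dex offset at log M = 10 is a 1% difference; a perfect prediction is 0%.
delta = relative_percentage_difference([10.1, 11.0], [10.0, 11.0])
print(delta)  # [1. 0.]
```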
The uncertainty bands in the histogram noticeably narrow as the data counts increase, indicating a more accurate prediction. The highest errors for the M_disk and M⋆ predictions (Figs. 2(a) and 2(c)) lie below 10^9 M⊙/h and arise due to the low amount of data. In contrast, for M_tot in Figs. 2(e) and 2(f), the error increases for larger mass values, signifying reliable predictions in the central region around 10^11 − 10^12 M⊙. Notably, the distribution of M_tot is narrower compared to that of M_disk, as depicted in Figs. 2(a) and 2(e). This may be because the sample of galaxies chosen from the mock catalogue satisfies the condition of having a bulge, a criterion fulfilled only by sufficiently massive galaxies.
Most of the predictions exhibit statistical errors centred around zero. In Figs. 2(c) and 2(d), M⋆ displays the smallest percentage difference in both cases, owing to a linear correlation between magnitudes and luminous mass (Reiprich & Boehringer, 2002; Kuiper, 1938; Liebert & Probst, 1987). The LR model shows the best score, since it was trained by directly fitting a scaling relation. In some algorithms, like NN and RF, the error increases to around 1% for masses near 10^10 M⊙/h in Set II.

Figure 3: Histograms of the ugriz-photometric system magnitudes for both simulated and observational data. Vertical dashed lines show the mean value of each distribution. These histograms clearly show that, within the mock sample, the distribution of magnitudes for galaxies significantly differs from that of the SDSS sample. This suggests that the algorithms will explore, for the SDSS sample, a different combination of the features from the training set.
In the context of the disk mass, as shown in Figs. 2(a) and 2(b), the percentage difference is higher compared to the M⋆ case. Nevertheless, it remains within an acceptable prediction range for medium and high masses. Interestingly, the predictions for both sets of features exhibit similar trends. In contrast to the linear fit used for M⋆, the LR method is no longer the optimal choice due to the nonlinear nature of this relationship. Instead, the NN and RF algorithms demonstrate superior training performance for Set I.
Finally, for M_tot, Fig. 2(e) shows that Set I only gives unbiased predictions within the range 10.7 < log M_tot < 12, while for Set II, Fig. 2(f), this holds in the range 10.7 < log M_tot < 12.7. This makes sense physically, as Vmax should be more sensitive for probing higher-mass halos above M_tot = 10^12. The correlation between the magnitudes in Set I and M_tot is not straightforward. However, since the mock catalogues follow the HAM relation, a correlation exists between M_tot and M⋆, consequently influencing the magnitudes. This correlation contributes to achieving favourable results in predicting the total mass. In this context, the NN yields the best performance for Set I, given the absence of an explicit scaling relation, while for Set II, all the predictions are similar.
After analyzing the performance of the predictions for Set I and Set II, we concluded that the latter does not significantly improve the results. As mentioned, the main enhancement is observed for M_tot. Additionally, obtaining information about Vmax for galaxies can be challenging due to the system's dynamics. Therefore, in the interest of simplicity, we have opted for Set I exclusively from this point on.

Predictions for observed SDSS galaxies
Up to this point, we have assessed the training performance using synthetic data. In this section, we apply the trained NN to predict the masses of different components in real galaxies from the SDSS survey (Abdurro'uf et al., 2022). It is crucial to note that galaxies from the mock catalogue have specific limits for the ugriz-magnitudes, which are directly tied to the resolution of the simulations. This dependence arises because the halo masses, and consequently the stellar masses, influence the ability of the semi-analytical models to assign magnitudes in certain regions. In contrast, observed galaxies from SDSS exhibit limitations in the low-surface-brightness regime due to challenging observational features (Willman et al., 2002; Williams et al., 2016).
Fig. 3 shows the distribution of each magnitude for both the SDSS and Guo's galaxy catalogues. As previously mentioned, observed galaxies exhibit high luminosity, causing a shift in the mean value of each magnitude compared to synthetic galaxies. Since both samples do not fall within the same ranges, we focus on regions where we have information from both observations and simulations. Indeed, the literature has reported that NNs behave as interpolators (Saxe et al., 2019; Spigler et al., 2018). Therefore, the sample of observed galaxies to be assessed by the algorithm should have input features within the same domain as the training and test mock datasets.
We selected a galaxy catalogue from the SDSS database, with information about 660,000 galaxies and their morphological components (Mendel et al., 2014). The masses listed there were determined by fitting a broadband spectral energy distribution. This process involved making assumptions about the initial mass function, the extinction law and the stellar evolution. In that catalogue, the bulge-disk brightness profiles were reconstructed using the photometric decomposition method with the Sérsic profile

I(R) = I_e exp{ −b_n [ (R/R_e)^(1/n) − 1 ] },

where R_e is the half-light radius and I_e the intensity at that radius. Here, n is known as the Sérsic index and controls the curvature of the profile, while b_n is a constant depending on n. The magnitudes used in the prediction stage were obtained from the SDSS DR7 (Abdurro'uf et al., 2022). We converted the apparent magnitude (m) to the absolute magnitude (M) using (Schneider, 2006)

M = m − 5 log_10(d),

where d is the distance to the source in units of 10 parsecs. Distances were computed using the Python library Astropy (The Astropy Collaboration et al., 2013) with the redshift reported in NED and assuming the cosmological parameters from Planck 2018 (Planck Collaboration et al., 2020), H0 = 67.66 km/s/Mpc and Ωm0 = 0.26. Our analysis focused on about 70% of the total dataset, concentrating on galaxies with the u, g, r, i, and z magnitudes.
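The apparent-to-absolute magnitude conversion can be sketched as follows. For brevity, this sketch replaces Astropy's luminosity distance with the low-redshift Hubble-law approximation d ≈ cz/H0, which is not what the paper uses but agrees with it at small z:

```python
import numpy as np

C_KM_S = 299792.458   # speed of light [km/s]
H0 = 67.66            # Planck 2018 value quoted in the text [km/s/Mpc]

def absolute_magnitude(m_app, z):
    """Apparent -> absolute magnitude, M = m - 5 log10(d / 10 pc).

    The distance here is the low-redshift Hubble-law approximation
    d ~ c z / H0; the paper instead uses the full luminosity distance
    from Astropy with the Planck 2018 parameters.
    """
    d_mpc = C_KM_S * z / H0       # distance in Mpc
    d_10pc = d_mpc * 1.0e5        # 1 Mpc = 10^6 pc = 10^5 units of 10 pc
    return m_app - 5.0 * np.log10(d_10pc)

# A galaxy at z ~ 0.0226 lies at ~100 Mpc, so m = 17 maps to M ~ -18.
print(absolute_magnitude(17.0, 0.02258))
```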
A valuable piece of information for describing the evolution and structure of galaxies is the set of scaling relations between physical quantities of a galaxy sample. We analyse the scaling relation between the mass components and the r-magnitude, as reported in other works (Mahajan et al., 2017a; Venhola, Aku et al., 2019; Mahajan et al., 2017b; Côté et al., 2015), since this band is the best correlated with stellar mass among the SDSS filters. The relations for magnitudes in other bands are similar. We also study the M_bulge − M_disk relation as well as the M⋆ − M_tot one. We only employ the NN algorithm to obtain the results presented in this section, given that it performs better, with fewer errors, and its construction involves a more complete architecture than the other AI algorithms.

Mass-magnitude relation
In Fig. 4, we present distributions projected onto the M⋆−rmag and M_disk−rmag planes, and the M_bulge−rmag relation for completeness. In each case, distributions up to 2σ for three datasets are shown: first, from the original mock catalogue in blue; second, from the original SDSS catalogue in green; and third, the NN predictions for the SDSS galaxies in red. The contours represent the 99% and 95% confidence levels. For plotting these figures, we use the whole set of spiral galaxies in the mock catalogue; nevertheless, the masses reported in the observed catalogue fall within the regions depicted in Fig. 2. First, Fig. 4(a) illustrates the scaling relation between M⋆ and rmag. The NN predictions agree with the real values up to the 95% C.L. However, as we approach more massive galaxies, and consequently the resolution limit of the simulations, the error increases. Overall, the NN exhibits accurate predictions for M⋆, consistent with Fig. 2(c). Indeed, the best-fit slopes for each dataset show only slight differences. For M_disk, in Fig. 4(b), we distinguish a possibly bimodal distribution, with two regions for the simulations lying inside the mass range 7 < log M_disk < 9. There is a separation between both blobs due to the lack of information at rmag ∼ −19. There is an acceptable agreement within the 95% C.L. for galaxies in the low-surface-brightness region.
Here, it is worth mentioning that in all cases the masses predicted by the NN fall within the region of the simulated masses, as expected. However, for M_disk, we observe that the red 2σ curve extends outside the blue and green regions for masses below 10^9 M⊙/h and above 5 × 10^10 M⊙/h. This is related to the fact that the output masses are distributed in a three-dimensional space (disk-bulge-stellar) and we are showing the projections over a single input parameter.
Bulge masses for the brightest galaxies within the same mass range are not well predicted and are excluded by the NN architecture. This region corresponds to quasi-elliptical systems with large mass but small disks (see Fig. 4(b)). In this case, the NN predicts that this type of system is unlikely, and in fact, it would be challenging to distinguish the disk from the bulge without an accurate numerical method. This conclusion is supported by panels (a) and (c), where the prediction aligns with the expected result for more than 95% of the data. However, the missing points in panel (b) are compensated by the excess points in panel (c). This suggests that purely elliptical systems provide a better description of these cases. This behaviour is also reflected in Fig. 2(a), where the error increases for masses below 10^9 M⊙/h.
Additionally, the fact that the neural network (NN) predicts well the stellar mass of SDSS galaxies (Fig. 4(a)) serves as a consistency test between the mock catalogue and the NN. However, the predictions for small disk components deviate from the SDSS catalogue, which can suggest that the features are insufficient for training the NN or that the catalogue needs further precise information about the components. The values for M_bulge in Fig. 4(c) are derived from Eq. (1) and from the values of M⋆ and M_disk directly inferred by the NN. We observe an acceptable agreement between observations and simulations up to the 95% C.L.

Bulge-disk components
The relation between the luminous mass and the bulge-disk masses is described by Eq. (1). M⋆ can be determined by a scaling relation (see Fig. 4). Thus, for a specific value of M_bulge, M_disk will only take values within certain intervals, and vice versa.
Figure 5 shows the bulge and disk masses of galaxies within both datasets. The mock catalogue shows a trimodal distribution. The most prominent region, for M_disk > 10^8 M⊙/h, corresponds to low values of M_bulge and is associated with disk-dominated galaxies. The second region, for M_bulge > 10^10 M⊙/h, is the bulge-dominated region (Conselice, 2006). This sort of galaxy is usually dubbed cD-like (central dominant) (Guo et al., 2011; Oemler Jr, 1976). This behaviour arises in both observed and simulated galaxies, although disk-dominant galaxies are more abundant in both cases.
The third region in the M_bulge − M_disk plane, corresponding to galaxies with both small disks and bulges, appears only in the mock data. This discrepancy suggests that there may be an observational bias, because current telescopes might not be able to detect the low-luminosity galaxies that appear in the numerical simulations.
The NN prediction is also shown in Fig. 5. For this last sample, the relationship between disk and bulge is nonlinear and not readily fitted with an analytic function, as happens with the scaling relations derived in Section 5.1. This can be due to the multimodal distribution, suggesting that different scaling relations between the bulge-disk mass components might arise for different galaxies within the sample. Nevertheless, the machine learning algorithm can make good predictions for disk-dominant galaxies. Furthermore, it is interesting that the NN algorithm gives rise to mass predictions consistent with the SDSS distribution and, as expected, does not predict bulge-dominant galaxies. Providing more accurate information about larger bulges would likely require modelling more complicated dynamics.

Conclusions
It is well known that the bulge-disk decomposition and the estimation of the total mass of galactic systems are complicated tasks that have been tackled by considering several assumptions. This paper presents an alternative method to make such estimations, based on AI algorithms designed to predict the masses of different components in spiral galaxies. Two sets of input features were considered in the first stage of our analysis. In the first set, the magnitudes in different bands were considered, while in the second one, Vmax was added to include information about the system's kinematics. After analyzing the performance of the trained algorithms and testing the importance of the parameters, we found that Vmax is only relevant for computing the galaxy's total mass. This suggests that the absolute magnitudes of the galaxies provide sufficient information to predict the masses of galactic components. Therefore, these methods can be readily applied to estimate the masses of observed galaxies from different datasets.
These methods were used to predict the masses of different components of real spiral galaxies within the SDSS catalogue. Across the full SDSS sample, the magnitudes and masses span higher values than those used in our analysis; we therefore chose a subsample whose parameters lie within the same ranges as our training mock catalogue. During the training and test stages, the NN not only provided estimations for the masses of different galaxy components but could also predict scaling relations (mass-magnitude), achieving at least a 95% C.L. agreement with observational data.
For either observational or mock samples of galaxies, our analysis and algorithm are applicable only within a small range of total mass, given that only galaxies with nonzero bulges and disks were selected for the sample. We found that the NN algorithm shows good predictions for disk-dominant galaxies considering only the photometric information. Additionally, at the 95% confidence level, the NN predicts that bulge-dominant galaxies are unlikely, consistent with the lack of information from observations. This training can be improved using additional information about the galaxy's dynamics, age or chemical composition. However, using only the photometric information can be useful to obtain a sufficiently good estimation of the mass components in spiral galaxies with bulges.
We will continue this project by constructing machine learning algorithms trained with features inferred directly from observational data, in order to obtain more accurate results and to explore the HAM relation in real galaxies and its possible dependency on the morphology or age of the systems.

Figure 2: Relative percentage difference for the predictions of the different machine learning algorithms with respect to the actual values in the mock catalogues. Set I is displayed on the left, and Set II on the right. The histograms in the figures represent the distribution of the data. As expected, the predictions are better where the data density is higher. The lines represent the mean value µ, and the bands are one standard deviation from the mean value, µ ± σ.

Figure 4: Kernel density estimation (KDE) plots of the stellar (a), disk (b), and bulge (c) mass components versus the r-magnitude, for the simulated data in blue and the observational data in green. The red lines are the {95, 99}% confidence level (C.L.) contours of the NN predictions. It can be noticed that the NN prediction is more accurate for the stellar mass and for disk-dominant galaxies, since the agreement is achieved up to the 99% C.L. Even though the prediction for the bulge mass is less precise than for the other components, the NN achieves a good agreement up to the 95% C.L.

Figure 5: KDE plots of the bulge-disk decomposition. The distributions for simulated (blue) and observed (green) data and the solid contour levels are shown. We observe a possible trimodal distribution for the mock catalogue. In contrast, the observations show a unimodal distribution similar to the NN distribution predicted for the observational data (red contours).
The best-fit slopes for the mock catalogue, SDSS, and NN predictions differ only slightly; for the mock M⋆−rmag relation, the slope is approximately −0.427.