Translating neutron star observations to nuclear symmetry energy via artificial neural networks

One of the most significant challenges involved in efforts to understand the equation of state of dense neutron-rich matter is the uncertain density dependence of the nuclear symmetry energy. Because of its broad impact, pinning down the density dependence of the nuclear symmetry energy has been a longstanding goal of both nuclear physics and astrophysics. Recent observations of neutron stars, in both electromagnetic and gravitational-wave spectra, have already constrained significantly the nuclear symmetry energy at high densities. Training deep neural networks to learn a computationally efficient representation of the mapping between astrophysical observables of neutron stars, such as masses, radii, and tidal deformabilities, and the nuclear symmetry energy allows its density dependence to be determined reliably and accurately. In this work we use a deep learning approach to determine the nuclear symmetry energy as a function of density directly from observational neutron star data. We show for the first time that artificial neural networks can precisely reconstruct the nuclear symmetry energy from a set of available neutron star observables, such as, masses and radii as those measured by, e.g., the NICER mission, or masses and tidal deformabilities as measured by the LIGO/VIRGO/KAGRA gravitational-wave detectors. These results demonstrate the potential of artificial neural networks to reconstruct the symmetry energy, and the equation of state, directly from neutron star observational data, and emphasize the importance of the deep learning approach in the era of Multi-Messenger Astrophysics.


I. INTRODUCTION
Understanding the equation of state (EOS) of dense neutron-rich matter in terms of the fundamental interactions between its constituents is an extraordinarily challenging problem and represents a key outstanding question in modern physics and astrophysics [1][2][3].Due to its broad ramifications for many important phenomena, ranging from understanding the heavy ion collision dynamics in nuclear laboratories to the most violent cosmic events, such as binary neutron star (BNS) mergers and supernovae, to gravitational waves, the determination of the EOS of dense matter has been a major shared goal of both the nuclear physics (see, e.g., Refs.[4][5][6][7][8][9][10][11][12][13]) and astrophysics (see, e.g., Refs. ) communities.It has been a primary scientific thrust for establishing key research facilities in astrophysics [1] and nuclear physics [2], such as large ground-based telescopes, advanced X-ray space-borne observatories, the Neutron Star Interior Composition Explorer (NICER) * Plamen G. Krastev: plamenkrastev@fas.harvard.edu[36], LIGO/VIRGO/KAGRA [37][38][39] gravitational-wave detectors, and all advanced radioactive beam laboratories around the globe.
In cold neutron star matter, the nucleonic component of the EOS can be written in terms of the energy per nucleon ρ as [40] where E SN M (ρ) = E(ρ, 0) is the energy per nucleon of symmetric nuclear matter (SNM), E sym (ρ) is the symmetry energy, and δ = (ρ n − ρ p )/ρ is the isospin asymmetry, with ρ n , ρ p , and ρ = ρ n + ρ p being the neutron, proton, and total density, respectively.Presently, the EOS of cold nuclear matter under extreme conditions of density, pressure, and/or isospin asymmetry still remains rather uncertain and theoretically controversial, particularly at supra-saturation densities, mainly due to the poorly known high-density behavior of the nuclear symmetry energy E sym (ρ) [4,5].
To determine the EOS from first principles, we need to solve quantum chromodynamics (QCD), which is the fundamental theory of strong interactions.However, at present, model-independent results are only available for a rather limited density range.At low densities of ρ ∼ [1 − 2]ρ 0 [175], we can use ab initio approaches together with nuclear interaction derived from Chiral Effective Theory (χEFT) with controlled uncertainty estimates [41][42][43][44][45][46][47][48][49][50].At asymptotically high densities of ρ > ∼ 50 ρ 0 , perturbative QCD calculations converge and provide reliable results [51][52][53][54][55][56][57].At intermediate densities of ρ ∼ [2 − 10]ρ 0 , however, there are still no reliable QCD predictions [58].To derive the EOS from QCD in the intermediate-density region, one needs to develop non-perturbative approaches, such as the Monte Carlo simulation of QCD on a lattice (lattice QCD), but the application of these methods to systems at finite densities is hindered by the notorious sign problem; see, e.g., Ref. [59].Therefore, at intermediate densities, the EOS construction still relies on phenomenological approaches employing a variety of many-body methods and effective interactions, such as the relativistic mean field theory, and density functionals based on the Skyrme, Gogny, or Similarity Renormalization Group (SRG) evolved interactions.
Meanwhile, we have witnessed extraordinary progress in efforts to constrain the high-density EOS from both nuclear laboratory experiments with radioactive beams, and multi-messenger astrophysical (MMA) observations of neutron stars.In particular, extensive analyses of experimental data of heavy-ion reactions from intermediate to relativistic energies, especially various forms of nucleon collective flow and the kaon production, have already constrained significantly the EOS of SNM up to approximately 4.5 ρ 0 ; see, e.g., Ref. [6].In addition, thanks to the great efforts and collaboration of both the nuclear physics and astrophysics communities, significant progress has been made in the last two decades in constraining the symmetry energy around, and below, nuclear matter saturation density using results from both astrophysical observations and terrestrial nuclear experiments; see, e.g., Refs.[5,[9][10][11][60][61][62][63].However, the poorly known density dependence of the nuclear symmetry energy E sym (ρ) at supra-saturation densities, and the possible hadron-to-quark phase transition, still remain the most uncertain aspects of the EOS of dense matter [4,5,21,22,24].Furthermore, the appearance of various new particles, such as hyperons and resonances, is also strongly dependent upon the high-density trend of E sym (ρ) [64][65][66][67][68][69][70][71][72][73][74][75][76][77][78][79].Because, above the hadron-toquark transition density, the nuclear symmetry energy would naturally lose its physical meaning, it is critical to determine simultaneously both the high-density behavior of the symmetry energy and the detailed properties of the hadron-to-quark phase transition, analyzing combined data from astrophysical observations and nuclear laboratory experiments [5].
Recent MMA observations of neutron stars provide unique means to probe the high-density EOS and, in particular, the E sym (ρ) at densities currently inaccessible in the nuclear laboratories.Moreover, these new advances in neutron star (NS) observations have opened an alter-native pathway for the model-independent extraction of the symmetry energy, and the EOS, via statistical approaches (see, e.g., Refs.[35,[80][81][82][83][84][85]).These observations include the Shapiro delay measurements of massive ∼2M pulsars [86][87][88], the radius measurement of quiescent low-mass X-ray binaries and thermonuclear bursters [80-82, 89, 90], the X-ray timing measurements of pulsars by the NICER mission [91,92], and the detection and inference of gravitational waves from compact binary mergers involving NSs by the LIGO/VIRGO/KAGRA collaboration [93][94][95].Typical NS observables include mass M , radius R, moment of inertia I, quadrupole moment Q, dimensionless tidal deformability Λ (and derivatives, e.g., Love number k 2 and tidal deformability λ), and compactness M/R.Specifically, the NICER mission aims at the compactness M/R of NSs by measuring the gravitational lensing of the thermal emission from the stellar surface.On the other hand, gravitational-wave (GW) observations of BNS and neutron star-black hole (NSBH) mergers provide information on the tidal disruption of the star in the presence of its companion, quantified by the tidal deformability parameter λ.Some of these NS observables are related via EOS-independent universal relations, such as the well-known I-Love-Q relation, which relates I, k 2 , and Q [96,97].
There is a plethora of diverse statistical approaches to construct the most probable EOS from NS observational data, with the Bayesian inference [35,[80][81][82][83][84] as the most commonly used technique at the present.There are also other methods, such as those based on the Gaussian processes, which are variants of the Bayesian inferences with nonparametric representation of the EOS; see, e.g., Ref. [85].Despite the significant effort to extract the genuine EOS from the NS astrophysical data, it is still unclear what the true dense matter EOS should look like, mainly due to the uncertainties of the assumed prior distributions in the Bayesian analyses [58].Therefore, the need arises for alternative approaches to construct the model-independent EOS.Recently, approaches based on deep neural networks (DNNs) have gained interest in the research community and have been extensively explored and applied in a wide range of scientific and technical domains.Deep learning (DL) algorithms, a subset of machine learning (ML), are highly scalable computational techniques with the ability to learn directly from raw data, employing artificial neurons arranged in stacked layers, named neural networks, and optimization methods based on gradient descent and back-propagation [98,99].These techniques, especially with the aid of GPU computing, have proven to be highly successful in tasks such as image recognition [100], natural language processing [101], and recently also emerged as a new tool in engineering and scientific applications, alongside traditional High-Performance Computing (HPC) in the new field of Scientific Machine Learning [102].DL approaches have been already successfully applied in a wide range of physics and astrophysics domains; see, e.g., Refs.[103][104][105][106][107][108][109][110][111][112][113].In particular, DL has been applied in GW data analysis for the detection [114][115][116][117][118][119][120][121], parameter estimation [122,123], and denoising [124] of GW signals from compact binary mergers.In previous works [125,126], we also pioneered the use of DL methods, specifically Convolutional Neural Network (CNN) [127] algorithms, for the detection and inference of GW signals from BNS mergers embedded in both Gaussian and realistic LIGO noise.Several studies have also explored the DL approach as a tool to extract the dense matter EOS from NS observations [58,[128][129][130][131].
In this work, we explore a DL approach to extract the nuclear symmetry energy E sym (ρ), the most uncertain part of the EOS, directly from NS astrophysical data.Specifically, we train DNNs to map pairs of NS mass and radius M − R, or stellar mass and tidal deformability M − Λ, to E sym (ρ).We show, for the first time, that DL can be used to infer the nuclear symmetry energy directly from NS observational data accurately and reliably.Most importantly, we show that DL algorithms can be used successfully to construct a model-independent E sym (ρ) and therefore determine precisely the density dependence of the nuclear symmetry energy at supra-saturation densities.These results are a step towards achieving the goal of determining the EOS of dense neutron-rich matter, and emphasize the potential and importance of this DL approach in the MMA era, as an ever-increasing volume of NS observational data becomes available with the advent of the next generation of large telescopes and GW observatories.
This paper is organized as follows.After the introductory remarks in this section, in Section II, we discuss the main features and parameterization of the EOS applied in this work.In Section III, we briefly recall the formalism for solving the structure equations of static NSs and calculating the tidal deformability.In Section IV, we discuss the DL approach used to map the NS observables to the nuclear symmetry energy.We present our results in Section V.At the end, we conclude in Section VI with a short summary and outlook on future investigations.
Conventions: We use units in which G = c = 1.

II. EQUATION OF STATE
The EOS is the major ingredient for solving the NS structure equations and calculating global stellar properties, such as mass M , radius R, and dimensionless tidal deformability Λ.The most commonly used theoretical approaches to determine the nuclear EOS fall into two major categories-phenomenological and microscopic methods.Phenomenological approaches are based on effective interactions constructed to describe the ground state of finite nuclei and therefore applications to systems at high isospin asymmetries must be considered with care [132].Moreover, at large densities, no experimental data are available to constrain such interactions and therefore predictions based on these methods could be very different from the realistic behavior.Among the most used phenomenological approaches are methods based on Skyrme interactions [133,134] and relativistic mean-field (RMF) models [135].On the other hand, microscopic approaches start with realistic two-body and three-body nucleon forces that describe accurately freespace nucleon scattering data and the deuteron properties.Such interactions are either based on mesonexchange theory [136,137], or recent χEFT [46,[138][139][140].The major challenge for the many-body methods is the treatment of the short-range repulsive core of the nucleon-nucleon interaction, and this represents the difference among the available techniques.Among the most well-known microscopic many-body methods are the Brueckner-Hartree-Fock (BHF) approach [141] and its relativistic counterpart, the Dirac-Brueckner-Hartree-Fock (DBHF) theory [142,143], the variational approach [144], the Quantum Monte Carlo technique and its derivatives [145,146], the self-consistent Green's function technique [147], the χEFT [49], and the V low k approach [148].
Around saturation density ρ 0 , the E SN M (ρ) and E sym (ρ) predicted by many-body theories can be Taylor expanded as with x ≡ (ρ − ρ 0 )/3ρ 0 .The expansion coefficients in these expressions can be constrained by nuclear experiments and have the following meanings [149]: are the binding energy, incompressibility, and skewness of SNM; , and J sym ≡ [27ρ 3 d 3 E sym /dρ 3 ] ρ0 are the magnitude, slope, curvature, and skewness of the symmetry energy at ρ 0 .Currently, the most probable values of these parameters are as follows: MeV, and −200 ≤ J sym ≤ 800 MeV; see e.g., Ref. [150].Although, at higher densities, the Taylor expansions diverge themselves [151], Equations ( 2) and (3) can also be viewed as parameterizations where, in principle, the parameters are left free [150].In this respect, the above relations have dual meanings.Namely, for systems with low isospin asymmetries, they are Taylor expansions near the saturation density, while, for very neutron-rich systems at supra-saturation densities, they should be regarded as parameterizations [150].For further discussion on the relationship between the Taylor expansions and the parameterizations adopted in our analysis, the reader is referred to, e.g., Ref. [150].These parameterizations are often used in metamodeling of the NS EOS and have been applied previously, for instance, in solving the NS inverse-structure problem and constraining the highdensity symmetry energy by astrophysical observations of NSs [150,152].The NS EOS metamodel has been also applied in Bayesian analyses to extract the most probable values of the high-density parameters of the EOS, where the posterior Probability Distribution Functions (PDFs) of the EOS parameters and their correlations are inferred directly from NS observational data [153].The parameterizations described here have the advantage, over the widely used piecewise polytropes for directly parameterizing the pressure as a function of energy or baryon density of NS matter, of keeping the isospin dependence of the EOS, and they explicitly retain information on the composition for the whole density range, without losing the ability to model a wide range of EOSs as predicted by various many-body approaches.This feature of the metamodeling approach is particularly important for inferring the high-density symmetry energy parameters, or directly E sym (ρ), as it clearly separates the contribution of E sym (ρ) to the EOS.
For the purpose of our analysis, in the present study, we adopt the EOS metamodel described briefly above and, by varying the EOS parameters, generate a large number of EOSs and corresponding M − R, or M − Λ sequences, by solving the NS structure equations.We assume a simple model of matter in the NS core consisting of protons, neutrons, electrons, and muons (the npeµmodel) in β-equilibrium.With E SN M (ρ) and E sym (ρ), parameterized by Equations ( 2) and (3), E(ρ, δ) is calculated through Equation (1).Then, the pressure of NS matter in β-equilibrium can be computed from the energy density ε(ρ, δ) = , where M N is the average nucleon mass and ε l (ρ, δ) is the lepton energy density.Details for calculating ε l (ρ, δ) can be found in, e.g., Ref. [154].Below approximately 0.07 f m −3 , the core EOS is supplemented by a crustal EOS, which is more suitable at lower densities.For the inner crust, we apply the EOS by Pethick et al. [155] and, for the outer crust, the one by Haensel and Pichon [156].
We use Equations ( 2) and (3) as parameterizations, together with the parabolic approximation of the nucleonic EOS Equation (1), and fix E 0 , K 0 , and S 0 at their most probable currently known values from nuclear laboratory experiments and/or nuclear theories.Since the main focus of this analysis is specifically on extracting E sym (ρ) from NS observables, we fix the SNM EOS, E SN M (ρ), by setting J 0 = 0 MeV, and vary only E sym (ρ).The J 0 parameter controls the stiffness of the SNM EOS and in turn the maximum NS mass M max of the resultant stellar models.The maximum NS mass of 2.14 M observed so far [88] requires J 0 to be larger than −200 MeV, depending slightly on the symmetry energy parameters [157].At present, the predicted range of J 0 still has relatively large uncertainties [157], which also partially justifies our choice of setting J 0 = 0.With E SN M (ρ) kept fixed, the EOS is therefore solely determined by E sym (ρ).The EOS of SNM E SN M (ρ) is shown in the left window of Figure 1.For the purpose of our analysis, we also set the symmetry energy skewness parameter J sym = 0 MeV.This choice is also partially justified by the very large uncertainty range of J sym at present [157], but the main reason for setting J sym = 0 is to simplify our problem.In following works, we plan to consider the effect of both J 0 and J sym .This would allow for the modeling of a wider class of EOSs as predicted by various many-body approaches, and models of the nuclear interaction.Together with using realistic NS astrophysical observations, this data-driven approach would allow for extracting realistic symmetry energies, and in turn the EOS.With the above choices, we subsequently vary the symmetry energy parameters L and K sym to generate many samples of E sym (ρ), and the EOS.The effect of varying the L and K sym parameters on the symmetry energy is illustrated in the middle and right windows of Figure 1.While, in principle, these parameters are absolutely free, the asymptotic boundary conditions of the EOS near ρ 0 and δ = 0 provide some prior knowledge of the ranges of these parameters.The ranges of L and K sym are further restricted by imposing the requirement that the EOSs satisfy causality, and the resultant NS models can support a maximal mass of at least 2.14 M .The ranges of the symmetry energy E sym (ρ) and pressure P satisfying all constraints are shown in Figure 2.

III. NEUTRON STAR STRUCTURE EQUATIONS AND TIDAL DEFORMABILITY
In this section, we briefly review the formalism for calculating the NS mass M , radius R, and tidal deformability λ.For a spherically symmetric relativistic star, the Einstein's field equations reduce to the familiar Tolaman-Oppenheimer-Volkoff (TOV) [158] equation: where the gravitational mass within a sphere of radius r is determined by To proceed with the solution of the above equations, one needs to provide the EOS of stellar matter in the form p(ε). Starting from some central energy density ε c = ε(r = 0) at the center of the star, with the initial condition m(0) = 0, Equations ( 5) and ( 6) can be integrated until p vanishes, signifying that the edge of the star has been reached.Some care should be taken at r = 0 since, as seen above, the TOV equation is singular there.The point r = R where p 0 vanishes defines the NS radius and M = m(R) = 4π R 0 ε(r )r 2 dr determines the NS gravitational mass.
For a given EOS, there is a unique relationship between the stellar mass and the central density ε c .Thus, for a particular EOS, there is a unique sequence of NSs parameterized by the central density (or equivalently the central pressure p c = p(0)).The range of the M − R relation computed with the EOSs considered in this work is shown in the left window of Figure 3.
The tidal deformability λ is a parameter quantifying the tidal deformation effects experienced by NSs in coalescing binary systems during the early stages of an inspiral.This parameter is defined as [159][160][161] where Q ij is the induced mass quadruple moment of an NS in the gravitational tidal field E ij of its companion.
The tidal deformability can be expressed in terms of the NS radius, R, and dimensionless tidal Love number, k 2 , as The tidal Love number k 2 is calculated using the following expression [162,163]: where β ≡ M/R is the dimensionless compactness parameter and y R ≡ y(R) is the solution of the following first-order differential equation (ODE): with where c 2 s (r) ≡ dp(r)/dε(r) is the squared speed of sound.Starting at the center of the star, for a given EOS, Equation (10) needs to be integrated self-consistently together with Equations ( 5) and (6).Imposing the additional boundary condition for y at r = 0 such that, y(0) = 2, the Love number k 2 and the tidal deformability λ can be readily calculated.One can also compute the dimensionless tidal deformability Λ, which is related to the compactness parameter β and the Love number The range of Λ as a function of the stellar mass is shown in the right window of Figure 3.The total tidal effect of two neutron stars in an inspiraling binary system is given by the mass-weighted (dimensionless) tidal deformability (see, e.g., Refs.[159,161]): where Λ 1 = Λ 1 (M 1 ) and Λ 2 = Λ 2 (M 2 ) are the (dimensionless) tidal deformabilities of the individual binary components.As pointed out previously [159], although Λ is calculated for single neutron stars, the universality of the neutron star EOS allows us to predict the tidal phase contribution for a given binary system from each EOS.For equal-mass binary systems, Λ reduces to Λ.The weighted (dimensionless) deformability Λ is usually plotted as a function of the chirp mass for various values of the asymmetric mass ratio η = M 1 M 2 /M 2 T , where M T = M 1 +M 2 is the total mass of the binary.

IV. DEEP NEURAL NETWORKS (DNNS)
In this section, we briefly discuss the basic setup, structure, and workflow associated with implementing DNNs for our specific application.For more extensive discussions, the reader is referred to a number of machine learning articles [98,164] and textbooks [99,165].
Deep neural networks consist of processing units, named neurons, which are arranged in one to several layers (Figure 4a).A neuron acts as a filter, performing a linear operation between the neurons in the previous layer and the weights associated with the neuron.A DNN typically has an input layer, followed by one or more hidden layers, and a final layer with one or more output neurons.As illustrated in Figure 4a, in a feedforward DNN, calculations progress (from left to right) starting at the input layer and moving successively through the hidden layers until reaching the output layer.In classification problems, the output neurons give the probabilities that an input sample belongs to a specific class.In regression problems, the output layer returns estimates of one or several target parameters.Each neuron in a DNN performs a simple linear operation.Namely, for input values x i from the previous layer, it outputs a single value a = f ( x i w i + b); see Figure 4b,c.Activation functions f depend on the specific application but are typically chosen to be nonlinear or piecewise functions, such as the sigmoid or hyperbolic tangent functions [166]  w i +b) that is broadcasted to every neuron in layer m + 1 [165].
In this analysis, we apply a DL approach and formulate a regression problem, where the inputs to the DNNs consist of M (R) or M (Λ) sequences, while the outputs consist of E sym (ρ) estimates.Accordingly, the data sets consist of E sym (ρ) samples and M (R), or M (Λ), sequences.We use the EOS metamodel discussed in Section II, with the SNM part of the EOS kept fixed, and vary only the symmetry energy.In particular, we set E 0 = 15.9MeV, K 0 = 240 MeV, J 0 = 0 MeV, S 0 = 31.7 MeV, J sym = 0, and vary only L and K sym in Equations ( 2) and (3).Specifically, the values of L and K sym are sampled randomly from their respective ranges of [30.6-86.8]MeV and [−400-100] MeV.Recently, the latest results of the PREX collaboration suggested a rather high value of L with an upper limit at 143 MeV [167].Examining the effect of higher L values is left to following works.The resultant EOSs p(ε) are checked regarding whether they satisfy (i) the microscopic stability condition, i.e., dp dε ≥ 0, and (ii) the causality condition, i.e., the speed of sound c s ≡ dp dε ≥ c, which restricts the values of L and K sym and the E sym (ρ) samples.For each EOS, the NS structure equations are solved to obtain M (R) and M (Λ) sequences.To simulate NS observational data, from a given genuine M −R (or M −Λ) sequence, we randomly choose 50 points in the range of 1M to 2M [176].Then, each sample input is a vector of dimension 100, with the two arrays of M and R (or M and Λ) values concatenated.Similarly, each output sample is a vector of dimension 100 representing an estimated E sym (ρ) in the density range of ∼0.5ρ/ρ 0 to 5ρ/ρ 0 .In this respect, the DNN maps an input M (R) or M (Λ) sequence to an output E sym (ρ).Realistic NS observations inevitably accrue errors, which result in corresponding uncertainties when reconstructing the symmetry energy, and/or the EOS.For the purpose of this analysis, however, we do not take into account NS observational errors and uncertainties.This work should be regarded as a proof-of-concept study, and the application to realistic NS data is left to a future article.
In supervised learning, the data sets are divided into training, validation, and testing data.The training data set is used by the DNN to learn from, the validation data are used to verify whether the network is learning correctly, and the testing data are used to assess the performance of the trained model.The training data sets used in this work consist of 40,000 independent M (R), or M (Λ), sequences representing the DNN inputs, and 40,000 matching E sym (ρ) samples representing the DNN outputs.The validation and testing data sets consist of ∼1000 input samples and the same number of output samples each.
The neural networks used here are feedforward DNNs with 10 hidden, dense, fully connected layers of dimension 100, and ReLU activation functions.The first layer has a linear activation function and corresponds to the input to the neural network, which, in this case, is a one-dimensional concatenated vector containing the NS M and R (or Λ) values for a given M (R) (or M (Λ)) sequence.At the end, there is a linear output layer of dimension 100 returning the estimated E sym (ρ).The network design was optimized by fine-tuning multiple hyperparameters, which include here the number and type of network layers, the number of neurons in each layer, and the type of activation function.The optimal network architecture was determined through multiple experiments and tuning of the hyper-parameters.The feedforward DNN used in this work and its functionality is shown schematically in Figure 4.
To build and train the neural networks, we used the Python toolkit Keras (https://keras.io(accessed on September 28, 2021)), which provides a high-level application programming interface (API) to the Tensor-Flow [168] (https://www.tensorflow.org(accessed on September 28, 2021)) deep learning library.We applied the technique of stochastic gradient descent with an adaptive learning rate with the ADAM method [169] with the AMSgrad modification [170].To train the DNNs, we used an initial learning rate of 0.003 and chose a batch size of 500.During each training session, the number of epochs was limited to 2000, or until the validation error stopped decreasing.The training of the DNNs was performed on an NVIDIA Tesla V100 GPU and the size of the mini-batches was chosen automatically depending on the specifics of the GPU and data sets.We used the mean squared error (MSE) as a cost (or loss) function.

V. RESULTS
We first examine the ability of the DNN to reconstruct the E sym (ρ) from a set of mass and radius M − R mea-surements that may result from electromagnetic observations of neutron stars, such as those from the NICER mission, for instance.Specifically, we apply the trained DNN, described in the previous section, to a test data set containing ∼1000 simulated M (R) sequences, and compare the corresponding estimated output E sym (ρ) with the exact symmetry energy for each sample.In Figure 5, we show results for five representative examples from the test data set.It is seen that the estimated symmetry energy (broken colored lines) for each input M − R sequence matches almost exactly the "true" E sym (ρ) (solid black lines) over the entire density range considered here.The results are very similar for the rest of the test data samples.Quantitatively, at 5ρ 0 , the mean absolute error over the whole test data set is 1.2 MeV, with a standard deviation of 0.8 MeV, where the errors are even smaller at lower densities.Choosing different ensembles of randomly selected points from the genuine M (R) curves does not alter appreciably the accuracy with which the symmetry energy is estimated.
At this point, we need to reiterate that realistic NS observations inevitably carry uncertainties, which would result in corresponding uncertainties in the estimated symmetry energy.However, this work should be considered a "proof-of-concept study" and realistic applications are left for following articles.In addition, these results are based on the assumption that we have 50 NS M −R observations.Although, at present, such a large number may seem rather optimistic, with the advent of the next generation of electromagnetic observatories, this circumstance is rapidly changing as many more NS electromagnetic observations are expected in the future.These results clearly demonstrate the ability of the DL-based approach to extract the nuclear symmetry energy accurately from NS mass and radius measurements, given that observational data of sufficient quality exist.
We next look at the ability of the neural network to ex- As discussed in the previous section, the input of the neural network consists of 50 randomly chosen points from a given genuine M (Λ) curve in the range of 1-2 M , and the output consists of 100 fixed points of E sym (ρ) in the range ∼0.4-5 ρ 0 .Using different input ensembles of randomly selected points does not change significantly the results presented in Figure 6.These findings show that the E sym (ρ) can be extracted accurately, via artificial neural networks, directly from GW observational data of neutron stars.They also emphasize the importance of the DL-based approach for extracting the symmetry energy, the most uncertain component of the dense matter EOS, from GW observations.As in the case of NS mass and radius measurements, GW observations of neutron stars inevitably accrue errors, which result in corresponding uncertainties in the extracted symmetry energy, and the EOS.As already mentioned, realistic applications of the DL method are left to following works.
These novel computational techniques will become particularly important in the future with the advent of the next generation GW observatories, when millions of GW events will be routinely detected per year, with at least several events involving neutron stars per day.As more such GW events are detected and characterized, this data-driven approach will eventually allow us to map precisely the GW observations of neutron stars to the E sym (ρ), and in turn, the underlying EOS of dense nuclear matter.

VI. SUMMARY AND OUTLOOK
We have demonstrated, for the first time, the reconstruction of the nuclear symmetry energy directly from MMA observations of neutron stars using deep learning approaches.Specifically, we have shown that deep neural networks can extract E sym (ρ) accurately from either mass and radius M − R or mass and tidal deformability M − Λ astrophysical measurements of neutron stars.These results are a step towards realizing the goal of determining the EOS of dense nuclear matter, and they underline the importance and potential of the DL-based approach in the era of multi-messenger astrophysics, where an ever-increasing volume of MMA data is becoming rapidly available.
Future directions include considering the complete set of EOS parameters in Equations ( 2) and (3).Specifically, we will also take into account the effect of J 0 and J sym , which are set to J 0 = J sym = 0 in the current analysis.Despite the large current uncertainties of these higherorder parameters, their inclusion will allow us to model a much wider range of realistic EOSs, and thus enable a direct comparison of E sym (ρ) estimates, obtained with the DL techniques developed in this work, with realistic symmetry energies.Astrophysical observations of neutron stars inevitably carry error uncertainties, which lead to corresponding errors and uncertainties in the extracted symmetry energy, and EOS.For realistic applications of our approach, the effect of empirical errors and uncertainties needs to be considered and implemented consistently in the formalism.This can be achieved, for instance, by recasting the E sym (ρ) regression problem into a probabilistic framework.In subsequent works, we plan to implement Bayesian neural networks to perform the symmetry energy inference task.In this paradigm, instead of having deterministic values, the weights of these networks are characterized by probabilistic distributions by placing a prior over the network weights [171].Finally, we also plan to investigate likelihood-free inference methods based on normalizing flows [172].By applying a series of nonlinear transforms to a simple posterior shape (e.g., a multivariate Gaussian), the flow is able to reproduce complex posteriors without evaluating the likelihood directly.These techniques have already gained considerable interest in the research community, and, recently, a likelihood-free inference method based on normalizing flows was applied [173] to rapidly perform parameter estimation of eight BBH GW events in the first LIGO catalog, GWTC-1 [174].Such rapid processing will be particularly important for next-generation space telescopes and GW detectors, whose sensitivity goals will allow for the detection and observation of compact binary collisions, and neutron stars, throughout the history of the universe, exceeding a million events per year, with thousands of BNS detections/year alone.Conventional Bayesian inference approaches are not scalable to the study of thousands of BNS events per year, and modern normalizing flow models could certainly help to extract BNS parameters promptly and accurately.
Ultimately, as more events involving neutron stars are observed, these modern data-driven approaches will allow us to rapidly process the ever-increasing amount of neutron star observational data and determine precisely the nuclear symmetry energy and the EOS of dense nuclear matter.

FIG. 2 :
FIG. 2: Range of the nuclear symmetry energy Esym (left window) and pressure P (right window).The Esym and P are plotted as functions of the reduced density ρ/ρ0.

FIG. 3 :
FIG. 3: Range of mass-radius relation (left window) and dimensionless tidal deformability Λ (right window) for the EOSs considered in this study.Λ is plotted as a function of stellar mass M .The mass ranges of the three heaviest pulsars known at present [86-88] are indicated in the left window.

FIG. 4 :
FIG. 4: (a) Structure of an example deep neural network with input, hidden, and output layers highlighted.In the feedforward neural networks used in this study, calculations move from left to right, as illustrated in the figure.(b) Inputs to an example neuron from the previous layer.Also shown is the calculation performed by the example neuron, with inputs weighted relative to one another, bias added, and an activation function applied, in order to calculate a value a, referred to as the activation of the neuron.(c) The activation of the example neuron serves as one of the inputs to the next layer of neurons.Each neuron in the successive layers of the DNN is performing this same operation with different values of the tunable parameters wi and b, within the backpropagation algorithm.See text for details.

1 , a m−1 2 , 1 N
(Figure 4b).The weights w i and bias b are unique to each neuron and are parameters that are tuned iteratively via a backpropagation algorithm during the DNN training (Figure 4b).Neuron k in layer m accepts the outputs a m−1 . . ., a m−from all N from the previous layer m−1 and computes a single value a m k = f ( a m−1 i

FIG. 5 :
FIG. 5: Example input M (R) sequences (left window) and corresponding estimated Esym(ρ) (right window).The input samples consist of 50 randomly selected points, denoted by the "o" characters, from the genuine M (R) curves, denoted by the solid lines, in the range of 1-2 M .The output data samples consist of 100 Esym(ρ) points in the range of ∼0.4-5 ρ0.Broken colored lines in the right window denote the estimated Esym(ρ) and the solid lines represent the exact Esym(ρ).Same curve colors in both windows denote pairs of input M (R) sequences and corresponding output symmetry energy.

FIG. 6 :
FIG. 6: Example input M (Λ) sequences (left window) and corresponding estimated Esym(ρ) (right window).The input samples consist of 50 randomly selected points, denoted by the "o" characters, from the genuine M (Λ) curves, denoted by the solid lines, in the range of 1-2 M .All other figure features are the same as in Figure 5. See text for details.