Determination of the Cosmic-Ray Chemical Composition: Open Issues and Prospects

Cosmic rays are relativistic particles that come to the Earth from outer space. Despite a great effort made in both experimental and theoretical research, their origin is still unknown. One of the main keys to understand their nature is the determination of its chemical composition as a function of primary energy. In this paper, we review the measurements of the mass composition above $10^{15}$ eV. We first summarize the main aspects of air shower physics that are relevant in composition analyses. We discuss the composition measurements made by using optical, radio, and surface detectors and the limitations imposed by current high-energy hadronic interaction models that are used to interpret the experimental data. We also review the photons and neutrinos searches conducted in different experiments, which, in addition to being important to understand the nature of cosmic rays, can provide relevant information related to the abundance of heavy or light elements in the flux at the highest energies. Finally, we summarize the future composition measurements that are currently being planned or under development.


Introduction
Despite having been discovered more than a century ago, the origin and nature of cosmic rays remain uncertain. However, in the last decades, great progress has been made in the understanding of this phenomenon. The cosmic-ray energy spectrum extends from ∼10 9 eV up to more than 10 20 eV. Due to the very fast drop of the flux for increasing values of primary energy, the most energetic particles cannot be detected directly. Therefore, for energies above ∼10 15 eV, the cosmic rays are detected by measuring the extensive air showers (EASs) that they generate due to their interaction with the air molecules. Ground-based detectors that cover very large areas are built to measure those EASs. Although these observatories allow to detect very small values of the cosmic-ray flux, the characteristics of the primary particle have to be reconstructed from the EAS data, which makes the data analyses more complex, increasing the sources of systematic uncertainties.
There are two main different types of detectors used to detect the cosmic rays of energies above 10 15 eV. These are: surface detectors that measure the secondary particles of the EAS at ground level, and radiation detectors that measure the electromagnetic radiation emitted during the shower development in the atmosphere. Due to the steep drop of the flux, the observatories dedicated to detect cosmic rays of higher energies require larger collection areas.
There are three main observables used to study the cosmic rays: the energy spectrum, the chemical composition of the primary particles, and the distribution of their arrival directions. The energy spectrum of the cosmic rays has been measured by several experiments as can be seen in Figure 1. It can be described as a broken power law with spectral indexes close to γ = 3.
It presents four main features: (i) the knee, a steepening of the flux located at ∼10 15.6 eV, (ii) the second knee, a second steepening of the flux located at ∼10 17 eV, (iii) the ankle, a hardening of the flux located at ∼10 18.7 eV, and (iv) the suppression located at ∼10 19.6 eV. This last feature is a strong steepening of the flux with γ > 4; it may be an exponential drop of the flux but the current statistics are too low to determine it. 13 14 15 16 17 18 19 20 log(E/eV) 10 Figure 1. Cosmic-ray flux, multiplied by E 3 , as a function of the logarithmic energy. The data are taken from Refs. [1][2][3][4][5][6][7][8][9]. Adapted from [10].
It has long been believed that the galactic cosmic rays dominate the flux at low energies and extragalactic ones at high energies. This is based on the known inefficiencies of the galactic sources to accelerate particles to the highest energies. Moreover, the Pierre Auger Observatory (hereafter Auger) has found strong evidence about the extragalactic origin of the cosmic rays with energies above 10 18.9 eV [11]. The transition between the galactic and extragalactic components is still an open problem in cosmic-ray physics. In fact, the scenarios in which the transition takes place at the ankle are disfavored by the Auger data [12]. Therefore, the transition between these two components should take place somewhere between ∼10 16 and ∼10 18 eV.
The determination of the mass composition as a function of the primary energy is crucial to understand many aspects of cosmic rays. However, the main limitation is that the composition determination is based on the comparison of experimental data with EAS simulations. Because the high-energy hadronic interactions relevant for the cosmic rays are unknown, models that extrapolate lower-energy accelerator data are used in shower simulations. This practice introduces systematic uncertainties that can be quite large depending on the shower observable under consideration.
The relevant parameter for the motion of charged particles in magnetic fields is the rigidity, R = E/(Z × e), where E is the energy of the particle, Z the charge number, and e the absolute value of the electron charge. The trajectories of particles with smaller rigidity values are strongly deviated by a given magnetic field. The acceleration of the cosmic rays and also its propagation on the galactic and extragalactic magnetic field depends on the rigidity. Therefore, the composition information is relevant for the study of the cosmic-ray sources and its propagation through the Galaxy and the intergalactic medium. In fact, due to the intensity of the galactic magnetic field, the identification of extragalactic sources is possible only considering light cosmic rays at the highest energies (large rigidity values) [13].
The composition information is also very important for the determination of the transition between the galactic and extragalactic components and the origin of the ankle [14,15]. The composition studies are also relevant to constrain the models of the suppression [16]. The composition is also key information to study the high-energy hadronic interactions beyond the reach of man-made accelerators [17][18][19].
The determination of the composition beyond the ankle is also important to predict the fluxes of cosmogenic photons and neutrinos. These particles are generated as a by-product of the propagation of high-energy nuclei in the intergalactic medium [20]. In particular, they are generated due to the interaction of the cosmic rays with the radiation field present in the universe. Higher neutrino and photon fluxes are expected for lighter composition profiles at the highest energies (see, for instance, Ref. [14]). Moreover, the measurement of the cosmogenic neutrino and photon fluxes can be very useful to constrain the composition in the suppression region [21,22]. This paper is organized as follows: In Section 2, we discuss the main aspects of the physics of the EAS oriented to composition analyses. In Section 3, we review the composition measurements performed above 10 15 eV by using optical, radio, and surface detectors. In Section 4, we review the photons and neutrinos searches at the highest energies conducted in several experiments. In Section 5, we discuss the future of the composition measurements with current and future detectors. Finally, the main conclusions are discussed in Section 6.

Physics of Extensive Air Showers
The characteristics of the EASs initiated by the cosmic rays depend on the primary type. The differences among the EASs initiated by different primary types are used to identify the nature of the primary. Even though the detailed knowledge of the shower characteristics is studied through full Monte Carlo simulations, there are simple models that allow to understand the main physical properties of the EAS. A simple model of electromagnetic showers, i.e., initiated by photons, electrons, or positrons, was first introduced by Heitler [23] and was extended to hadronic showers by Matthews [24]. In the Heitler model, two processes are responsible for the electromagnetic shower development: Bremsstrahlung and pair production undergone by high-energy photons and electrons or positrons, respectively. Each electromagnetic particle (photons, electrons, and positrons) undergoes a two-body splitting at a fixed distance d = λ r ln 2, where λ r is the radiation length in the medium (for the EAS, the medium is the atmosphere). The energy of each mother particle is distributed equally between the two daughter particles. The creation of particles ceases when its energy becomes smaller than the one necessary for undergoing Bremsstrahlung or pair production. Heitler takes this critical energy, E e c , as the energy for which the radiative energy loss is equal to the collisional energy loss. In this simple model, the maximum of the shower is reached when the energy of each electromagnetic particle is equal to the critical energy. Therefore, E/2 n c = E e c , where E is the primary energy and n c is the number of splitting lengths required to reach the maximum. The mean atmospheric depth at which the shower maximum is reached is given by X em max = n c λ r ln 2, which, after writing n c as a function of the primary energy and the critical energy, becomes This simple model predicts quite well the values of X em max obtained from simulations. Note that X em max is linearly dependent on the logarithmic energy.
In the extension of the Heitler model for the hadronic showers developed by Matthews (here on referred to as the Heitler-Matthews model), the atmosphere is also divided by layers of fixed thickness. Both the primary particle and each secondary hadron interact after passing through one layer of the atmosphere. Each hadron produces N ch charged pions (π ± ) and N ch /2 neutral pions (π 0 ). It is assumed that the neutral pions decay into two photons immediately after being created. These high-energy photons initiate electromagnetic showers. The charged pions continue their multiplication process until they all reach the critical energy, E π c , which is taken as the energy for which the pion decay length is equal to the thickness of one layer. At this point, all charged pions decay producing muons and antimuons. Therefore, the hadronic showers are composed of three main components: electromagnetic, muonic, and hadronic (neutrinos and antineutrinos are also created in EASs, mainly in charged pion decays).
Let us consider first showers initiated by protons. Assuming that the energy is equally distributed among all created particles, the number of layers required to reach the point in which all charged pions decay is given by where E is the energy of the primary proton. As mentioned before, the number of muons (hereafter muons refers to muons and antimuons) is equal to the number of charged pions that reached the critical energy, N p µ = N n c ch . Therefore, using Equation (2), the following expression for the average number of muons is obtained where β = ln N ch ln 3 2 N ch .
Therefore, the Heitler-Matthews model predicts a power-law dependence with the primary energy of the muons number in proton-initiated air showers.
In the Heitler-Matthews model, it is assumed that, in the first interaction, one-third of the proton energy is injected into the electromagnetic channel. The neutral pions generated in the subsequent steps of the cascade also feed the electromagnetic channel, but the energy directed to this channel is smaller than the one corresponding to the first interaction. Therefore, in this model, only the first interaction is considered to calculate the mean value of X max .
In the first interaction N ch /2, neutral pions are generated. Then, the number of photons generated in the decay of the neutral pions is N ch . The energy of each neutral pion generated in the first interaction is E π 0 , f = E/(3N ch /2), and then the energy of each photon generated after the decays of the neutral pion is E γ, f = E π 0 , f /2 = E/(3N ch ). Considering that the mean atmospheric depth of the first interaction is X 0 and that at this point N ch showers are initiated by photons of energy E γ, f , the mean depth of the shower maximum is obtained Equation (5) can be written in terms of the mean value of X max corresponding to photoninitiated showers. By using Equation (1), the following expression is obtained where X γ max is the atmospheric depth of the maximum shower development of photon-initiated showers.
The elongation rate measures the rate of change of the mean value of X max with logarithmic energy. It is defined as Note that if D 10 = d X max /d log E, then D 10 = ln(10)D.
The elongation rate of proton showers can be obtained from Equations (6) and (7). It is given by which implies that D p < D γ , because N ch increases and X 0 decreases with primary energy. This result is known as the Linsley's elongation rate theorem [25], which states that the elongation rate of showers initiated by any hadron cannot exceed the one corresponding to photon showers. The number of muons and the depth of the shower maximum for heavier primaries can be obtained from the superposition model. In this simple model, a primary nucleus of mass number A and energy E is considered as A independent nucleus of energy E/A each. This approximation relies on the fact that the binding energy of the nucleus is much smaller than the primary energy (see Ref. [14] for a more detailed discussion). Therefore, by using the superposition model, the following expressions are obtained which, from Equations (3) and (5), take the following form As mentioned before, the detailed study of the EAS characteristics is conducted from full Monte Carlo simulations. The Monte Carlo programs most used in the literature are CORSIKA [26] and AIRES [27]. Moreover, the CONEX [28] program is widely used in the literature as it combines the Monte Carlo technique with the numerical solution of the cascade equations to obtain the longitudinal development of the showers, which considerably reduces the computing time compared to the full Monte Carlo simulations. The main limitation of the EAS simulations originates from the lack of knowledge of the hadronic interactions at the highest energies. Models that extrapolate the low-energy accelerator data to the required energies are included in the Monte Carlo programs. The high-energy hadronic interaction models commonly used in the literature have been updated recently with the Large Hadron Collider (LHC) data. However, some discrepancies among the predictions obtained from different models are still present, even at LHC energies, due to the limited collider data in the central region, which is relevant for the EAS simulation. These differences in the shower observables predicted by different high-energy hadronic interaction models introduce large systematic uncertainties in composition analyses. The post-LHC high-energy hadronic interaction models most used in the literature are: EPOS-LHC [29], Sibyll 2.3d [30], and QGSJet-II.04 [31].
Despite the strong simplifications in the Heitler-Matthews model, it predicts quite well the functional dependence of the mean number of muons and the mean depth of the shower maximum with primary energy and mass number. However, the mean value of X A max predicted by the model is about 100 g cm −2 smaller than the one obtained from EAS simulations [24]. In Ref. [24], it is suggested that these discrepancies arise due to the non-inclusion of the photons, originated in neutral pion decays, produced beyond the ones corresponding to the first interaction. In Ref. [32], it is found that this is not the origin of the discrepancies. Moreover, in that work, it is proposed that the discrepancies arise from the assumption of a homogeneous distribution of energy among the particles created in each step of the cascade.
As mentioned before, the Heitler-Matthews model predicts the functional dependence of X A max with A and E (see Equation (12)). This is shown in the analysis performed in Ref. [33], made based on EAS simulations, where it is found that the mean value of X A max can be written as where F E is a function of primary energy. Regarding the number of muons, it is also found that its dependence with A and E can be approximated by the one predicted by the Heitler-Matthews model (see Equation (11)). Moreover, it can be seen that the β parameter ranges from 0.85 to 0.93 [24], such that the values close to β = 0.93 correspond to post-LHC models [34].
From the previous discussion, it can be seen that the EAS generated by light nuclei develop deeper in the atmosphere and have a lower muonic content than an EAS generated by heavy nuclei. Therefore, the X max parameter and also any parameter closely related to the muon content of the showers (such as the number of muons at ground, measured at a given distance to the shower axis or even in a given distance range) can be used to determine the nature of the primary particle. Moreover, these two parameters showed to be the best to discriminate between proton and iron primaries (see, for instance, Ref. [35]).
The longitudinal development of the EAS is characterized by the number of charged particles present at a given slant depth, N ch (X). However, a quantity that is more closely related to the observation of the longitudinal development of an EAS with fluorescence telescopes is the energy deposition rate dE dX , i.e., the longitudinal profile, whose integral gives the total energy deposited in the atmosphere. Because N ch (X) is, to a very good approximation, proportional to the longitudinal profile, the maximum reached by both functions is located at approximately the same atmospheric depth.
The left panel of Figure 2 shows the longitudinal profiles of proton and iron showers of E = 10 19 eV, zenith angle θ = 40 • , simulated with CONEX 2r7.5 and EPOS-LHC as the high-energy hadronic interaction model. From the figure, it can be seen that effectively the proton showers develop deeper in the atmosphere than the iron showers. It can also be seen that shower-to-shower fluctuations are larger in the case of proton showers. The right panel of Figure 2 shows the corresponding distributions of X max . Note that both the mean value and the standard deviation of X max are both sensitive to the primary mass.
The left panel of Figure 3 shows the number of muons, with energy above 1 GeV, as a function of the atmospheric depth, for proton-and iron-initiated air showers of E = 10 19 eV, θ = 40 • , simulated with CONEX 2r7.5 and EPOS-LHC as the high-energy hadronic interaction model. The right panel of the figure shows the distributions of the total number of muons at ground. They are obtained evaluating N µ (X) at the atmospheric depth of the observation level, which is located at sea level. From the figure, it can be seen that the iron showers have a larger muon content than proton showers and that the shower-to-shower fluctuations are smaller for iron showers as in the case of X max . It can also be seen that the total number of muons is also a very good parameter to discriminate between proton and iron showers. Note that the number of electrons (hereafter electrons will refer to electrons and positrons) of the showers is much    Figure 4 shows X max as a function of N µ for proton and iron showers for different values of primary energy. From the figure, it can be seen that these two parameters are very good parameters to separate proton and iron showers from 10 15 to 10 20 eV, i.e., the whole energy range relevant for cosmic-ray detection through EAS observations. It can also be seen that the combination of these two parameters increases the discrimination power with respect to the case in which any of them is considered individually. This is valid for the whole energy range considered.
There are many parameters, besides X max and the muon content of the showers, that are sensitive to primary mass. Among others are: the slope of the lateral distribution function, the rise time of the signals collected by ground detectors, the maximum of the muon production depth distribution, etc. Note that all these parameters depend on X max and N µ . In any case, the discrimination power between heavy and light primaries of the X max and N µ combination increases very little when more mass-sensitive parameters are added [36].
It is worth mentioning that different cosmic-ray observatories have measured events, different from the vast majority, that cannot be explained with current EAS models [37][38][39]. These events are characterized by large multiplicities of triggered stations and large signals. Auger and Telescope Array have found that these types of events are observed during thunderstorms. Even though their origin is still unknown, it is believed that those events may originate in bursts of high-energy photons produced during thunderstorms known as downward gamma-ray flashes.

Composition Measurements
In general, there are two types of detectors that allow the detection of EASs: the ones that detect the secondary particles of the showers at ground level and the ones that measure the lowenergy electromagnetic radiation generated due to the shower front propagation through the atmosphere during the shower development. The secondary particles at ground are measured with surface detectors and the low-energy radiation produced during the shower development is measured with radiation detectors, which include optical and radio detectors.

Composition from Optical and Radio Detectors
There are two main optical techniques to measure the longitudinal development of the showers, which allow to reconstruct the X max parameter directly. These two experimental techniques involve: fluorescence telescopes and non-imaging Cherenkov detectors. The radio detectors allow to measure the X max parameter in an indirect way.
The fluorescence telescopes measure the fluorescence light emitted by the nitrogen molecules of the atmosphere that are excited by the charged particles of the EAS. The spectrum of the fluorescence light ranges from 300 to 400 ns. The fluorescence light yield is proportional to the energy deposited by the charged particles of the showers in the atmosphere. Therefore, measuring the fluorescence light during the shower development, it is possible to reconstruct the longitudinal profile of the showers. The fluorescence telescopes are composed by a faceted mirror and a camera formed by a large number of small photomultipliers. The telescopes are housed in dedicated climate-controlled buildings. The reconstruction of the longitudinal profile requires the determination of the shower axis. It can be determined observing the shower development with one telescope (monocular observation), but a much better accuracy is achieved when the same shower is observed with two telescopes (stereo observation). A similar accuracy to the one corresponding to the stereo observation is obtained when the data of at least one telescope are combined with the data taken by surface detectors (hybrid observations). Note that the fluorescence telescopes can take data in clear and moonless nights, which reduces the duty cycle to ∼15%. Moreover, this experimental technique requires a continuous monitoring of the atmosphere.
The fluorescence technique was used in the past in experiments by the Fly's Eye [40] and its successor, HiRes [41]. The Fly's Eye telescope operated 10 years in monocular mode; after that, an additional fluorescence telescope was added in order to perform stereo observations. Therefore, in the second phase of the Fly's Eye experiment and during the operation of HiRes, it was possible to perform stereo observations. At present, there are two observatories that are currently taking data and make use of this technique: one is Auger [42] and the other one is Telescope Array [43]. Both observatories can perform stereo observations as well as hybrid observations.
As mentioned before, the energy deposition rate of the showers, dE dX , can be measured by using the fluorescence technique. The integral of this profile gives the total energy dissipated electromagnetically. Therefore, in the fluorescence technique, the atmosphere is used as a calorimeter and the integral of the longitudinal profile gives the calorimetric energy. The calorimetric energy is 80 − 90% of the primary energy, and the remaining 10 − 20% corresponds to the so-called "invisible energy", which is carried away by neutrinos and high-energy muons and has to be estimated and added to the calorimetric energy in order to reconstruct the primary energy (see Ref. [44] for details). Figure 5 shows the longitudinal profile of a hybrid event detected by Auger [45]. The longitudinal profile is fitted with a Gaisser-Hillas function which is given by where f max , X 0 , λ, and X max are free-fitting parameters. For the event in the figure, the primary energy is E 0 = (3.0 ± 0.2) × 10 19 eV and the depth of the shower maximum is X max ∼ = 723 g cm −2 .
It is worth mentioning that the determination of the primary energy by using the fluorescence technique is little influenced by the high-energy hadronic interaction models used to simulate the showers. This is the main reason for the hybrid design of Auger and Telescope Array. In contrast with the fluorescence telescopes, the duty cycle of the surface detectors is ∼100%. Therefore, the calibration in energy of the events recorded by the surface detectors only is performed by using hybrid events for which the reconstructed energy is taken as the one obtained from the fluorescence telescopes. In this way, the reconstructed energy of the events recorded by the surface detectors only are subject to smaller systematic uncertainties compared with the case in which the energy calibration is performed by using simulations of the showers.
The non-imaging Cherenkov detectors measured the Cherenkov radiation that is generated by the charged particles of the EAS when they propagate through the atmosphere. Note that the Cherenkov radiation is emitted when a charged particle propagates through a medium at a velocity larger than the speed of light in that medium. The Cherenkov photons are detected by arrays of photomultipliers (like particle detectors) that look upward. The observations are conducted on moonless and clear nights, which reduces the duty cycle to 10-15%. This tech-  (14)). The primary energy of the event is E = (3.0 ± 0.2) × 10 19 eV and the maximum is reached at X max ∼ = 723 g cm −2 . Adapted from [45].
nique requires a continuous monitoring of the atmosphere. Several cosmic-ray observatories applied the non-imaging Cherenkov technique to detect air showers. The three non-imaging Cherenkov detector arrays that are currently in operation are the ones installed in Yakutsk [46], Tunka [47], and in the Telescope Array site called NICHE [48]. The primary energy and the depth of the shower maximum of each detected shower are obtained from the measured lateral distribution function of Cherenkov photons [46,47,49]. The left panel of Figure 6 shows the lateral distribution of Cherenkov photons corresponding to an event measured by the Yakutsk array [50]. The primary energy and zenith angle of the event are E ∼ = 1.3 × 10 19 eV and θ ∼ = 25 • , respectively. The reconstructed depth of the shower maximum is X max ∼ = 738 g cm −2 . Even though the methods to reconstruct the primary energy and X max can be quite sophisticated, it is known that the primary energy is nearly proportional to the photons' density at 120 m from the shower axis, and the depth of the shower maximum is closely related to the slope of the lateral distribution function [51,52]. Note that this technique also allows a calorimetric estimation of the primary energy.
Air showers emit electromagnetic radiation at radio frequencies. This radiation is strongly beamed in the forward direction. The two main mechanisms for the radio emission are geomagnetic and Askaryan emissions [53,54]. The Cherenkov emission is also present but negligible at radio frequencies. The dominant mechanism is the one associated to the geomagnetic field. In this case, the radiation originates from the interaction of the secondary electrons and positrons of the showers with the geomagnetic field, inducing a time-dependent transverse current. On the other hand, the Askaryan or charge excess emission is produced when the air shower particles ionize the atmosphere, and the ionization electrons are added to the cascade producing a negative charge excess located in the shower front region. Heavy and positive ions remain behind the shower front. In this mechanism, the time-dependent charge excess is responsible for the radio emission.
The radio emission from the EAS is emitted in a wide frequency interval. It has been measured from ∼2 to ∼500 MHz. The radio emission is detected through the ground arrays of radio antennas that can measure this radiation in different frequency bands. The radio signal measured by the antennas is not symmetric with respect to the shower axis [53,54]. These asymmetries depend on the relative contribution of the geomagnetic and charge excess emissions. There are several methods to reconstruct X max from the radio signals [53,54]. In particular, the slope of the lateral distribution function is used to reconstruct X max . The right panel of Figure 6 shows the lateral distribution function of radio, after asymmetry correction, of an event measured by the Tunka experiment [55]. It is worth mentioning that the radio technique allows a calorimetric determination of the primary energy, and also, the radio antennas have a duty cycle close to 100%. The experiments that are operating at present and that have measured X max by using the radio technique are: Auger [56], Yakutsk [57], Tunka [58], and LOFAR [59]. Figure 7 shows the mean value of the depth of the shower maximum as a function of the logarithmic primary energy measured by different experiments. The figure also shows the predicted values of X max for proton and iron primaries obtained from shower simulations. The simulated air showers used to obtain the model predictions are generated with CONEX 2r7.5 by using the high-energy hadronic interaction models: Sibyll 2.3d, QJSJet-II.04, and EPOS-LHC. The primary energy of the simulated air showers ranges from log(E/eV) = 15 to log(E/eV) = 20 in steps of ∆ log(E/eV) = 1 and the zenith is θ = 40 • . Finally, the curves in the figure are obtained by fitting the mean value of X max as a function of the logarithmic energy with a second-order polynomial in log(E/eV). Note that the X max measured by the Telescope Array (TA FD in the figure) is shifted by 5 g cm −2 to take into account the detector effects [60].
From Figure 7, it can be seen that, at low energies, the TALE data present large differences with the Tunka and Yakutsk data. These differences are reduced at energies of the order of and even larger than 10 17 eV. The TALE data are compatible with a light composition at ∼10 15.4 eV that becomes even lighter for increasing energies. At energies close to 10 16 eV, they are compatible with a composition that starts to change, becoming heavier as the energy increases. The Tunka data, starting at ∼10 15.1 eV, are compatible with a light composition that keeps nearly constant up to energies close to 10   FD, Ch, and R in the legend refer to fluorescence, Chrerenkov, and radio techniques, respectively. The data of Auger are labeled as: Auger FD [61], Auger SD (in this case, X max is obtained from surface detectors data calibrated with the fluorescence detectors data) [62], and Auger R [56]. The data of Telescope Array are labeled as: TA FD [63] and TALE FD [64]. The data of Yakutsk are labeled as: Yakutsk Ch [50] and Yakutsk R [57]. The data of Tunka are labeled as: Tunka Ch [65] and Tunka R [58]. The data of LOFAR are labeled as: LOFAR R [59]. The curves correspond to model predictions for proton and iron primaries simulated by using different high-energy hadronic interaction models.
even though the experimental data show a transition to heavier primaries from ∼10 16 eV, the differences between experiments suggest the existence of important systematic uncertainties in the experimental techniques used to obtain the mean value of X max as a function of primary energy. Figure 7 also shows that the Yakutsk data are compatible with a change in the composition profile at ∼10 17 eV. At larger values of the primary energy, these data are compatible with a variable composition that goes from intermediate mass to lighter nuclei. This change is also observed in the TALE and LOFAR data but at higher energies. The Auger data, starting at 10 17.26 eV, are also compatible with a variable composition that becomes lighter for increasing values of primary energy. Between ∼10 17.5 and ∼10 18.5 eV, the data are compatible with a light composition. At ∼10 18.5 eV, these data are compatible with a composition that becomes heavier for increasing values of primary energy. Even though there are differences between the Yakutsk and Auger data, they show a similar trend. Above ∼10 18.25 eV, the Telescope Array data seem to be consistent with a constant and light composition, but it has been shown that it is compatible with Auger data considering current statistical and systematic uncertainties [60].
As mentioned before, the standard deviation of the X max parameter, σ[X max ], can also be used to study the composition of the primary particle. Not all experiments considered have reported the σ[X max ] data. Figure 8 shows σ[X max ] as a function of the logarithmic energy measured by different experiments. The figure also shows the predicted values of σ[X max ] for proton and iron primaries obtained from shower simulations. The same shower library used to calculate X max predictions is used to calculate σ[X max ] predictions. Additionally, in this case, the curves in the figure are obtained by fitting the σ[X max ] as a function of the logarithmic energy with a second-order polynomial in log(E/eV). Note that σ[X max ] measured by the Telescope Array (TA FD in the figure) is obtained by subtracting 15 g cm −2 in the quadrature to take into account the detector effects [66]. 16 QGSJet-II.04 EPOS-LHC Figure 8. Standard deviation of X max as a function of logarithmic energy measured by different experiments. FD, Ch, and R in the legend refer to fluorescence, Chrerenkov, and radio techniques, respectively. The data of Auger are labeled as: Auger FD [61] and Auger R [56]. The data of Telescope Array are labeled as: TA FD [63]. The data of Yakutsk are labeled as: Yakutsk Ch [50] and Yakutsk R [57]. The data of LOFAR are labeled as: LOFAR R [59]. The curves correspond to model predictions for proton and iron primaries simulated by using different high-energy hadronic interaction models.
From Figure 8, it can be seen that the lowest energy data reported correspond to Yakutsk. They start at 10 16.5 eV. The Yakutsk data are compatible with a variable composition that seems to be be dominated by intermediate nuclei at the lowest energy and becomes lighter for increasing values of primary energy. The data of the different experiments are compatible with a light composition from ∼10 17.5 to ∼10 18.5 eV, where a transition toward heavier nuclei begins. This picture is compatible with the one inferred considering the mean value of X max in the limited energy range corresponding to σ[X max ] measurements.
There are several possibilities to study the composition of the cosmic rays from the X max data. From the measured X max distributions, it is possible to infer the abundance of different nuclei that are assumed to be present in the cosmic-ray energy spectrum at a given energy [64,67]. From the measured values of X max , it is possible to infer the mean value of the logarithmic mass number [33], which is defined as ln is the fraction of nuclei of mass number A at a given primary energy. Moreover, from the measured values of σ[X max ], it is possible to estimate the standard deviation of ln A [33]. All these analyses have to be conducted by using EAS simulations which, as mentioned before, can introduce important systematic uncertainties due to the different predictions obtained when different high-energy hadronic interaction models are considered.
Let us consider the estimation of ln A from X max data. The mean value of X max obtained from measurements involves the average over shower-to-shower fluctuation and also over the mass number, i.e., where P(X max |A, E) is the distribution function of X max for a given mass number A and primary energy E (see right panel of Figure 2). Equation (15) can also be written as, Taking the average over A in Equation (13) and using Equation (16), the following expression for ln A is obtained where it is used that the function F E in Equation (13) can be written as F E = ( X Fe max − X p max )/ ln (56). Figure 9 shows ln A as a function of the logarithmic energy obtained by using Equation (17). The values of X Fe max and X p max are obtained from the fits of the simulated EAS data described above. The calculation is performed for the three high-energy hadronic interaction models considered. From the figure, the dependence of the inferred ln A on the high-energy hadronic interaction models used to analyze the data is evident. In fact, the lightest composition is found when QGSJet-II.04 is used to analyze the data, and the heaviest composition is found for Sibyll2.3d. Figure 9 shows more clearly the trend followed by the composition inferred based on Figure 7. For all high-energy hadronic interaction models, a gradual increase in the composition from ∼10 16 eV, reaching a maximum value of ln A between 2.6 and 3, close to the expectation for nitrogen (A = 14), can be seen. At some point between ∼10 17 and ∼10 18 eV, a transition toward light nuclei starts. The minimum is reached at 10 18.3 -10 18.5 eV, where a new transition toward heavy elements starts. At energies of the order of ∼10 20 eV, the only available X max data are obtained from the surface detectors of Auger. This is due to the reduced duty cycle of the fluorescence and non-imaging Cherenkov techniques compared with the one corresponding to the surface detectors, which is ∼100%. Therefore, at ∼10 20 eV, these data suggest that ln A ∼ = 2 − 3 depending on the high-energy hadronic interaction model considered.
The change in composition at a break energy of 10 18.3 -10 18.5 eV (the break energy depends on the experimental data considered) has been discussed recently in Refs. [68,69]. In these analyses, it is shown that there is clear experimental evidence about a change in the elongation rate at a given break energy, as first reported by Auger [70]. Moreover, in Ref. [68], it is claimed that for energies below the break energy, the elongation rate obtained from the data of the northern and southern hemispheres are compatible, but the ones obtained considering data above the break energy are different. In contrast, in the subsequent analysis of Ref. [69], it is found that the elongation rates measured in the northern and southern hemispheres are compatible below and above the energy break.

Composition from Surface Detectors
As mentioned before, the composition can also be inferred from data taken by surface detectors that measure the secondary particles of the EAS that reach the ground. These composition analyses are based on a single parameter or on a combination of several parameters. The main advantage of the composition analyses based on surface detectors data is that the duty cycle of these types of detectors is ∼100% and then the number of events is about one order of magnitude larger than the one corresponding to the optical detectors. In general, it is difficult to compare the results of the analyses based on surface detectors data with the ones obtained from the mean value of the X max parameter due to the fact that ln A is not always reported and it is not directly obtained from the data as in the X max parameter case. In this paper, the results of the composition analyses in which ln A is reported and the ones corresponding to the measurements of the muon density are considered. The muon density measured by different experiments cannot be compared directly as it depends on the altitude of the observatory, the threshold energy of the muons, and the single distance to the shower axis or interval of distances to the shower axis considered for its calculation. One way to compare measurements of the muon density from different experiments is through the z-scale [71,72]. However, in this work, a different approach is followed. Assuming that the muon density, for a primary of mass number A, measured by the different experiments, follows Equation (11), i.e., ρ A µ = A 1−β ρ p µ , and that the mass number of the heaviest nuclei is A M , the ln A can be approximated by [73] ln where A M is taken as the minimum mass number such that the data points corresponding to a given muon density-related parameter are contained between the predictions for protons and the one corresponding to nuclei of mass number A M and Here, ∆/2 is the systematic uncertainty introduced by the use of the approximated expression in Equation (18). Figure 10 shows ln A as a function of the logarithmic primary energy obtained by using data from surface detectors. The experiments considered are Auger, IceCube, AGASA, and Telescope Array. In this case, the models considered are EPOS-LHC and QGSJet-II.04. The measurements considered are: • ln A obtained from the ∆ s parameter, measured by Auger, which is obtained from the risetimes of the signal collected by the water-Cherenkov detectors [74]. The data taken by the 750 and 1500 m arrays are considered. • ln A obtained from the parameter X µ max , measured by Auger, which corresponds to the atmospheric depth of the maximum of the muon production depth distribution [75]. It is obtained from the time traces measured by the water-Cherenkov detectors. • The parameter R µ , measured by Auger, is an estimator of the total number of muons of energy above 0.3 GeV, obtained from the data provided by the water-Cherenkov detectors [76]. The events considered correspond to inclined showers and are detected in hybrid mode. • The muon density at 450 m from the shower axis with a muon threshold energy around 1 GeV, measured by the Underground Muon Detectors (UMDs) of Auger [77]. • The density of GeV muons at 600 and 800 m from the shower axis measured by IceCube with the IceTop Array [78]. • The density of muons of energy above 0.5 GeV evaluated at 1000 m from the shower axis measured by AGASA [79,80] . • ln A obtained through a multiparametric analysis based on the Telescope Array surface detectors data [81]. Note that the ln A is reported for the QGSJet-II.04 and QGSJet-II.03 (and older versions of the QGSJet-II models) but not for EPOS-LHC.
Note that the energy scales used are the ones corresponding to each experiment, but for AGASA, the energy scale from the Spectrum Working Group is used [82]. The only systematic uncertainties included in the plot are the ones introduced by the use of Equation (18) to calculate the ln A from the measured muon density. From Figure 10, it can be seen that, in general, the ln A obtained from surface detectors data is incompatible with the one obtained by optical and radio detectors. Moreover, ln A obtained from the Auger UMD and R µ parameter are compatible with a composition dominated by nuclei heavier than iron, which is incompatible with any realistic astrophysical scenario. This is so for both EPOS-LHC and QGSJet-II.04. The ln A obtained from X µ max is also compatible with a composition dominated by nuclei heavier than iron when EPOS-LHC is considered to analyze the data. When QGSJet-II.04 is considered to interpret the X µ max data, the ln A is compatible with a composition dominated by iron nuclei or even lighter for smaller values of the primary energy. However, the values of ln A . Mean value of ln A as a function of the logarithmic energy obtained from experimental data taken by the surface detectors of different experiments by using two different high-energy hadronic interaction models to interpret the data. The Auger data are labeled as: Auger ∆ s 750 m array [74], Auger ∆ s 1500 m array [74], Auger X µ max [75], Auger R µ [76], and Auger UMD [77]. The IceCube data are labeled as: IceCube ρ µ (600 m) and IceCube ρ µ (800 m) [78]. The AGASA data are labeled as: AGASA [79,80]. The Telescope Array data are labeled as: TA [81]. The square brackets correspond to the systematic uncertainties introduced by the use of Equation (18) to calculate the ln A . obtained are incompatible with the ones obtained by using optical and radio detectors. The AGASA data are also compatible with a flux dominated by nuclei heavier than iron. The values of ln A obtained by using the ∆ s parameter are closer to the ones obtained by using optical and radio detectors, but they are still larger than those. The Telescope Array data are compatible with a light composition above 10 18 eV which is also incompatible with the optical and radio results and also with other analyses based on surface detectors data. The IceCube data are also in tension with the optical results, but in this energy range, there are also discrepancies between data obtained by different optical detectors.
The Auger water-Cherenkov detectors measure muons and also the electromagnetic particles of the showers. Close to the shower axis, the signal is dominated by the electromagnetic particles, and far from the shower axis, it is dominated by muons. Therefore, the ∆ s parameter is less affected by the muon component than the other parameters. Because the ln A obtained from the ∆ s parameter is smaller than the one obtained by using other parameters completely dominated by the muonic component of the showers, the incompatibility seems to originate in the number of muons predicted by the post-LHC high-energy hadronic interaction models. In fact, the much heavier composition inferred from muon density measurements suggests that post-LHC high-energy hadronic interaction models present a deficit in the muon component [83]. This muon deficit has been studied in detail by the Working Group on Hadronic Interaction and Shower Physics (WHISP). Combining muon data from several experiments, they find that the muon measurements are compatible with the post-LHC high-energy hadronic interaction models up to energies of a few 10 16 eV. Above that energy, the muon deficit increases with the logarithmic energy (see Ref. [84] for an updated analysis). The muon deficit can take large values at the highest energies, for instance, Auger found that for 10 18.8 eV ≤ E ≤ 10 19.2 eV, it is of the order of ∼33% for EPOS-LHC and ∼61% for QGSJet-II.04 [85]. There is also experimental evidence about the increase in the muon deficit with the zenith angle of the showers [85,86]. It is worth mentioning that the composition inferred from the muon measurements of the Yakutsk EAS array is compatible with the one inferred from the X max measurements [87]. This is so in the whole energy range of the Yakutsk EAS array, which includes the highest energies where the muon deficit found by other experiments is more important. Therefore, further studies of the biases and systematic uncertainties of each experiment are required in order to understand and solve the existing discrepancies.
It is worth mentioning that, despite the tension between the experimental data and the post-LHC high-energy hadronic interaction models, the trends in the change of the mass composition obtained from optical detectors (see Figure 9) are consistent with the ones obtained from surface detectors (see Figure 10).
In order to illustrate the tension between post-LHC high-energy hadronic interaction models and the experimental data, at the level of the composition-sensitive observables, Figure 11 shows the mean values of different parameters measured by using data taken by the UMDs (top panels) and the water-Cherenkov detectors (lower panels) of Auger as a function of the mean value of X max for different values of the primary energy and zenith angle. The curves correspond to the predictions obtained by using EPOS-LHC and QGSJet-II.04. From the figure, it can be seen that the experimental data are incompatible with the predictions of the high-energy hadronic interaction models considered. Note that the experimental points fall outside the curves even when the systematic uncertainties are considered. Moreover, without considering the analysis based on the X µ max parameter, which is not directly related with the muon content of the showers, it can be seen that the tension between high-energy hadronic interaction models considered and data can be relaxed by increasing the number of muons in the simulated showers. In the second knee region, KASCADE-Grande could separate the flux in light and heavy components measuring the shower particles with shielded and unshielded surface detectors [89,90]. Note that with these two types of surface detectors, it is possible to separate the muonic and the electromagnetic components. In this case, the primary energy is reconstructed by using simulated showers. The heavy component presents a knee-like feature at ∼10 16.7 -10 16.8 eV when the data are analyzed with EPOS-LHC, QGSJet-II.04, and Sibyll 2.3d [91]. The light component presents a flattening at ∼10 17 eV. At this energy, the flux is dominated by the heavy component. If the cosmic ray flux in the knee region is dominated by protons, the iron knee should take place at E knee Adapted from [77]. Bottom left panel: Mean value of the natural logarithm of R µ as a function of the mean value of X max measured by the Auger water-Cherenkov detectors. Adapted from [76]. Bottom right panel: Mean value of X µ max parameter as a function of the mean value of X max measured by the Auger water-Cherenkov detectors. Adapted from [88]. The curves correspond to the predictions obtained for EPOS-LHC and QGSJet-II.04. extragalactic cosmic rays [90]. In this case, this feature is formed by the superposition of a galactic flux with the extragalactic one which starts to be dominant above ∼10 17 eV.
The increase in the ln A measured by TALE above 10 17 eV (see Figure 9) is in tension with this interpretation. Additionally, the iron fraction measured by TALE [64] does not show any evidence about the knee in the heavy component observed by KASCADE-Grande. Moreover, the ln A measured by IceTop in combination with IceCube also increases above 10 17 eV [7]. Moreover, in this case, the iron fraction does not show any change below 10 17 eV.
The discrepancies found between data taken by surface detectors in the second knee region can be due to the muon deficit of the simulated showers. As mentioned before, the tension among data taken by optical detectors in this region of the spectrum suggest the existence of large systematic uncertainties. More studies are required to understand the differences found.

Combined Analyses
The results of a study performed combining data taken by the fluorescence telescopes and the water-Cherenkov detectors of Auger were recently published in Ref. [92]. The primary energy of the events considered in the analysis ranges from 10 18.5 to 10 19 eV, and the parameters considered are X max and the signal deposited in the water-Cherenkov detectors at 1000 m from the shower axis, S(1000). Assuming that the composition of the cosmic rays is independent of the zenith angle, it was found that to alleviate the tension between the Auger data and EAS simulations, in addition to the increase in the number of muons of the simulated showers, a shift in the simulated X max parameter is also required. If this result is confirmed in future studies, it would imply that the muon deficit alone cannot explain the discrepancies between current high-energy hadronic interaction models and experimental data.
A scaled version of the same parameters has been considered by Auger to study the purity of the cosmic rays in the energy range from 10 18.5 to 10 19 eV. In this case, the correlation between the scaled version of the parameters is used to study the composition [61,93]. For energies between 10 18.5 and 10 18.7 eV, a pure composition is excluded with a significance larger than 6.4 σ. Moreover, the data can be explained only by a mixture of primary nuclei of a mass number heavier than helium, i.e., pure composition, and proton-helium mixtures are disfavored by the data. An important feature of this analysis is that it is nearly independent on the high-energy hadronic interaction models used to analyze the experimental data. Note that this result is consistent with the composition inferred from the X max parameter alone, measured by Auger.
There are two main interpretations about the formation of the ankle and the origin of the light component that dominates the flux between 10 18 and 10 18.5 eV. In the first scenario [94,95], the light component below the ankle originates in a different population of sources than the one that dominates the flux above the ankle, which includes heavier nuclei according to the Auger data. In the second scenario, the light component originates from the photodisintegration of high-energy and heavier nuclei in a photon field present in the sources or its environment [96][97][98][99][100][101]. In these two scenarios, the transition between the galactic and extragalactic components takes place in the second knee region.

High-Energy Photon and Neutrino Searches
Ultra-high-energy photons and neutrinos carry very important information about the cosmic-ray accelerators and the propagation of the cosmic rays in the intergalactic medium. In particular, its arrival direction points back to the source, as they are neutral particles and then they are not deflected by the galactic and extragalactic magnetic fields. Moreover, knowing the photon flux level is very important for an accurate energy calibration of surface detectors with fluorescence telescopes (see, for instance, Ref. [102]).
As mentioned before, a flux of high-energy photons and neutrinos is expected due to the interaction of the extragalactic cosmic rays with the radiation field present in the intergalactic medium. Nuclei of ultra-high energies can interact with the low-energy photons of the extragalactic background light, the cosmic microwave background, and the radio background. Neutrinos are produced in pion decays, generated through photo-pion production and nuclear decay. The main decay channels of charged pions are: π + → µ + + ν µ → e + + ν e +ν µ + ν µ and π − → µ − +ν µ → e − +ν e + ν µ +ν µ . Note that the muon decay also contributes to the neutrino generation. The expected flavor ratio at Earth is 1 : 1 : 1 due to neutrino oscillations. Photons are produced in neutral pion decays (π 0 → γ + γ), generated in photo-pion production, and also due to the inverse Compton of high-energy electrons with low-energy photons of the background (e ± + γ b → e ± + γ). Note that high-energy electrons are generated by the pair production of high-energy nuclei with the low-energy photons of the background. They are also created in the beta decay of nuclei and the decay of muons from the decay of charged pions. The high-energy photons and electrons form an electromagnetic cascade in the intergalactic medium.
High-energy photons arriving at Earth generate EASs that are almost electromagnetic. They are characterized by having a much deeper X max and very small muon content. For this reason, it is easier to separate hadrons from photons than heavy from light hadrons. At present, there are only upper limits to the integrated photon flux obtained by different experiments. The most restrictive ones are obtained by Auger and Telescope Array, which are shown in Figure 12. The figure also shows the integrated photon flux corresponding to two extreme models in which the ultra-high-energy cosmic-ray flux is dominated by protons and by iron nuclei [14]. From the figure, it can be seen that the upper limits are reaching the integrated flux values corresponding to the proton model. Upper limits on the integral photon flux at 95% confidence level obtained by Auger [103,104] and Telescope Array [105,106]. The shaded regions correspond two models in which the ultra-high-energy cosmic rays are dominated by protons and iron nuclei [14]. The dashed-dotted curve corresponds to the integral photon flux originated by the decay of hypothetical super-heavy dark matter particles of mass M χ = 10 10 GeV/c 2 and lifetime τ χ = 3 × 10 23 year [107]. Adapted from [104].
Photon searches at the highest energies have been motivated in the past by the top-down models, in which the ultra-high-energy cosmic rays originate in the decay of super-heavy relic particles present in the halo of our Galaxy or by the decay of topological defects (see Ref. [108]). In these models, the ultra-high-energy cosmic-ray flux is dominated by photons. These top-down models are disfavored by current data, but it is still possible that a minority component contributes to the total flux. Figure 12 shows the high-energy integral photon flux due to the decay of super-heavy dark matter particles, located in the halo of our Galaxy, of mass M χ = 10 10 GeV/c 2 (c is the speed of light) and lifetime τ χ = 3 × 10 23 year [107]. From the figure, it can be seen that such a model is still compatible with the most stringent upper limits found up to now.
The neutrino search at the highest energies is conducted by using different experimental techniques. There are two main methods to search for neutrinos in cosmic-ray observatories. The first is based on the fact that the neutrino cross section is so small that it is more probable that a neutrino initiates a shower close to the horizontal direction and very deep in the atmosphere. One of these showers is very easy to identify, as hadronic or even electromagnetic showers develop significantly earlier in the atmosphere. The other method is based on the fact that tau neutrinos that propagate skimming the Earth, of zenith angles larger than 90 • , can interact in the Earth and produce a tau lepton, which can decay in the atmosphere, initiating a shower. These showers are also easy to identify because they develop in the up-going direction.
There are also dedicated neutrino observatories, such as IceCube, which is a cubic-kilometer particle detector made of Antarctic ice. It is buried beneath the surface, reaching a depth of about 2.5 km. In this case, neutrinos are detected when they interact with the molecules of the ice, producing relativistic charged particles, which emit Cherenkov radiation measured with optical modules. These optical modules are placed in a hexagonal grid of 85 strings starting at 1450 m and reaching 2450 m depth. On the other hand, ANITA consists of an array of radio antennas installed on a balloon flying at a height of ∼37 km in Antarctica. ANITA was designed to measure impulsive radio emissions from a neutrinos-initiated cascade in the Antarctic ice. It can also observe extensive air showers induced by cosmic rays or other particles. Figure 13 shows the upper limits on the differential neutrino flux, at a 90% confidence level, obtained by IceCube [109], Auger [110], and ANITA [111], which are the most restrictive. In the figure, the expected neutrino flux for the same models considered before can also be seen. Note that the IceCube and Auger upper limits are reaching the flux level of the proton models. 17 18 19 20 21 log(E/eV)  Figure 13. Upper limits on the differential total neutrino flux at 90% confidence level obtained by IceCube [109], Auger [110], and ANITA [111]. The shaded regions correspond to two models (same as the ones in Figure 12) in which the ultra-high-energy cosmic rays are dominated by protons and iron nuclei [14].
As mentioned in the introduction, the composition at the highest energies is a crucial information for predicting the ultra-high-energy photon and neutrino fluxes. Another important parameter is the maximum energies reached by the cosmic rays injected by the sources. The models that best fit the Auger flux and composition predict very low neutrino and photon fluxes [112]. However, the neutrino flux predicted by the models that best fit the Telescope Array flux and composition is larger [113]. This is mainly due to the larger values of the fitted maximum energy and also, but to a lesser extent, to the fitted lighter composition. In any case, more statistics in the suppression region are required to obtain a more accurate fit of the data.
Additionally, the ultra-high-energy photons and neutrinos fluxes can constrain the composition of the ultra-high-energy cosmic rays independently of the high-energy hadronic interaction models [21,22]. As can be seen from Figures 12 and 13, the neutrino and photon fluxes in models dominated by light nuclei are larger than the ones dominated by heavy nuclei. This is mainly due to the fact that the threshold energy of the photo-pion production increases with the mass number and, because the injection spectrum decreases with the cosmic ray energy, there are less particles in heavy nuclei dominated models that undergo photo-pion production.

Future Perspectives on Composition
The data considered in previous sections were taken by detectors that are currently in operation. These experiments will continue taking data and refining the methods used in composition analyses. An example of the progress in the methods used to study the composition is the increasing use of machine learning techniques [7,81,114,115], which have proven to be powerful tools for reconstructing composition sensitive parameters and also in the composition determination.
Some of the current observatories will be or are being upgraded. In particular, Auger started an upgrade of the observatory, which has the detailed study of the composition from the transition region up to the highest energies as one of its main objectives. The Auger upgrade is known as AugerPrime [16,116]. The improvements relevant for composition are: the addition of plastic scintillation detectors on top of the water-Cherenkov detectors to separate the electromagnetic from the muonic components of the showers; the addition of underground muon detectors in the infill region to measure the muonic component of the showers in the transition region; the extension of the current fluorescence telescopes measurements into periods of higher night-sky background in order to increase their duty cycle; and the addition of a radio detector to each water-Cherenkov station which can measure X max and other compositionsensitive parameters. These enhancements will increase the mass composition sensitivity of the observatory. One of the main goals of the upgrade is to measure the composition at the highest energies, in the suppression region, by using the upgraded surface detectors, which can collect about one order of magnitude more statistics than the fluorescence detectors.
Additionally, Telescope Array started an upgrade of the observatory called TAx4 [117]. It consists of an increase in the detection area by adding new surface detectors and fluorescence telescopes. The detection area increased from ∼700 to ∼2800 km 2 . With this new detection area, it will be possible to measure the composition from the northern hemisphere by using both surface and fluorescence detectors with much larger statistics.
IceCube is also planning an upgrade called the IceCube-Upgrade [118]. It will consist of an enhancement of the IceTop surface detectors by adding scintillation detectors, radio detectors, and possibly small non-image Cherenkov telescopes. One of the main goals of this upgrade is also to increase the mass composition sensitivity of the observatory.
Next generation observatories are currently being planned. In particular, IceCube-Gen2 [119], the successor of IceCube, will measure cosmic rays at low energies. It will consist of an optical array in the deep ice, a large-scale radio array, and a surface detector above the optical array. It will be able to measure composition with more sensitivity and it will increase its maximum energy up to ∼10 18 eV. At the highest energies, the POEMMA project is designed to measure ultra-high-energy cosmic rays and neutrinos from the space [120]. It will consist of two identical orbital fluorescence telescopes being able to measure the X max parameter at the highest energies with unprecedented statistics. The GRAND project [121] is designed to detect ultra-high-energy neutrinos, cosmic rays, and gamma rays by using the radio technique. It will consist of 2 × 10 5 radio antennas covering an area of 2 × 10 5 km 2 , which will be separated into 20 sub-arrays of ∼10 4 km 2 . It will be able to measure the X max parameter in a wide energy range with large statistics. Finally, GCOS is a starting project intended to design a next-generation observatory with an aperture of at least one order of magnitude larger than the one corresponding to the observatories that are currently in operation [122]. One of its main objectives is to improve the mass resolution, based on the experience obtained from AugerPrime and TAx4.
The data collected by the planned upgrades and also by the future observatories will constrain even more the high-energy hadronic models used in shower simulations. Moreover, analyses like the one reported in Ref. [123], where the fluctuations of the R µ parameter are studied and compared with model predictions, can contribute to understand more deeply the discrepancies found. Moreover, the data from the high-luminosity LHC [124] run will play an important role on the improvement of the current high-energy hadronic interaction models.

Conclusions
The origin of cosmic rays is still an open issue in high-energy astrophysics. The composition of the primary particle is key information to understanding this phenomenon. In the last years, a big effort has been made to determine it. The main limitation comes from the incompatibility of current high-energy hadronic interaction models, used to simulate the EAS required to determine the primary mass, with the experimental data.
The determination of the composition is mainly achieved by using optical detectors that measure the depth of the shower maximum. In particular, the mean value of X max measured by different experiments falls between the proton and iron expectations obtained by using current high-energy hadronic interaction models. There are also several composition parameters that come from the surface detector of which the most sensitive to the nature of the primary is the muon content of the showers. In general, the composition obtained from parameters measured by surface detectors is incompatible with the one obtained from X max . These discrepancies are larger at the highest energies, where the composition obtained by some of the parameters falls above the expectations for iron nuclei, obtained by using current models. These incompatibilities are usually interpreted as a deficit in the number of muons predicted by current high-energy hadronic interaction models. Even though the hadronic part of the simulated cascades seems to be the source of the discrepancies, the electromagnetic part, which affects the X max predictions, can also play an important role as showed by recent analyses.
In the energy interval between 10 15 and 10 18 eV, the composition obtained from the X max measured by different experiments presents large differences. The composition measurement in this energy range is very important for understanding the nature of the second knee, which can be interpreted as the region where the transition between the galactic and extragalactic component takes place, as suggested by the KASCADE-Grande data. The IceCube data cannot corroborate this hypothesis, but this can be due to the systematic uncertainties introduced by current high-energy hadronic interaction models used to analyze the experimental data, which were showed to be more important in composition analyses based on surface detectors information.
Above ∼10 18 eV, the composition measurements obtained by different experiments that measured the X max are compatible within statistical and systematic uncertainties. The composition seems to be light between 10 18 and 10 18.5 eV. At higher energies, the composition is compatible with an increasingly heavier average mass as a function of primary energy. In this energy region, the discrepancies between the composition obtained from the X max parameter and the muon component of the showers are the largest.
The importance of the composition information on many aspects of cosmic ray studies has become more relevant in recent years. This motivated several upgrades of current observatories with a special interest in the increase in the mass sensitivity of the detectors. Moreover, the next generation of cosmic-ray observatories are also being designed, taking into account the importance of the mass composition determination. Moreover, the planned, much larger detection areas in combination with the enhanced mass composition sensitivity of future detectors will allow to constrain or even measure the photon and neutrino fluxes at the highest energies, which can be used to constrain the composition in the suppression region.
An improvement on current high-energy hadronic interaction models is required to make progress on the composition determination. This can be achieved in the near future, based on the synergy between the EAS studies and the ones conducted in particle physics, especially with the future high-luminosity LHC data.