Understanding the transverse-spherocity biased data from pp collisions at the LHC energies

The ALICE collaboration recently reported the mean transverse momentum as a function of charged-particle multiplicity for different classes of pp collisions defined by the "jettiness" of the event. The event "jettiness" is quantified using transverse spherocity, which is measured at midpseudorapidity ($|\eta|<0.8$) considering charged particles with transverse momentum within $0.15<p_{\rm T}<10$ GeV/$c$. Comparisons to PYTHIA 8 (tune Monash) predictions show a notable disagreement between the event generator and data for jetty events that increases as a function of charged-particle multiplicity. This paper reports on the origin of this disagreement using the PYTHIA 8 event generator. Since at intermediate and high $p_{\rm T}$ ($2<p_{\rm T}<10$ GeV/$c$) the spectral shape is expected to be modified by color reconnection or jets, their effects on the average $p_{\rm T}$ are studied. The results indicate that the origin of the discrepancy is the multijet yield, which PYTHIA 8 overpredicts by an amount that increases with the charged-particle multiplicity. This finding is important to understand the way transverse spherocity and multiplicity bias the pp collisions, and how well models like PYTHIA 8 reproduce those biases. The studies are pertinent since transverse spherocity is currently used as an event classifier by experiments at the LHC.


Introduction
Quantum Chromodynamics (QCD) predicts that at very high energy densities, ordinary nuclear matter undergoes a crossover transition to primordial hot QCD matter (deconfined quarks and gluons). This state of matter, called quark-gluon plasma (QGP), existed a few microseconds after the Big Bang [1,2]. The QGP can be recreated in the laboratory through high-energy heavy-ion collisions [3,4]. For this purpose, heavy ions have been collided at ultra-relativistic center-of-mass energies per nucleon pair at RHIC and the LHC. LHC data support the formation of a medium with a lower-bound energy density between 10 and 20 GeV/(fm$^2$ $c$) and an effective temperature of almost 300 MeV [5,6]. This fireball has been shown to be a strongly interacting system of quarks and gluons with a very low viscosity-to-entropy ratio (nearly an ideal fluid) that exhibits hydrodynamic behaviour [7,8]. Once created, the system expands and cools down very fast, with a characteristic decoupling time of approximately 10 fm/$c$ [9]. Among the observables used to study the QGP, event-by-event fluctuations of quantities like the number of charged particles (multiplicity) or the mean transverse momentum ($\langle p_{\rm T}\rangle$) are important to understand the dynamical variations associated with the formation of the medium [10]. The collective expansion of the system is responsible for the shape of the $p_{\rm T}$ spectra at low and intermediate transverse momentum ($p_{\rm T} \lesssim 4$ GeV/$c$) [11]. Above this threshold, the distributions are a consequence of the initial hard scatterings in the collisions. On top of this, jet quenching, which originates from the energy loss of partons traversing the dense medium created in the collision, remains a key observable in the study of the QGP [12].
Before the start of the LHC, proton-proton (pp) and proton-lead (p-Pb) collisions, the so-called small collision systems, were simply treated as control experiments. However, one of the most unexpected results in high-energy physics in the last 15 years has been the discovery of QGP-like effects in high-multiplicity pp and p-Pb interactions. It all started with the observation of a long-range, near-side dihadron correlation (ridge structure) in pp collisions, which was afterwards also discovered in p-Pb collisions [13,14]. Since then, more and more collective effects in small systems have been unveiled [15]. The physics underlying these observations has been the subject of intense debate. Several theoretical approaches have been suggested to explain the QGP-like effects in small collision systems. For example, within the range of applicability of the color-glass condensate effective field theory, the flow-like behavior can be produced in the early stages of the collision [16,17]. Alternatively, the effects could develop during the collective evolution, where hydrodynamics is applicable [18,19,20]. Other approaches implemented in Monte Carlo event generators like PYTHIA 8 employ effects such as color reconnection or rope hadronization to provide a microscopic description of the system [21,22]. Indeed, it has been shown that such models and implementations in PYTHIA 8 can produce flow-like effects in pp collisions [23].
Charged-particle multiplicities ($N_{\rm ch}$) and $p_{\rm T}$ distributions have been extensively studied in small systems by experiments at the LHC and at RHIC [24,25,26,27,28,29,30]. The data indicate a clear increase of $\langle p_{\rm T}\rangle$ with increasing multiplicity. On the one hand, this phenomenon is described in PYTHIA 8 by multiparton interactions (MPI) together with color reconnection, which allows partons to interact via color strings before hadronization, thus hardening the $p_{\rm T}$ spectra at intermediate $p_{\rm T}$ but decreasing the average multiplicity [31]. On the other hand, in EPOS LHC, an event generator featuring a core-corona effect, the increase of $\langle p_{\rm T}\rangle$ is determined by the collective expansion of the system [32].
One issue when pp collisions are analyzed as a function of the charged-particle multiplicity (measured in a narrow pseudorapidity interval) is that high-multiplicity events are biased towards multijet final states. The effect is illustrated when the $p_{\rm T}$ spectra of high-multiplicity pp collisions are normalized to the analogous quantity measured in minimum-bias pp collisions. The ratio shows a continuous rise with increasing $p_{\rm T}$, and it gets steeper for larger multiplicity values [33]. One way to mitigate the bias was proposed some years ago: the idea consists in measuring the "jettiness" of the event using event-shape observables like transverse spherocity [34,35,36]. Since then, different measurements have been reported using event-shape selections [37,38]. The ALICE collaboration reported the first measurement of $\langle p_{\rm T}\rangle$ as a function of multiplicity for different spherocity classes [33]. While results for minimum-bias and high-spherocity pp collisions (isotropic events) were well described by PYTHIA 8 tune Monash, the agreement broke down for low-spherocity pp collisions (jetty events). This was a surprise since Monash was obtained from a tune to LHC data, and is therefore known to describe several observables of unidentified charged particles [39]. In this paper, the origin of this discrepancy is studied focusing on color reconnection, as it is known to modify the $p_{\rm T}$-spectral shape at intermediate $p_{\rm T}$ ($2 < p_{\rm T} < 4$ GeV/$c$). Since the transition between soft and hard processes occurs in this $p_{\rm T}$ range, the impact of color reconnection and jets is also explored. Another motivation is the bias of the high-multiplicity pp sample towards multijet final states.
This paper is organized as follows: section 2 provides a brief description of event shapes and the Monte Carlo event generator. Section 3 explores the origin of the difference between data and PYTHIA 8, as well as the impact of jets to reconcile the event generator and data. Finally, section 4 summarizes the conclusions.

Spherocity and PYTHIA 8 event generators
Event shapes are sensitive to the spatial distribution of particles produced in a collision. They have been extensively used to characterize QCD in electron-positron collisions, for example in the extraction of the strong-coupling constant, in understanding the hadronization process, or even in the parton-shower tuning of event generators [40]. In hadronic interactions, event shapes are restricted to the transverse plane relative to the beam direction, making the observables insensitive to the boost along the beam direction [41].
Among the event shapes currently in use, transverse spherocity ($S_0$) has been shown to be a good tool to classify high-multiplicity pp collisions as either multijet final states (jetty) or uniform particle emission (isotropic) [37]. Jetty events are associated with hard partonic scatterings, while isotropic events are related to collisions in which several semi-hard parton-parton scatterings occur within the same pp collision. Transverse spherocity, from now on simply called spherocity, is defined relative to a unit vector ($\hat{n}$) that minimizes the ratio:

$$S_0 = \frac{\pi^2}{4} \min_{\hat{n}} \left( \frac{\sum_i |\vec{p}_{{\rm T},i} \times \hat{n}|}{\sum_i p_{{\rm T},i}} \right)^2$$

Spherocity is a normalized quantity and, as a consequence, has extreme values 0 and 1, corresponding to jetty and isotropic events, respectively. Since spherocity is implicitly multiplicity dependent, the analysis has to be double differential in order to disentangle the multiplicity from the event-shape effects. For a fixed multiplicity value, the multiplicity effect is factorized, and therefore any modification of particle production can be attributed to the event-shape selection. Nonetheless, selecting pp collisions with high charged-particle multiplicity biases the sample, affecting different observables. For instance, the neutral-to-charged kaon ratio is known to decrease when the midrapidity charged-particle multiplicity increases [42]. In PYTHIA 8, events with an isotropic distribution of particles are associated with high underlying-event (UE) activity and therefore with a large number of MPI. Conversely, the UE activity decreases when spherocity is reduced.
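The spherocity computation can be illustrated with a minimal Python sketch. It assumes the commonly used $p_{\rm T}$-weighted form $S_0 = (\pi^2/4)\,\min_{\hat n}(\sum_i |\vec p_{{\rm T},i}\times\hat n| / \sum_i p_{{\rm T},i})^2$ and finds the minimum with a brute-force scan over the azimuthal angle of $\hat n$. The function name, the input format (a list of ($p_{\rm T}$, $\varphi$) pairs) and the number of scan steps are illustrative choices, not taken from the analysis.

```python
import math

def transverse_spherocity(tracks, n_steps=3600):
    """Return S0 for a list of (pT, phi) pairs (phi in radians).

    Assumption: pT-weighted definition, minimised by scanning the
    azimuthal angle of the unit vector n on a uniform grid.
    """
    sum_pt = sum(pt for pt, _ in tracks)
    if len(tracks) < 3 or sum_pt <= 0.0:
        raise ValueError("need at least three tracks with positive pT")
    best_ratio = float("inf")
    for k in range(n_steps):
        # n and -n give the same ratio, so scanning [0, pi) is enough.
        theta = math.pi * k / n_steps
        # In the transverse plane |pT x n| = pT * |sin(phi - theta)|.
        cross_sum = sum(pt * abs(math.sin(phi - theta)) for pt, phi in tracks)
        best_ratio = min(best_ratio, cross_sum / sum_pt)
    return (math.pi ** 2 / 4.0) * best_ratio ** 2
```

With this normalisation a pencil-like (back-to-back) configuration gives a value close to 0, while many tracks spread uniformly in azimuth give a value close to 1.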
Since several physics aspects cannot be derived on theoretical grounds, event generators rely on sets of tuned parameters to enhance their predictive power. In this paper the Monash tune, a set of parameters extracted from $e^+e^-$ and pp data, is used in the PYTHIA 8 simulations [43,39]. During the development of the parton shower in a hadronic collision, quarks and gluons are connected by colored strings. Originally, string models were based on the leading-color approximation, where generated partons were color connected only to their parent emitters. In this sense, the products from different MPIs were kept independent. Color reconnection (CR) allows the interaction among partons from originally uncorrelated MPIs, implying a far richer color topology than the original leading-color method [44]. The MPI-based color reconnection model implemented in PYTHIA 8 Monash sets the probability for a low-$p_{\rm T}$ parton to be merged with a higher-$p_{\rm T}$ one, in such a way that the total string length is minimized. The CR mechanism is governed by a parameter called the reconnection range (RR), with a value of 1.8 for the Monash tune.

Average transverse momentum as a function of multiplicity and spherocity
In order to compare with available ALICE data, only pp collisions producing at least one charged particle with $p_{\rm T} > 0$ in the pseudorapidity interval $|\eta| < 1$ are used; this particular selection is called INEL>0. Furthermore, particles are required to be promptly produced in the collision, including all decay products except those from weak decays. Following the experimental definition of spherocity, each selected event must contain at least three particles, and a restricted $p_{\rm T}$ range of 0.15 to 10 GeV/$c$ is required [33]. With this information it is possible to compute an $S_0$ distribution for each event multiplicity and derive the $S_0$ percentiles corresponding to each $N_{\rm ch}$ interval. Events falling within the 0-10% spherocity class are labeled as jetty, while those in the opposite extreme range of 90-100% are the isotropic events. Finally, $\langle p_{\rm T}\rangle$ is computed per event multiplicity in each $S_0$ class.
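The percentile-based classification described above can be sketched in Python as follows. This is only an illustration under stated assumptions: each event is a dictionary with hypothetical keys `nch` (multiplicity), `s0` (spherocity) and `pts` (list of track $p_{\rm T}$ values), the percentile cuts are taken per multiplicity value by simple rank ordering, and $\langle p_{\rm T}\rangle$ is taken as the average over all tracks of all events in a class.

```python
from collections import defaultdict

def mean_pt(events):
    """Average pT over all tracks of the given events (None if empty)."""
    pts = [pt for ev in events for pt in ev["pts"]]
    return sum(pts) / len(pts) if pts else None

def mean_pt_by_class(events, percent=10):
    """Return {nch: {"jetty": <pT>, "isotropic": <pT>}}.

    Jetty = lowest `percent` % of S0 at fixed multiplicity,
    isotropic = highest `percent` % (assumed rank-based percentiles).
    """
    by_nch = defaultdict(list)
    for ev in events:
        by_nch[ev["nch"]].append(ev)
    result = {}
    for nch, group in by_nch.items():
        ordered = sorted(group, key=lambda ev: ev["s0"])
        n_sel = (len(ordered) * percent) // 100  # events per extreme class
        jetty = ordered[:n_sel]
        isotropic = ordered[len(ordered) - n_sel:]
        result[nch] = {"jetty": mean_pt(jetty), "isotropic": mean_pt(isotropic)}
    return result
```

Computing the cuts separately for each multiplicity value is what makes the selection double differential: at fixed $N_{\rm ch}$, the two classes differ only in event shape.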

Color reconnection effects
Figure 1 presents $\langle p_{\rm T}\rangle$ as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ for isotropic and jetty events. Data are compared with PYTHIA 8 Monash and EPOS LHC predictions [32]; the results are fully consistent with those reported in Ref. [33]. For isotropic events, both event generators correctly reproduce the data when ${\rm d}N_{\rm ch}/{\rm d}\eta > 12$. For jetty events, only EPOS LHC describes the data within the full multiplicity range. Indeed, PYTHIA 8 Monash falls well outside the systematic uncertainties. Furthermore, the data-to-model ratio shows a minimum at ${\rm d}N_{\rm ch}/{\rm d}\eta \approx 10$ followed by a maximum and then a rapidly increasing divergence. It is worth mentioning that EPOS LHC is based on the Parton-Based Gribov-Regge Theory, where nucleon-nucleon collisions are addressed at the parton level via Pomeron exchanges that generate parton ladders [32]. These are in turn considered as flux tubes or strings that can decay via the emission of quark-antiquark pairs. EPOS LHC implements a core-corona effect, where the core region has a larger density of strings relative to the corona. The event generator is tuned to reproduce the collective effects observed in small systems at the LHC.
In order to understand the disagreement between PYTHIA 8 Monash and data for jetty events, the contribution of particles within different $p_{\rm T}$ intervals to $\langle p_{\rm T}\rangle$ is studied. Figure 2 presents $\langle p_{\rm T}\rangle$ as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ for isotropic and jetty events. The contributions of particles with transverse momentum within $0.15 < p_{\rm T} < 2$ GeV/$c$ (low-$p_{\rm T}$ particles), $2 < p_{\rm T} < 4$ GeV/$c$ (intermediate-$p_{\rm T}$ particles) and $4 < p_{\rm T} < 10$ GeV/$c$ (high-$p_{\rm T}$ particles) are shown. Isotropic events are fully dominated by low-$p_{\rm T}$ particles up to ${\rm d}N_{\rm ch}/{\rm d}\eta \approx 30$. As a result, below this threshold the low-$p_{\rm T}$ contribution closely follows the shape of the inclusive $\langle p_{\rm T}\rangle$. For high-multiplicity events, the increase of $\langle p_{\rm T}\rangle$ is influenced by intermediate-$p_{\rm T}$ particles. Jetty events show a completely different behavior. Up to ${\rm d}N_{\rm ch}/{\rm d}\eta \approx 30$, the shape of the inclusive $\langle p_{\rm T}\rangle$ is mostly due to low- and intermediate-$p_{\rm T}$ particles, but there is also a non-negligible contribution from high-$p_{\rm T}$ particles. Finally, the fast increase of $\langle p_{\rm T}\rangle$ in the high-multiplicity regime (the third rise of the average $p_{\rm T}$ as a function of multiplicity) is mainly driven by high-$p_{\rm T}$ particles.
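One possible way to split $\langle p_{\rm T}\rangle$ into contributions from the three $p_{\rm T}$ intervals is sketched below. It assumes an additive decomposition, $\langle p_{\rm T}\rangle = \sum_k (1/N)\sum_{i\in k} p_{{\rm T},i}$, where $N$ is the total number of selected particles; whether the figure was built exactly this way is not stated in the text, and the function name and the half-open bin convention are illustrative.

```python
def mean_pt_contributions(pts, edges=(0.15, 2.0, 4.0, 10.0)):
    """Return (<pT>, [contribution per pT interval]).

    Assumption: each interval contributes (sum of pT in the interval) / N,
    with N the number of tracks inside [edges[0], edges[-1]), so that the
    contributions add up to the inclusive <pT>.
    """
    selected = [pt for pt in pts if edges[0] <= pt < edges[-1]]
    if not selected:
        raise ValueError("no tracks inside the accepted pT range")
    n = len(selected)
    contributions = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        contributions.append(sum(pt for pt in selected if lo <= pt < hi) / n)
    return sum(selected) / n, contributions
```

With this convention a handful of high-$p_{\rm T}$ tracks can carry a sizeable share of $\langle p_{\rm T}\rangle$ even though they are a small fraction of the multiplicity, which is the pattern described for jetty events.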
Since CR is known to modify the $p_{\rm T}$ spectral shape, different CR models were tested. The first one employs a newer CR method available in PYTHIA 8 that is based on QCD rules to determine the string-length minimization. This model allows the formation of topological structures (junctions) when three colored strings meet at a single point. This implies that baryon production is enhanced with respect to the default CR approach [45]. The simulation with this method is labeled as PYTHIA 8 Monash (QCD-based CR) in Fig. 3. The agreement with data in isotropic events gets worse, in particular for ${\rm d}N_{\rm ch}/{\rm d}\eta < 25$. For jetty events there is basically no difference relative to PYTHIA 8 Monash. The second simulation is also based on PYTHIA 8 Monash but with a reconnection range value of 1.4, slightly smaller than the default one. This is labeled as PYTHIA 8 Monash (RR = 1.4) in Fig. 3. Lowering the RR reduces the reconnection probability, increasing the particle multiplicity and decreasing the $p_{\rm T}$ of the emitted particles. As a consequence, the average transverse momentum of the event is also reduced. The results confirm this for both isotropic and jetty events: $\langle p_{\rm T}\rangle$ from PYTHIA 8 Monash (RR = 1.4) is systematically below the prediction from PYTHIA 8 Monash (RR = 1.8). This slight modification of the RR greatly improves the agreement with data in jetty events; however, the third rise of $\langle p_{\rm T}\rangle$ is still observed. For isotropic events, the $\langle p_{\rm T}\rangle$ predicted by PYTHIA 8 Monash (RR = 1.4) is worse than the default case in the full multiplicity interval.
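For orientation, the three generator configurations compared here could be expressed as PYTHIA 8 setting strings, collected below in a small Python helper. Only the Monash tune, the RR values and the use of the QCD-based CR model come from the text; the beam and process settings (inelastic pp at 13 TeV) are assumptions, and any additional parameters that may accompany the QCD-based model in the original study are not specified and are therefore not included.

```python
# Common settings (assumed): pp at 13 TeV, inelastic processes, Monash 2013 tune.
BASE_SETTINGS = [
    "Beams:idA = 2212",
    "Beams:idB = 2212",
    "Beams:eCM = 13000.",
    "SoftQCD:inelastic = on",
    "Tune:pp = 14",
]

# Colour-reconnection variations discussed in the text.
CR_VARIATIONS = {
    "Monash (RR = 1.8)": [
        "ColourReconnection:mode = 0",   # MPI-based model (default)
        "ColourReconnection:range = 1.8",
    ],
    "Monash (RR = 1.4)": [
        "ColourReconnection:mode = 0",
        "ColourReconnection:range = 1.4",
    ],
    "Monash (QCD-based CR)": [
        "ColourReconnection:mode = 1",   # QCD-based model
        # Assumption: the newer beam-remnant model is needed for this mode;
        # check the PYTHIA 8 manual for the version in use.
        "BeamRemnants:remnantMode = 1",
    ],
}

def settings_for(label):
    """Return the full list of setting strings for one configuration."""
    return BASE_SETTINGS + CR_VARIATIONS[label]
```

Each string would then be passed to the generator through its `readString` method before initialisation.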
Figure 4 shows the probability density of charged-particle multiplicity for PYTHIA 8 Monash with the reconnection range values RR = 1.4 and RR = 1.8 (default). As mentioned above, a variation of RR induces a modification of the particle production. Both sets of simulations give the same results for ${\rm d}N_{\rm ch}/{\rm d}\eta \lesssim 25$. Above this value, setting RR = 1.4 overpredicts the number of high-multiplicity pp collisions relative to RR = 1.8, by up to $\sim$20% at ${\rm d}N_{\rm ch}/{\rm d}\eta = 50$.

Impact of jets on mean p T
Modifying the RR improves the agreement between PYTHIA 8 and data for jetty events, but it worsens the agreement for the isotropic ones. Besides that, as already stated, the third increase in $\langle p_{\rm T}\rangle$ at high multiplicity (${\rm d}N_{\rm ch}/{\rm d}\eta > 30$) is still observed even when varying the RR parameter. We therefore also studied the impact of jets on the discrepancy between PYTHIA 8 Monash and data, because jets are expected to modify the spectral shape at high $p_{\rm T}$ ($4 < p_{\rm T} < 10$ GeV/$c$). Bear in mind that CR effects on jet observables are expected to be negligible because, for example, in the MPI-based CR model it is easy to merge a low-$p_{\rm T}$ system with any other, but difficult to merge two high-$p_{\rm T}$ ones with each other [46].
A recent publication has reported the inclusive charged-particle jet differential cross section as a function of the jet $p_{\rm T}$ for different jet radii ($R$) in pp collisions at $\sqrt{s} = 13$ TeV for two configurations: with and without background subtraction [47]. For both sets of results and small values of $R$, PYTHIA 8 Monash overestimates the minimum-bias data by up to 40% at $p_{\rm T,jet} \approx 10$ GeV/$c$ and by 10% at very high $p_{\rm T,jet}$. The discrepancy is even larger if $R$ is increased. This is a clear indication that PYTHIA 8 Monash overestimates the jet yield. Moreover, the discrepancy gets worse with increasing charged-particle multiplicity. This could in turn explain the overpredicted $\langle p_{\rm T}\rangle$ for large ${\rm d}N_{\rm ch}/{\rm d}\eta$ in jetty events. A procedure aimed at quantifying the effect of such a jet excess on $\langle p_{\rm T}\rangle$ is followed.
PYTHIA 8 Monash simulations are employed using event and track selections similar to those described above, with some modifications: no upper limit on the particle transverse momentum and $|\eta| < 0.9$. FastJet is used as the jet finder with the anti-$k_{\rm T}$ algorithm [48]. The transverse momentum of jets is calculated with the boost-invariant $p_{\rm T}$ recombination scheme, and $|\eta_{\rm jet}| < 0.9 - R$ is required (in this case $R = 0.2$ is used). The jet-$p_{\rm T}$ spectrum measured by ALICE is normalized to the corresponding MC prediction. For $p_{\rm T,jet} > 5$ GeV/$c$, a survival probability given by this ratio is assigned to each jet-$p_{\rm T}$ interval. At this point two different selections are applied. The first one keeps events with at least one surviving jet (loose condition) and the second one requires events with all their jets surviving the selection (tight condition). Figure 5 shows the fraction of accepted events when applying the survival probability to the leading jet (loose selection) and to all jets in the event (tight selection). Since low-multiplicity events are associated with soft physics, such as diffractive events, the hard selection removes most of the events for ${\rm d}N_{\rm ch}/{\rm d}\eta < 5$. This is not an issue for our discussion because we are interested in the impact of jets at high multiplicities. At high multiplicity, the rejection factor is higher for the tight selection than for the loose selection.
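The event-level rejection described above can be illustrated with a short Python sketch. Several points are assumptions rather than details given in the text: the survival probability is passed in as a user-supplied function of the jet $p_{\rm T}$ (standing in for the data-to-MC ratio per interval), the loose condition is implemented as a test on the leading jet only, and events without any jet above 5 GeV/$c$ are rejected. All names are hypothetical.

```python
import random

def passes_selection(jet_pts, survival_prob, mode, rng, pt_min=5.0):
    """Decide whether an event survives the jet-excess removal.

    jet_pts       : list of jet pT values (GeV/c) in the event
    survival_prob : function pT -> probability in [0, 1] (data/MC ratio)
    mode          : "loose" (leading jet must survive) or
                    "tight" (all jets must survive)
    rng           : random.Random instance, for reproducibility
    """
    jets = sorted((pt for pt in jet_pts if pt > pt_min), reverse=True)
    if not jets:
        # Assumption: events with no jet above threshold are rejected.
        return False
    survives = [rng.random() < survival_prob(pt) for pt in jets]
    if mode == "loose":
        return survives[0]
    if mode == "tight":
        return all(survives)
    raise ValueError("mode must be 'loose' or 'tight'")
```

Because the tight condition requires every jet to survive, events with many jets are rejected more often than under the loose condition, consistent with the larger rejection factor reported at high multiplicity.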
The selected events are in turn used to compute $\langle p_{\rm T}\rangle$ as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ in the two spherocity classes. As the "excess" of jets has been removed, it is expected that jetty events become depleted in the $S_0$ distributions, shifting the average spherocity towards values closer to one. Figure 6 presents the spherocity distributions for low (left), intermediate (center) and high (right) multiplicities, considering INEL>0 events and those surviving the loose and tight selections. Only results from PYTHIA 8 Monash are shown. For quantitative comparison, the mean values of each $S_0$ distribution are also displayed. These plots confirm the expected behaviour: jetty events are depleted and the distributions shift towards larger $S_0$ values. Furthermore, the effect is more pronounced as the multiplicity increases.
Figure 7 presents $\langle p_{\rm T}\rangle$ as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ in pp collisions at $\sqrt{s} = 13$ TeV for isotropic and jetty events. PYTHIA 8 Monash results including INEL>0 collisions are compared with results obtained after applying the loose and tight selections. The prediction for isotropic events remains completely unaffected by both the loose and tight selections, while for jetty events there is a noticeable difference. Regarding the loose selection, the result is closer to data than the INEL>0 one for ${\rm d}N_{\rm ch}/{\rm d}\eta < 30$; above this threshold both results are basically the same. Regarding the tight selection, the average $p_{\rm T}$ as a function of multiplicity is modified in the full multiplicity interval. Indeed, with this selection the model is able to describe the data within systematic uncertainties. These results suggest that, for PYTHIA 8 Monash to describe the data, the high-multiplicity regime should be dominated by minijet topologies rather than multijet final states. A similar conclusion was reached by applying a machine-learning technique to data, which suggested that multiparton interactions are more relevant in data than in MC to produce high multiplicities [49]. In other words, the bias towards hard physics is not the same in data and MC.

Conclusion
A study of $\langle p_{\rm T}\rangle$ as a function of charged-particle multiplicity and spherocity has been presented. The main goal is to understand the origin of the discrepancy between data and PYTHIA 8 Monash for pp collisions with spherocity close to zero (jetty events). There, the average $p_{\rm T}$ in PYTHIA 8 Monash exhibits a steep increase with increasing multiplicity (${\rm d}N_{\rm ch}/{\rm d}\eta > 30$) that is not seen in data. This effect is called the third increase of the average $p_{\rm T}$ with multiplicity. This paper reports that the overestimate from PYTHIA 8 Monash at high charged-particle multiplicity is mainly due to high-$p_{\rm T}$ particles ($4 < p_{\rm T} < 10$ GeV/$c$). Different color reconnection models were tested, as well as the impact of jets.
• Regarding color reconnection, we show that slightly decreasing the reconnection range parameter notably reduces the discrepancy, improving the agreement for jetty events but still keeping the third increase of the average $p_{\rm T}$ with multiplicity. The reduction affects the results for isotropic pp collisions in the full measured multiplicity range. Other predictions, like multiplicity distributions, are still in agreement with data within 20%. Results from a newer color reconnection model based on QCD rules yield the same predictions as PYTHIA 8 Monash for jetty events, while the agreement with data for isotropic events worsens, in particular at low multiplicity.
• Regarding the impact of jets at high multiplicity, PYTHIA 8 Monash is known to overestimate the jet production, in particular at high multiplicities. Based on comparisons between jet yields measured in INEL>0 collisions, the jet excess in PYTHIA 8 Monash relative to data was estimated and used to get a rough estimate of the potential impact of this discrepancy on $\langle p_{\rm T}\rangle$. We define a survival probability of the event based on a data-motivated selection criterion applied to the leading jet. With this implementation, PYTHIA 8 Monash keeps the very good description of the data in isotropic events, while it reconciles the simulation with experimental measurements for ${\rm d}N_{\rm ch}/{\rm d}\eta \lesssim 30$ in jetty events. If the selection in PYTHIA 8 Monash is applied to both leading and subleading jets, the agreement between data and PYTHIA 8 Monash for jetty events is significantly improved in the full multiplicity interval. The results suggest that the third rise of the average $p_{\rm T}$ for ${\rm d}N_{\rm ch}/{\rm d}\eta > 30$ in PYTHIA 8 Monash can be attributed to the presence of multijet topologies. The implication is that, in data, high multiplicities may be dominated by minijet topologies (MPI) rather than by multijet final states.

Acknowledgments
Support for this work has been received from CONAHCyT under the Grants CB No. A1-S-22917, A1-S-21560 and CF No. 2042.

Figure 1: Average transverse momentum as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ for pp collisions at $\sqrt{s} = 13$ TeV for two different spherocity classes: isotropic (left) and jetty (right). Black points correspond to data and error bars are the associated systematic uncertainties. Data are compared to two different Monte Carlo predictions (solid lines). The green line is the result from PYTHIA 8 Monash, while the blue line is the result from EPOS LHC. The bottom panel presents the data-to-model ratio, where the shaded area around unity is the systematic uncertainty.

Figure 2: Average $p_{\rm T}$ as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ for pp collisions at $\sqrt{s} = 13$ TeV for two different spherocity classes: isotropic (left) and jetty (right). The black line is the $p_{\rm T}$-integrated distribution, the green area is for low-$p_{\rm T}$ particles with $0.15 < p_{\rm T} < 2$ GeV/$c$, the blue area for intermediate-$p_{\rm T}$ particles with $2 < p_{\rm T} < 4$ GeV/$c$, and the red area for high-$p_{\rm T}$ particles with $4 < p_{\rm T} < 10$ GeV/$c$.

Figure 3: Average $p_{\rm T}$ as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ for pp collisions at $\sqrt{s} = 13$ TeV for two different spherocity classes: isotropic (left) and jetty (right). Black markers correspond to data and error bars are the associated systematic uncertainties. Data are compared to three different Monte Carlo predictions (solid lines). The green line is the result from PYTHIA 8 Monash (default RR = 1.8), the purple line represents the PYTHIA 8 color reconnection based on a QCD model, and the orange line is PYTHIA 8 Monash (RR = 1.4). The bottom panel presents the data-to-model ratio, where the shaded area around unity is the systematic uncertainty.

Figure 4: Probability density of charged-particle multiplicity for PYTHIA 8 Monash with RR = 1.8 (default) and RR = 1.4, denoted by the green and orange solid lines, respectively. Markers correspond to data and error bars are the associated systematic uncertainties. The bottom panel shows the model-to-data ratio.

Figure 6: PYTHIA 8 Monash spherocity distributions for low (left), intermediate (center) and high (right) multiplicity. The green line displays the result for INEL>0 collisions and the other two lines are obtained for events surviving the jet excess removal: red for the loose selection and magenta for the tight one. The corresponding mean values of the distributions are also shown.

Figure 7: Average $p_{\rm T}$ as a function of ${\rm d}N_{\rm ch}/{\rm d}\eta$ for pp collisions at $\sqrt{s} = 13$ TeV for two different spherocity classes: isotropic (left) and jetty (right). Black markers correspond to data and error bars are the associated systematic uncertainties. Data are compared to three different Monte Carlo predictions (solid lines). The green line displays the result considering INEL>0 collisions and the other two lines are obtained for events surviving the jet excess removal: red for the loose selection and magenta for the tight one. The bottom panel presents the data-to-model ratio, where the shaded area around unity is the systematic uncertainty.