Optimized Probes of the CP Nature of the Top Quark Yukawa Coupling at Hadron Colliders

Darius A. Faroughy; Blaž Bortolato; Jernej F. Kamenik; Nejc Košnik; Aleks Smolkovič

doi:10.3390/sym13071129


¹

Physik-Institut, Universitat Zurich, CH-8057 Zurich, Switzerland

²

Jozef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia

³

Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, 1000 Ljubljana, Slovenia

^*

Author to whom correspondence should be addressed.

Symmetry 2021, 13(7), 1129; https://doi.org/10.3390/sym13071129

This article belongs to the Special Issue Higher Order Radiative Corrections in QCD


Abstract

We summarize our recent proposals for probing the

C P

-odd

i \tilde{κ} \bar{t} γ^{5} t h

interaction at the LHC and its projected upgrades directly using associated on-shell Higgs boson and top quark or top quark pair production. We first recount how to construct a

C P

-odd observable based on top quark polarization in

W b \to t h

scattering with optimal linear sensitivity to

\tilde{κ}

. For the corresponding hadronic process

p p \to t h j

we then present a method of extracting the phase-space dependent weight function that allows one to retain close to optimal sensitivity to

\tilde{κ}

. For the case of top quark pair production in association with the Higgs boson,

p p \to t \bar{t} h

, with semileptonically decaying tops, we instead show how one can construct manifestly

C P

-odd observables that rely solely on measuring the momenta of the Higgs boson and the leptons and b-jets from the decaying tops without having to distinguish the charge of the b-jets. Finally, we introduce machine learning (ML) and non-ML techniques to study the phase-space optimization of such

C P

-odd observables. We emphasize a simple optimized linear combination

α \cdot ω

that gives sensitivity similar to that of the fully fledged ML models studied. Using

α \cdot ω

we review sensitivity projections to

\tilde{κ}

at HL-LHC, HE-LHC, and FCC-hh.

Keywords:

top quark; Higgs boson; CP violation; optimized observables; beyond the Standard Model

1. Introduction

The interaction between the heaviest particles of the Standard Model (SM), the top quark t and the Higgs boson h, is an important target for the LHC experiments. It is precisely predicted within the SM. The measured top quark mass

m_{t}

and the electroweak condensate value v precisely determine the on-shell scalar (P- and

C P

-even) coupling

y_{t} = \sqrt{2} m_{t} / v

, while P- and

C P

-odd interactions are absent. Beyond the SM, effective operators of dimension-6 can break this correlation and result in more general (pseudo)scalar

\bar{t} t h

couplings

κ (\tilde{κ})

[1]

L_{h t} = - \frac{y_{t}}{\sqrt{2}} \bar{t} (κ + i \tilde{κ} γ_{5}) t h,

(1)

which reduce to the SM case at

κ = 1

,

\tilde{κ} = 0

.

C P

-violating h couplings, like

\tilde{κ}

, are particularly interesting as any sign of

C P

violation in Higgs processes would constitute an indisputable New Physics (NP) signal. Existing data on Higgs production and decays is already precise enough to constrain any isolated modification of the top Yukawa to

O (1)

[2,3,4]. However, all existing measurements are based on

C P

-even observables with very limited sensitivity to

C P

-odd modifications of the top quark Yukawa. In principle, indirect collider bounds from Higgs decay and production (

g g \to h

,

h \to γ γ

), and especially the low-energy bounds on electric dipole moments (EDMs) of atoms and nuclei that target specifically

C P

-odd effects [2,5,6], are currently more constraining than direct collider probes. However, these constraints are subject to assumptions about other Higgs interactions, and in particular in the case of EDMs also other contributions unrelated to the Higgs.

At the LHC it is possible to probe these couplings directly with two of the particles in Equation (1) on-shell (since

m_{h} < 2 m_{t}

one cannot probe these couplings with all three particles on-shell) in top-Higgs associated production processes

p p \to t h j

and

p p \to t \bar{t} h

[2,3,4,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23] (The loop induced partonic process

g g \to h \to t \bar{t}

depends on

κ^{2}

,

κ

, and

{\tilde{κ}}^{2}

already on the production side as it is dominated by the top quark loop [5]). The corresponding total cross sections scale as

κ^{2}

,

{\tilde{κ}}^{2}

(

t h j

also as

κ

), and are thus poorly sensitive to small nonzero

\tilde{κ}

. Linear sensitivity to

\tilde{κ}

on the other hand can be achieved by measuring P- and

C P

-odd observables.

We have recently addressed this challenge by identifying observables with optimal sensitivity to a single

C P

-odd parameter in both

t h

and

t \bar{t} h

associated production at the LHC, which can be realistically measured and exhibit close to optimal sensitivity to

C P

-odd interactions between the Higgs boson and the top quark [24]. The proposed observables in

t h

are based on optimization of top-spin correlations previously studied in

t \bar{t}

production [25]. Unfortunately, the overwhelming irreducible backgrounds make the

t h j

channel impractical. On the other hand, in the case of

\bar{t} t h

this procedure becomes intractable in practice and our construction relies instead on

C P

- and P-symmetry arguments. In total, we can identify 21 different

C P

-odd observables that can be constructed out of 5 measurable final state momenta and an additional triple-product asymmetry [26]. Namely, assuming

p p \to t \bar{t} h

production with semileptonically decaying tops, we combine the final state lepton momenta

p_{ℓ^{+}}, p_{ℓ^{-}}

, two b-jet momenta

p_{b}, p_{\bar{b}}

from decaying top quarks (although without discriminating their charges; efficient b-jet charge discrimination could allow one to construct further

C P

-odd observables, see e.g., [6,20]) and the Higgs momentum $p_{h}$ in different ways to construct C-even, P-odd laboratory frame observables $ω_{i}$ [24].

Due to the high dimensionality and complexity of the phase-space in this process with top quarks decaying semileptonically, using the complete kinematical information accessible experimentally to construct an optimal

C P

-odd observable is challenging. To this end we have employed neural networks (NN) trained on Monte-Carlo generated samples to efficiently parametrize the weight function of events across the multi-dimensional phase-space in order to maximize the statistical sensitivity to

\tilde{κ}

[27]. We show how the required P- and

C P

-symmetry properties of the NN-based observables can be imposed a priori. Finally, we compare in terms of optimality, a general

C P

-odd NN function of the phase-space to a linear combination of manifestly

C P

-odd variables.

The present paper serves as a pedagogical review of the work first reported in Refs. [24,27].

2. Optimal CP-Odd Observable in the bW → th Process

2.1. Parton Level $W b \to t h$ Analysis

We begin by studying the effects of

\tilde{κ}

on top spin observables in the idealized case of single top quark production in the partonic process

W (p) b (p^{'}) \to t (k) h (k^{'})

. Here the complete polarized scattering amplitudes can be found in a compact analytic form. This process can actually be connected to a more realistic

p p \to t h j

production in the high energy limit, where the W and b quark mass effects are negligible and the collinear emission of both initial state ‘partons’ can be described by the corresponding parton distribution functions. (See e.g., Section 3 of Ref. [28] for an extended discussion on the validity of this approximation.) Three diagrams contribute to such parton level Higgs-top production in the SM, shown in Figure 1. Neglecting furthermore the mass (and thus the corresponding Yukawa coupling) of the bottom quark, we consider only the first two of the diagrams in Figure 1. The formalism presented here is based on Refs. [25,29,30]. First, we introduce the spin projection operator [31]

P (s) = \frac{1}{2} (1 + γ_{5} \slashed{s}),

(2)

where

s^{μ}

is a top spin four-vector, defined in a general frame as

s^{μ} = (\frac{k \cdot \hat{s}}{m_{t}}, \hat{s} + \frac{k (k \cdot \hat{s})}{m_{t} (E_{t} + m_{t})}) .

(3)

Figure 1. Tree-level diagrams contributing to

W b \to t h

.

Vector

k

is the top quark momentum and

\hat{s}

is an arbitrary unit vector that represents polarization of the top quark. The physical significance of

\hat{s}

is revealed if we make a rotation-free boost to the top rest frame where we find

s^{* μ} = (0, \hat{s})

. Note that spatial component of four-vector

x^{μ}

in this case transforms as

x^{*} = x + (\frac{x \cdot k}{m_{t} (E_{t} + m_{t})} - \frac{x^{0}}{m_{t}}) k

upon boost to the top rest frame. Therefore

s^{2} = - 1

,

s \cdot k = 0

, and

\hat{s}

corresponds to the polarization of the top quark in its rest frame. Projection onto a well defined polarization of the top quark is achieved by placing the projector (2) in front of the bispinors at the amplitude level:

\begin{matrix} u & \to P (s) u, \end{matrix}

(4)

\begin{matrix} v & \to P (s) v . \end{matrix}

(5)

Sum over top quark polarizations r then becomes

\begin{matrix} \sum_{r} u (k, r) \bar{u} (k, r) & \to P (s) (\slashed{k} + m) P (s) = (\slashed{k} + m) P (s), \\ \sum_{r} v (k, r) \bar{v} (k, r) & \to P (s) (\slashed{k} - m) P (s) = (\slashed{k} - m) P (s), \end{matrix}

(6)

and the cross-section is linear in

s^{μ}

{| M |}^{2} = a + b_{μ} s^{μ} .

(7)

Here the polarization vector

\hat{s}

is arbitrary while

b_{μ}

contains all the information about the polarization of the top quark in the given process. The parton level cross section can be written as

d σ = Φ_{in} {| M |}^{2} d Γ_{t h},

(8)

where

Φ_{in}

is the initial state flux normalization and

d Γ_{t h}

is the

t h

phase space volume. On the other hand, in the top rest frame it is convenient to introduce the spin density matrix as

ρ = A + B_{i} σ_{i},

(9)

such that the unpolarized cross section is proportional to

\bar{{| M |}^{2}} = Tr [ρ] = 2 A

. Here

σ

are the Pauli matrices. In the density matrix formalism, the expectation value of a generic operator is obtained as

⟨O⟩ = Tr [ρ O] .

(10)

In particular, the polarized cross section along

\hat{s}

is obtained as the expectation value of the projector:

{| M |}^{2} = Tr [ρ \frac{1}{2} (1 + \hat{s} \cdot σ)] = A + B_{i} {\hat{s}}_{i} .

(11)

One can determine the rest-frame coefficients

A, B_{i}

from a,

b_{μ}

by comparing the expressions for polarized

{| M |}^{2}

, expressed via Equations (7) and (11). The result of this matching are explicit expressions:

A = a, B_{i} = - b^{i} + \frac{1}{m_{t}} (b^{0} - \frac{b \cdot k}{E_{t} + m_{t}}) k^{i} .

(12)
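The matching that leads to Equation (12) can be checked numerically. The sketch below builds the spin four-vector of Equation (3) for a random top momentum, verifies the two constraints on it, and compares the lab-frame expression of Equation (7) with the rest-frame expression of Equation (11). The numerical inputs (top mass value, random momentum, random $b^{μ}$) and all function names are illustrative assumptions and are not taken from the original analysis.

```python
import numpy as np

def spin_four_vector(k3, s_hat, m_t):
    """Spin four-vector s^mu of Eq. (3) for top 3-momentum k3 and unit vector s_hat."""
    E_t = np.sqrt(m_t**2 + k3 @ k3)
    ks = k3 @ s_hat
    s0 = ks / m_t
    s3 = s_hat + k3 * ks / (m_t * (E_t + m_t))
    return np.concatenate(([s0], s3))

def rest_frame_B(b4, k3, m_t):
    """Rest-frame coefficients B_i of Eq. (12) from contravariant b^mu = (b^0, b)."""
    E_t = np.sqrt(m_t**2 + k3 @ k3)
    b0, b3 = b4[0], b4[1:]
    return -b3 + (b0 - (b3 @ k3) / (E_t + m_t)) * k3 / m_t

def minkowski(u, v):
    """Minkowski product with metric signature (+, -, -, -)."""
    return u[0] * v[0] - u[1:] @ v[1:]

rng = np.random.default_rng(0)
m_t = 172.5                                  # GeV, assumed value for illustration
k3 = rng.normal(scale=200.0, size=3)         # random top 3-momentum
s_hat = rng.normal(size=3)
s_hat /= np.linalg.norm(s_hat)               # arbitrary unit polarization vector
b4 = rng.normal(size=4)                      # arbitrary illustrative b^mu

s4 = spin_four_vector(k3, s_hat, m_t)
k4 = np.concatenate(([np.sqrt(m_t**2 + k3 @ k3)], k3))

lhs = minkowski(b4, s4)                      # b_mu s^mu of Eq. (7)
rhs = rest_frame_B(b4, k3, m_t) @ s_hat      # B_i s_hat_i of Eq. (11)

print("s^2     =", minkowski(s4, s4))
print("s.k     =", minkowski(s4, k4))
print("lhs-rhs =", lhs - rhs)
```

The first two printed numbers test $s^{2} = - 1$ and $s \cdot k = 0$; the third tests that the lab-frame and rest-frame parametrizations of the polarized amplitude agree.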

The rest-frame polarization of the top quark along a vector

\hat{s}

is given by the expectation value of

O_{\hat{s}} = \hat{s} \cdot \frac{σ}{2}

,

⟨O_{\hat{s}}⟩ = B \cdot \hat{s} .

(13)

This observable can be determined for example by measuring the angular distribution of the charged lepton in the semi-leptonic top decay

t \to b (W \to ℓ ν)

since the charged lepton in top decay is considered to be an almost perfect top spin analyzer, i.e., the angular decay distribution vanishes when the lepton momentum is opposite to the spin of t [32]. The angular distribution in the top rest frame

\frac{1}{Γ_{t}} \frac{d Γ_{t}}{d cos θ_{ℓ}} = \frac{1}{2} (1 + ⟨O_{\hat{s}}⟩ cos θ_{ℓ})

(14)

then allows for experimental extraction of the

B^{i}

coefficients [32].

Here

θ_{ℓ}

is an angle between the lepton and the polarization axis

\hat{s}

in the top rest frame. The above construction shows that the vector

\hat{s}

is an arbitrary unit vector defined in the laboratory frame. A particular choice

\hat{s} = k / | k |

implies that

⟨O_{\hat{s}}⟩

measures the top quark helicity. Another natural choice for

\hat{s}

is the W momentum

\hat{p}

, also known as the beam basis, which has to be carefully defined in actual

p p

collisions where the (symmetric) initial state does not allow one to define forward and backward directions. Experimentally one has to reconstruct the top quark rest frame in order to be able to trace the angular distribution of the lepton with respect to the chosen

\hat{s}

and gain access to the coefficients

B_{i}

. In the following we will show how to optimize the choice of

\hat{s}

such that the sensitivity to the

C P

-violating parameter

\tilde{κ}

is maximized.
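The extraction of the polarization from the lepton angular distribution in Equation (14) can be illustrated with a toy sample. Integrating Equation (14) gives $⟨cos θ_{ℓ}⟩ = ⟨O_{\hat{s}}⟩ / 3$, so three times the sample mean estimates the polarization along $\hat{s}$. The sketch below is a minimal illustration under that assumption; the sampler, the function names and the input value 0.3 are hypothetical.

```python
import numpy as np

def sample_cos_theta(polarization, n, rng):
    """Draw cos(theta) from the density (1 + P cos(theta)) / 2 on [-1, 1]
    by inverting its cumulative distribution."""
    u = rng.random(n)
    if abs(polarization) < 1e-12:
        return 2.0 * u - 1.0                 # flat distribution for P = 0
    p = polarization
    return (-1.0 + np.sqrt((1.0 - p) ** 2 + 4.0 * p * u)) / p

def estimate_polarization(cos_theta):
    """Estimate P from <cos(theta)> = P / 3."""
    return 3.0 * np.mean(cos_theta)

rng = np.random.default_rng(1)
true_P = 0.3                                 # hypothetical input polarization
cos_theta = sample_cos_theta(true_P, 200_000, rng)
P_est = estimate_polarization(cos_theta)
print("estimated polarization:", P_est)
```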

In the

W b

center-of-mass frame we can define the W and t momenta as

\begin{matrix} \hat{p} & = (0, 0, 1), \\ \hat{k} & = (sin θ, 0, cos θ), \end{matrix}

(15)

where

θ

is the angle between the direction of the top quark and the W boson. We have set the azimuthal angle

ϕ = 0

without loss of generality. The polarization vector components

B_{i}

in this case depend on

x = cos θ

. Evaluation of the two diagrams in Figure 1 leads to the coefficients

B_{1, 2, 3} (x)

of the polarized cross-section. It turns out that in the coordinate system in Equation (15) the analytical expression for

B_{2} (x)

is linear in

\tilde{κ}

,

B_{2} (x) \equiv β (x) \tilde{κ},

(16)

whereas

B_{1, 3}

do not contain linear

\tilde{κ}

terms. Effectively this means that we should choose the vector

\hat{s}

to be orthogonal to the plane spanned by the W and t momenta in order to probe

\tilde{κ}

with linear sensitivity. Similar results have been reported in Ref. [18]. In pursuit of maximal sensitivity to

\tilde{κ}

we choose the polarization vector in each event as

\hat{s} = \hat{p} \times \hat{k} / | \hat{p} \times \hat{k} |

. In this case an interesting experimental quantity is the two-fold differential cross-section

\begin{matrix} \frac{d^{2} σ}{d x d cos θ_{ℓ}} (W b \to h b ℓ ν) & = Σ (x, \hat{s}) \frac{Br (t \to b ℓ ν)}{2} (1 + cos θ_{ℓ}) \\ + Σ (x, - \hat{s}) \frac{Br (t \to b ℓ ν)}{2} (1 - cos θ_{ℓ}), \end{matrix}

(17)

where we have approximated the intermediate top quark as a narrow resonance and

Σ (x, \hat{s}) = d σ / d x (W b \to t^{(\hat{s})} h)

is the differential production cross section for the top quarks polarized in the

\hat{s}

direction. Using Equation (11) and inserting

\hat{s}

we have

Σ (x, \pm \hat{s}) = Φ_{in} (A (x) \pm \tilde{κ} β (x))

, where

Φ_{in}

is the initial flux normalization. Thus we can write Equation (17) as

\frac{d^{2} σ}{d x d cos θ_{ℓ}} (W b \to h b ℓ ν) = Φ_{in} Br (t \to b ℓ ν) (A (x) + \tilde{κ} β (x) cos θ_{ℓ}) .

(18)

Treating

\tilde{κ}

as a small perturbation we can integrate the distribution in Equation (18) with a phase-space dependent function f that would maximize statistical sensitivity of the integral to

\tilde{κ}

. It has been shown in Refs. [33,34] that such an optimal function should be the ratio of the

\tilde{κ}

-perturbation to the unperturbed distribution, in our case

f (x, cos θ_{ℓ}) = \frac{β (x)}{A (x)} cos θ_{ℓ}

. The optimal observable is thus

O_{opt .}^{W b \to t h} \equiv \frac{1}{σ} \int d x d cos θ_{ℓ} \frac{d^{2} σ}{d x d cos θ_{ℓ}} \frac{β (x)}{A (x)} cos θ_{ℓ} = \frac{1}{N} \sum_{i = 1}^{N} \frac{β (x_{i})}{A (x_{i})} cos θ_{ℓ, i},

(19)

where

θ_{ℓ}

is the angle between

\hat{s}

and the lepton momentum in the top rest frame, as defined in the preceding paragraph. The index

i = 1, \dots, N

labels individual events. The prediction scales as

⟨β^{2}⟩

,

O_{opt .}^{W b \to t h} = \frac{\tilde{κ}}{3} [\int d x \frac{{[β (x)]}^{2}}{A (x)}] / [\int d x A (x)],

(20)

where we have integrated over

cos θ_{ℓ}

and left the bounds for

x = cos θ

unspecified. The function

β (x)

is plotted in Figure 2.
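Equations (19) and (20) can be cross-checked on a toy model. The sketch below draws events from a distribution of the form of Equation (18), evaluates the event average of Equation (19), and compares it with the integral expression of Equation (20). The shapes chosen for $A (x)$ and $β (x)$ are hypothetical placeholders, not the analytic results for $W b \to t h$, and the overall factor $Φ_{in} Br$ is dropped since it cancels in the normalized observable.

```python
import numpy as np

def A(x):
    """Hypothetical unpolarized shape (placeholder, not the physical A(x))."""
    return 1.0 + 0.5 * x**2

def beta(x):
    """Hypothetical coefficient of the term linear in kappa-tilde (placeholder)."""
    return 0.3 * (1.0 - x)

def generate_events(kappa_t, n_proposals, rng):
    """Accept-reject sampling of (x, cos_theta) from A(x) + kappa_t * beta(x) * cos_theta."""
    x = rng.uniform(-1.0, 1.0, n_proposals)
    c = rng.uniform(-1.0, 1.0, n_proposals)
    density = A(x) + kappa_t * beta(x) * c
    f_max = 1.5 + abs(kappa_t) * 0.6         # max of A is 1.5, max of |beta| is 0.6
    keep = rng.random(n_proposals) * f_max < density
    return x[keep], c[keep]

def optimal_observable(x, c):
    """Event average of Eq. (19): mean of beta(x_i)/A(x_i) * cos(theta_i)."""
    return np.mean(beta(x) / A(x) * c)

def prediction(kappa_t, n_grid=20_000):
    """Eq. (20) on [-1, 1], using a midpoint rule (ratio of integrals = ratio of means)."""
    edges = np.linspace(-1.0, 1.0, n_grid + 1)
    xm = 0.5 * (edges[:-1] + edges[1:])
    return kappa_t / 3.0 * np.mean(beta(xm) ** 2 / A(xm)) / np.mean(A(xm))

rng = np.random.default_rng(2)
kappa_t = 0.5
x, c = generate_events(kappa_t, 400_000, rng)
O_mc = optimal_observable(x, c)
O_th = prediction(kappa_t)
print("event average:", O_mc, " Eq. (20):", O_th)
```

The two numbers should agree within the statistical uncertainty of the toy sample, which illustrates that the observable is linear in $\tilde{κ}$ with a slope set by $\int d x \, β^{2} / A$.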

Figure 2. Comparison of the

β (x)

(dashed and dotted) and

\tilde{β} (\tilde{x})

(full line) polarization functions at representative center-of-mass energies

\sqrt{s}

and two values of

κ

. We find that

\tilde{β} (\tilde{x})

is independent of

κ

[24].

To carry over the presented formalism to the realistic case of

p p

collisions, we have to adapt the beam axis by referring only to experimentally accessible momenta. Using the reconstructed top momentum

k

as a reference, we define the positive z-direction as the parallel top quark momentum projection

{\hat{k}}_{‖}

. The top quark is then by definition in the positive hemisphere,

\tilde{x} = cos \tilde{θ} \geq 0

, where

\tilde{θ}

is the angle between

k

and

{\hat{k}}_{‖}

. The optimal polarization direction with linear

\tilde{κ}

sensitivity now becomes

\hat{s} = {\hat{k}}_{‖} \times {\hat{k}}_{⊥}

upon which an experiment should measure the lepton angle

{\tilde{θ}}_{ℓ}

. The cross-section distributions in

\tilde{x}

and x are related via

\begin{matrix} \frac{d^{2} σ}{d \tilde{x} d cos {\tilde{θ}}_{ℓ}} & = {\frac{d^{2} σ}{d x d cos θ_{ℓ}}|}_{x = \tilde{x}, cos θ_{ℓ} = cos {\tilde{θ}}_{ℓ}} + {\frac{d^{2} σ}{d x d cos θ_{ℓ}}|}_{x = - \tilde{x}, cos θ_{ℓ} = - cos {\tilde{θ}}_{ℓ}} \\ = Φ_{in} \frac{Br (t \to b ℓ ν)}{2} [\tilde{A} (\tilde{x}) + \tilde{κ} cos {\tilde{θ}}_{ℓ} \tilde{β} (\tilde{x})], \end{matrix}

(21)

where

\begin{matrix} \tilde{A} (\tilde{x}) & \equiv A (\tilde{x}) + A (- \tilde{x}), \\ \tilde{β} (\tilde{x}) & \equiv β (\tilde{x}) - β (- \tilde{x}) . \end{matrix}

(22)

The

cos θ_{ℓ}

is flipped in the second term since for

\tilde{x} = - x

the polarization vector

\hat{s} = {\hat{k}}_{‖} \times {\hat{k}}_{⊥}

flips the direction compared to the previous definition,

\hat{s} \sim p \times k

. The optimal observable in this case is finally

\begin{matrix} {\tilde{O}}_{opt .}^{W b \to t h} & \equiv \frac{1}{σ} \int d \tilde{x} d cos {\tilde{θ}}_{ℓ} \frac{d^{2} σ}{d \tilde{x} d cos {\tilde{θ}}_{ℓ}} cos {\tilde{θ}}_{ℓ} \frac{\tilde{β} (\tilde{x})}{\tilde{A} (\tilde{x})} \\ = \frac{\tilde{κ}}{3} [\int d \tilde{x} \frac{{[\tilde{β} (\tilde{x})]}^{2}}{\tilde{A} (\tilde{x})}] / [\int d \tilde{x} \tilde{A} (\tilde{x})], \end{matrix}

(23)

where the optimal weight function is again taken to be the ratio of the

\tilde{κ}

-perturbation to the unperturbed distribution (see Equation (21)), in line with Refs. [33,34]. In terms of experimental N data points with reconstructed

{\tilde{θ}}_{ℓ}

and

\tilde{x}

the observable is obtained as the following average:

{\tilde{O}}_{opt .}^{W b \to t h} = \frac{1}{N} \sum_{i = 1}^{N} \frac{\tilde{β} ({\tilde{x}}_{i})}{\tilde{A} ({\tilde{x}}_{i})} cos {\tilde{θ}}_{ℓ, i} .

(24)

In the limit where

β (x) = - β (- x)

the observables are equal,

{\tilde{O}}_{opt .}^{W b \to t h} = O_{opt .}^{W b \to t h}

. However in general the

{\tilde{O}}_{opt .}^{W b \to t h}

is expected to result in a weaker statistical significance due to our inability to determine the direction of the top quark with respect to the initial W. In Figure 2 we show that

β (x)

is large at negative x and we have

\tilde{β} (\tilde{x}) \approx - β (- \tilde{x})

, for representative values of the center-of-mass energy

\sqrt{s}

.
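The loss of sensitivity caused by folding can be made explicit by comparing the coefficients of $\tilde{κ}$ in Equations (20) and (23). The sketch below evaluates both for hypothetical shapes of $A (x)$ and $β (x)$ (placeholders, not the physical functions), using the definitions of Equation (22). For any positive $A$ the folded coefficient cannot exceed the unfolded one, and the two coincide when $A$ is even and $β$ is odd in $x$.

```python
import numpy as np

def midpoints(lo, hi, n=20_000):
    """Midpoints of n equal bins on [lo, hi] for a simple midpoint-rule integral."""
    edges = np.linspace(lo, hi, n + 1)
    return 0.5 * (edges[:-1] + edges[1:])

def unfolded_sensitivity(A, beta):
    """Coefficient of kappa-tilde in Eq. (20), with x in [-1, 1]."""
    x = midpoints(-1.0, 1.0)
    return np.mean(beta(x) ** 2 / A(x)) / np.mean(A(x)) / 3.0

def folded_sensitivity(A, beta):
    """Coefficient of kappa-tilde in Eq. (23), with x-tilde in [0, 1]."""
    xt = midpoints(0.0, 1.0)
    A_tilde = A(xt) + A(-xt)                 # Eq. (22)
    beta_tilde = beta(xt) - beta(-xt)        # Eq. (22)
    return np.mean(beta_tilde ** 2 / A_tilde) / np.mean(A_tilde) / 3.0

def A_toy(x):
    return 1.0 + 0.5 * x**2                  # hypothetical, even in x

def beta_toy(x):
    return 0.3 * (1.0 - x)                   # hypothetical, no definite parity

def beta_odd(x):
    return -0.3 * x                          # hypothetical, odd in x

s_unfolded = unfolded_sensitivity(A_toy, beta_toy)
s_folded = folded_sensitivity(A_toy, beta_toy)
s_unfolded_odd = unfolded_sensitivity(A_toy, beta_odd)
s_folded_odd = folded_sensitivity(A_toy, beta_odd)
print("generic beta:", s_unfolded, s_folded)
print("odd beta:    ", s_unfolded_odd, s_folded_odd)
```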

2.2. Hadronic Process $p p \to t h j$

Here we demonstrate how to carry over the optimal

C P

-violating observable from parton level to the realistic case of

p p

collisions, but still neglecting reconstruction efficiencies and backgrounds. The parton level observable defined in Equation (23) can be adapted to this case with an additional integration over the parton distribution functions (PDFs). Since the hadronic cross section is a convolution of partonic cross sections it can be split into a

\tilde{κ}

-independent piece and the small perturbation proportional to

\tilde{κ}

, similar to the partonic cross section in Equation (21). Assuming that the Higgs decays into visible states, the missing

p_{T}

is only due to the neutrino originating from the top decay. Thus we can reconstruct the top quark momentum and kinematic quantities of Equation (21). For hadronic collisions one can express the cross section as

\frac{d^{2} σ^{p p \to t h j}}{d \tilde{x} d cos {\tilde{θ}}_{ℓ}} = A (\tilde{x}) + \tilde{κ} cos {\tilde{θ}}_{ℓ} B (\tilde{x}),

(25)

and weigh the events with the optimal

f_{opt .} \propto cos {\tilde{θ}}_{ℓ} B / A

. To demonstrate the procedure we have used the Monte Carlo event generator

MadGraph 5

[35,36] together with the Higgs Characterisation UFO model [37,38] (for an analysis of next-to-leading order QCD and next-to-next-to leading EW effects see Refs. [8,17,39], respectively) to incorporate the

κ

and

\tilde{κ}

couplings in the simulation of the

p p \to t (\to b ℓ ν) h j

signal. The procedure of extracting the optimal weight function

B / A

from MC simulations and using it to produce the optimal observable goes as follows:

1. Choose the bins for $\tilde{x}$ between ${\tilde{x}}_{\min} \geq 0$ and ${\tilde{x}}_{\max} \leq 1$.
2. Fix $\tilde{κ}$ and extract from the MC simulation the mean $⟨ cos {\tilde{θ}}_{ℓ} ⟩$ in each of the $\tilde{x}$ bins. By Equation (25) the obtained value equals $\frac{\tilde{κ}}{3} B / A$ in this bin, i.e., the weight $\frac{1}{3} B / A$ for $\tilde{κ} = 1$.
3. Use this information to weigh experimental events bin-by-bin with $f_{opt .} \propto cos {\tilde{θ}}_{ℓ} B / A$. The normalization of $f_{opt .}$ is fixed by the requirement $\int d \tilde{x} B / A = 1$.
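The three steps above can be sketched on a toy sample. In the code below a simple accept-reject generator stands in for the MadGraph simulation; its shapes for the two functions of Equation (25) are hypothetical placeholders, and all function names are illustrative rather than part of the original analysis. The weight is extracted at $\tilde{κ} = 1$ and then applied to an independent sample generated with a smaller coupling.

```python
import numpy as np

def toy_mc(kappa_t, n_proposals, rng):
    """Hypothetical stand-in for the MC sample: density A(x) + kappa_t * c * B(x)
    with placeholder shapes A(x) = 1 + x and B(x) = 0.5 * x on x in [0, 1]."""
    x = rng.uniform(0.0, 1.0, n_proposals)
    c = rng.uniform(-1.0, 1.0, n_proposals)
    density = (1.0 + x) + kappa_t * c * 0.5 * x
    f_max = 2.0 + 0.5 * abs(kappa_t)
    keep = rng.random(n_proposals) * f_max < density
    return x[keep], c[keep]

def bin_index(x, edges):
    """Index of the x-tilde bin for each event (the upper edge goes to the last bin)."""
    return np.clip(np.digitize(x, edges) - 1, 0, len(edges) - 2)

def extract_ratio(x, c, kappa_t, edges):
    """Step 2: B/A per bin from the mean of cos(theta), <c> = kappa_t * B / (3 A)."""
    idx = bin_index(x, edges)
    n_bins = len(edges) - 1
    sums = np.bincount(idx, weights=c, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return 3.0 * sums / counts / kappa_t

def normalize(ratio, edges):
    """Step 3 normalization: the binned integral of B/A over x-tilde equals one."""
    return ratio / np.sum(ratio * np.diff(edges))

def weighted_observable(x, c, weight, edges):
    """Step 3: event average of weight(x_i) * cos(theta_i)."""
    return np.mean(weight[bin_index(x, edges)] * c)

rng = np.random.default_rng(3)
edges = np.linspace(0.0, 1.0, 11)            # step 1: ten bins in x-tilde

x_mc, c_mc = toy_mc(1.0, 400_000, rng)       # "simulation" at kappa_t = 1
ratio = extract_ratio(x_mc, c_mc, 1.0, edges)
weight = normalize(ratio, edges)

x_data, c_data = toy_mc(0.2, 400_000, rng)   # "data" with a smaller coupling
O_data = weighted_observable(x_data, c_data, weight, edges)
print("extracted B/A per bin:", ratio)
print("weighted observable:  ", O_data)
```

The bin-by-bin mean is used here instead of a fit so that the sketch mirrors the procedure in the text; with finer bins the extracted ratio follows the input shape more closely at the cost of larger statistical fluctuations per bin.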

This optimization procedure is independent of the

\tilde{κ}

value as long as

\tilde{κ}

is sufficiently small. The resulting optimal weight

B / A

is shown in Figure 3, where we compare it for different final states (

t h j

or

\bar{t} h j

) and collision energies (14 or

27 TeV

). To assess the stability of the proposed method against higher order corrections we have extracted the weight function from simulations at NLO in QCD to estimate the systematic uncertainty associated with QCD effects and found that the difference is within 10% of the LO extraction. Finally, we can compare our optimized approach to the naïve

\tilde{κ}

extraction through the measurement of

O_{naïve} = ⟨cos θ_{ℓ}⟩

, which in turn corresponds to the case where the weight is independent of

\tilde{x}

, i.e.,

B / A = 1

. We define the signal significance of an observable

O

as

Sig (\tilde{κ}) = \frac{O (\tilde{κ})}{σ_{O} (\tilde{κ})} .

(26)
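Since the uncertainty of an event average scales as the per-event standard deviation divided by $\sqrt{N}$, the significance of Equation (26) per $\sqrt{N}$ is the per-event mean divided by the per-event standard deviation. The sketch below compares this quantity for the naïve and the weighted observable on a toy sample. The generator and its shape for $B / A$ are hypothetical placeholders chosen so that the weight varies strongly with $\tilde{x}$; they do not reproduce the simulated $p p \to t h j$ distributions.

```python
import numpy as np

def ratio_B_over_A(x):
    """Hypothetical B/A shape with strong x dependence (placeholder)."""
    return 0.8 * (x - 0.3)

def toy_events(kappa_t, n_proposals, rng):
    """Accept-reject sampling from the density 1 + kappa_t * c * B/A on [0,1] x [-1,1]."""
    x = rng.uniform(0.0, 1.0, n_proposals)
    c = rng.uniform(-1.0, 1.0, n_proposals)
    density = 1.0 + kappa_t * c * ratio_B_over_A(x)
    f_max = 1.0 + 0.56 * abs(kappa_t)        # max of |B/A| on [0, 1] is 0.56
    keep = rng.random(n_proposals) * f_max < density
    return x[keep], c[keep]

def significance_per_sqrt_n(values):
    """Per-event mean over per-event standard deviation, i.e. Sig / sqrt(N)."""
    return np.mean(values) / np.std(values)

rng = np.random.default_rng(4)
x, c = toy_events(0.5, 400_000, rng)

sig_naive = significance_per_sqrt_n(c)                      # weight independent of x
sig_opt = significance_per_sqrt_n(c * ratio_B_over_A(x))    # weight B/A
print("naive:    ", sig_naive)
print("optimized:", sig_opt)
```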

Figure 3. Comparison of the optimal weight

B / A

between the

p p \to t h j

and

p p \to \bar{t} h j

processes extracted from MC simulations (left). The right panel shows the comparison between 14 and

27 TeV

proton collision energies for

p p \to t h j

. All plots are obtained using

\tilde{κ} = 1

and with

10^{6}

MC events [24].

In Figure 4 we show the improvement of the significance when the optimal weight function is applied on simulated signal events without showering or reconstruction effects at 14 TeV.

Figure 4. Left: comparison of the optimized spin observable (blue dots) with the naïve observable (black dots) extracted from 3000

p p \to t h j, t \to b ℓ ν

MC events at each choice of

\tilde{κ}

. Right: comparison of the significance (defined as the mean value divided by the standard deviation) per

\sqrt{N}

of the two observables, where N is the number of events [24].

2.3. Limits in the $(κ, \tilde{κ})$ Plane from $p p \to t h j$ at Event Reconstruction Level

In order to make closer contact with experiments, we can also include the effects of parton showering, detector response and background processes. In our analysis we have used

MadGraph 5

to generate events at leading order (LO) in QCD for the signal process

p p \to t (\to b ℓ ν) h (\to b \bar{b}) j

plus the conjugate process with

\bar{t}

at 14 TeV High-Luminosity LHC (HL-LHC) and 27 TeV High-Energy LHC (HE-LHC) center-of-mass energies. (Note that our procedure of obtaining an optimal observable does not depend on the h decay products; this analysis should therefore be taken as a proof of concept with potential for future improvements using e.g., multiple h decay channels.) Event generation was performed for multiple values of (

κ

,

\tilde{κ})

. The parton level events were subsequently showered and hadronized with

Pythia 8

[40], and jets are clustered with the anti-

k_{T}

algorithm using

FastJet

[41]. For detector simulation and final state object reconstruction (e.g., lepton isolation and b-tagging) we used

Delphes v3.3.3

[42] with the default ATLAS parameters in

delphes_card_ATLAS.tcl

. The dominant background process in this analysis is

t \bar{t}

production with additional associated jets. We included this background by generating

p p \to t \bar{t}

samples, with one of the tops decayed into the semi-leptonic channel and the other one decayed into the hadronic channel, produced in association with 0, 1, and 2 hard jets. In order to correctly model the hard jets’ distributions, we merged the matrix element computations with the MC shower using the

MLM

[43] prescription. For the event selection we demand the following basic requirements:

Exactly 3 b-tagged jets with $| η (b) | < 5$ and $p_{T} (b) > 20$ GeV,
One additional (non-tagged) light jet exclusively in the forward direction with $2 < | η (j) | < 5$ and $p_{T} (j) > 20$ GeV,
One isolated light lepton $ℓ^{\pm} = e^{\pm}, μ^{\pm}$ with $| η (ℓ) | < 2.5$ and $p_{T} (ℓ) > 10$ GeV.

In addition, we further select events with one reconstructed Higgs and one reconstructed top quark as follows: first, we calculate the three possible invariant masses from the three reconstructed b-jets (

m_{b b}

) and only keep the event if at least one

b b

pair satisfies

| m_{h} - m_{b b} | < 15

GeV. For such events, we select as the Higgs decay candidate

h \to b \bar{b}

the pair of b-jets with the invariant mass closest to the Higgs mass. The remaining non-Higgs b-jet is then assumed to come from the top-quark decay. Next, we reconstruct the top-quark by requiring that the combined invariant mass

m_{b l ν}

of the remaining b-jet, the lepton, and the neutrino (also reconstructed by assuming it to be the unique source of missing energy in the event) falls inside the mass window of the top-quark defined by

m_{t} \pm 35

GeV. In order to further reject the

t \bar{t}

backgrounds, events with a reconstructed Higgs and top are selected if the combined invariant mass of the b-jets originating from the Higgs and the light jet satisfies the cut

m_{b b j} > 280

GeV [44]. The final selection efficiency for the

t h j

signal in the SM is

0.32 %

(

0.23 %

), while for the background it is

0.008 %

(

0.006 %

) at 14 TeV (27 TeV).
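The Higgs candidate selection and the $m_{b b j}$ cut described above can be sketched as follows. The code assumes four-momenta stored as `(E, px, py, pz)` arrays in GeV and a Higgs mass of 125 GeV, which is an assumed input not quoted in the text; the function names and the example momenta are hypothetical. Neutrino and top-quark reconstruction are not included.

```python
import itertools
import numpy as np

M_H = 125.0  # GeV, assumed Higgs mass used for the mass window

def inv_mass(*p4s):
    """Invariant mass of the sum of four-momenta given as (E, px, py, pz)."""
    total = np.sum(p4s, axis=0)
    m2 = total[0] ** 2 - total[1:] @ total[1:]
    return np.sqrt(max(m2, 0.0))

def select_higgs_pair(bjets, window=15.0):
    """Among three b-jets, pick the pair with invariant mass closest to M_H,
    provided it lies within the window. Returns ((i, j), k), where k is the
    index of the remaining b-jet assigned to the top decay, or None."""
    best = None
    for i, j in itertools.combinations(range(3), 2):
        dm = abs(inv_mass(bjets[i], bjets[j]) - M_H)
        if dm < window and (best is None or dm < best[0]):
            best = (dm, i, j)
    if best is None:
        return None
    _, i, j = best
    k = ({0, 1, 2} - {i, j}).pop()
    return (i, j), k

def passes_mbbj(bjets, pair, light_jet, cut=280.0):
    """Cut on the invariant mass of the Higgs b-jet pair plus the light jet."""
    return inv_mass(bjets[pair[0]], bjets[pair[1]], light_jet) > cut

# Hypothetical example event: two back-to-back massless b-jets from a Higgs at rest,
# a third b-jet, and a forward light jet.
bjets = [
    np.array([62.5, 0.0, 0.0, 62.5]),
    np.array([62.5, 0.0, 0.0, -62.5]),
    np.array([50.0, 50.0, 0.0, 0.0]),
]
light_jet = np.array([300.0, 0.0, 0.0, 300.0])

pair, top_b = select_higgs_pair(bjets)
keep_event = passes_mbbj(bjets, pair, light_jet)
print("Higgs pair:", pair, " top b-jet:", top_b, " passes m_bbj cut:", keep_event)
```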

As we fully reconstruct the

t h

system and have access to the lepton momentum from the top decay we have all the necessary information for measuring the optimized spin observable. We use the optimal weight function

B / A

(Figure 3) extracted from the MC simulations to construct a

χ^{2}

with an appropriately weighted signal process. Our results for

p p \to t h j

generated in the SM are given by the

2 σ

exclusion limits (shaded blue) shown in Figure 5 for the HE-LHC at a luminosity of

15 {ab}^{- 1}

. As can be seen in Equation (23) the observable

{\tilde{O}}_{opt .}

is normalized to the cross section, which contains terms

κ^{2}

,

{\tilde{κ}}^{2}

, as well as a linear term in

κ

and a constant term due to the second diagram in Figure 1, whereas the numerator is proportional to

\tilde{κ} (κ + c)

. The behaviour of

{\tilde{O}}_{opt .}

close to the SM point is thus linear in

\tilde{κ}

, whereas the cross section has a minimum in

κ

close to

κ = 1

. In the large coupling regime

{\tilde{O}}_{opt .}

converges to a small value which depends on the direction in which we make the limit

κ^{2} + {\tilde{κ}}^{2} \to \infty

. The

2 σ

exclusion has an elliptic shape as shown by the blue contour in Figure 5. We also present the limit (given by the black elliptic contour) assuming a

2 σ

positive excess above the SM expectation corresponding to a measurement of the optimized spin observable of

{\tilde{O}}_{opt .} = 0.06 \pm 0.03

whose size and error are statistics-driven. Because of the nature of our observable, the signed fluctuation gives rise to asymmetric limits in the

\tilde{κ}

direction. In the

κ

direction the bounds are also not symmetric as

p p \to t h j

production contains linear

κ

terms. Finally, in order to include background effects, the same statistical analysis would have to be repeated including the

t \bar{t}

background in the

χ^{2}

fit. However, even with a large background rejection as implemented above, the irreducible background is simply too large and the signal is completely diluted leading to a signal significance of only

S / \sqrt{B} \sim 0.8 (3.2)

at 14 TeV (27 TeV) at a luminosity of

3 {ab}^{- 1}

(

15 {ab}^{- 1}

). This effectively precludes any meaningful extraction of bounds on

\tilde{κ}

from a fit to

{\tilde{O}}_{opt .}

. We leave the possibility of further optimizing the cuts in order to reduce the backgrounds or including other Higgs decay channels as a future challenge. In the remainder of this review we instead focus on the related but more abundant process of associated top quark pair and Higgs boson production.

Figure 5. Bounds in the

(κ, \tilde{κ})

plane using the optimized observable

{\tilde{O}}_{opt .}

for the single-top associated production with a Higgs boson. The blue shaded region corresponds to the

2 σ

(

χ^{2} > 6.18

) exclusion zone assuming the measurement of the SM at the HE-LHC (15 ab

^{- 1}

). The black line and stripes show the

2 σ

excluded region for a

2 σ

positive fluctuation at the HE-LHC (see text for details) [24].
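The threshold $χ^{2} > 6.18$ quoted in the caption of Figure 5 can be reproduced under two assumptions that are not spelled out in the text: the $χ^{2}$ has two degrees of freedom, one for each of $(κ, \tilde{κ})$, and "$2 σ$" denotes the two-sided Gaussian coverage of about 95.45%. For two degrees of freedom the cumulative $χ^{2}$ distribution is $1 - e^{- χ^{2} / 2}$, which can be inverted in closed form. The function name below is illustrative.

```python
import math

def chi2_threshold_2dof(n_sigma):
    """Chi-square value for two degrees of freedom whose coverage equals the
    two-sided Gaussian coverage of n_sigma standard deviations."""
    coverage = math.erf(n_sigma / math.sqrt(2.0))
    return -2.0 * math.log(1.0 - coverage)

threshold_2sigma = chi2_threshold_2dof(2.0)
print("2 sigma threshold for 2 dof:", round(threshold_2sigma, 2))
```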

3. CP-Odd Observables in the $pp \to t \bar{t} h$ Process

In this Section we introduce laboratory-frame accessible and phase-space optimized

C P

-odd observables in the process

p p \to t \bar{t} h

in which both top quarks decay semi-leptonically:

t \to b l^{+} ν

and

\bar{t} \to \bar{b} l^{-} \bar{ν}

. This process has a higher S/B ratio compared to

p p \to t h j

and has indeed been measured by the LHC collaborations [45,46]. (For state-of-the-art predictions of the differential distributions see e.g., Ref. [39].) The top quarks in

p p \to t \bar{t} h

are known to be unpolarized, independent of the value of

\tilde{κ}

[2]. The information on the underlying

κ

and

\tilde{κ}

parameters is in principle contained in the correlations among the top quark spins, however due to the experimental difficulties of extracting the top quark polarizations in this process, this approach is unfeasible. This is the reason we focus in this Section on manifestly

C P

-odd observables, directly sensitive to

\tilde{κ}

and easily accessible by measurements of final-state momenta in the lab frame.

3.1. Laboratory Frame $C P$ -Odd Observables in $t \bar{t} h$ Production

The accessible final-state momenta [6] of this decay in the lab frame are the 3-momenta of the b-jet

p_{b}

,

\bar{b}

-jet

p_{\bar{b}}

, 3-momenta of leptons

p_{ℓ^{+}}

and

p_{ℓ^{-}}

, and the 3-momentum of the Higgs

p_{h}

. In practice, differentiating between the b-jet and the $\bar{b}$-jet is difficult (for recent attempts at extracting the charge of the b-jet see References [47,48,49,50]); therefore we should consider only observables that are invariant under the $p_{b} \leftrightarrow p_{\bar{b}}$ transformation. We construct these observables from 3-vector quantities with well-defined C and P eigenvalues, which are given in Table 1.

Table 1. Vector quantities with well-defined C and P eigenvalues in 3 dimensional Euclidean space. More complicated objects with well-defined C and P eigenvalues can be constructed using variables in this table.

We use quantities in Table 1 to construct

C P

-odd variables

ω

of mass-dimension 5, that are C-even and invariant under

p_{b} \leftrightarrow p_{\bar{b}}

transformation. In order to systematically obtain all distinct

ω

’s we proceed as follows. First, we construct variables of the form

ω \sim (V_{1} \times V_{2} \cdot V_{3}) (V_{4} \cdot V_{5})

(Notice that the possibility of a nested cross product

(((V_{1} \times V_{2}) \times V_{3}) \times V_{4}) \cdot V_{5}

can also be reduced to this form) using

V_{j} \in \{p_{h}, p_{ℓ^{-}} + p_{ℓ^{+}}, p_{ℓ^{-}} - p_{ℓ^{+}}, p_{b} + p_{\bar{b}}, p_{b} - p_{\bar{b}}\}

for

j \in {1, . . ., 5}

. Doing so, we find 150 potential quintuple products. We symmetrize them with respect to C-conjugation and

p_{b} \leftrightarrow p_{\bar{b}}

transformation. The non-zero quintuple products are

ω

variables, however they may be linearly dependent. Indeed some of the obtained

ω

’s are connected via the following Euclidean identity

δ_{a b} ϵ_{c d e} - δ_{a c} ϵ_{d e b} + δ_{a d} ϵ_{e b c} - δ_{a e} ϵ_{b c d} = 0,

(27)

which can be written as:

a (b \times c \cdot d) - b (c \times d \cdot a) + c (d \times a \cdot b) - d (a \times b \cdot c) = 0,

(28)

where

a

,

b

,

c

and

d

are four arbitrary vectors in 3 dimensional Euclidean space. The sign of individual terms in the last expression corresponds to the sign of the cyclic permutation of the four vectors.
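As a quick numerical cross-check of the counting above and of Equation (28), the following sketch (plain Python, standard library only) enumerates the candidate products and evaluates the left-hand side of the identity for random vectors. The counting convention used here, an unordered choice of three distinct vectors for the mixed product times an unordered pair (repetition allowed) for the scalar product, is an assumption of this sketch; it reproduces the quoted number of 150. The helper names and the vector labels are illustrative only.

```python
import itertools
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triple(u, v, w):
    """Mixed product (u x v) . w."""
    return dot(cross(u, v), w)


def identity_residual(a, b, c, d):
    """Left-hand side of Eq. (28); expected to vanish for any four 3-vectors."""
    coeffs = (triple(b, c, d), -triple(c, d, a), triple(d, a, b), -triple(a, b, c))
    vectors = (a, b, c, d)
    return tuple(sum(k * v[i] for k, v in zip(coeffs, vectors)) for i in range(3))


# Assumed counting: 3 distinct vectors in the mixed product,
# any unordered pair (repetition allowed) in the scalar product.
labels = ["p_h", "l_sum", "l_diff", "b_sum", "b_diff"]
mixed = list(itertools.combinations(labels, 3))
scalar = list(itertools.combinations_with_replacement(labels, 2))
n_products = len(mixed) * len(scalar)

rng = random.Random(0)
vecs = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(4)]
residual = identity_residual(*vecs)
print(n_products, residual)
```

The residual is zero up to floating-point rounding for any choice of the four vectors, which is what allows linearly dependent quintuple products to be eliminated.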

The first class of

ω

’s involves

p_{ℓ^{+}}

and

p_{ℓ^{-}}

in the mixed product and

p_{ℓ^{-}} - p_{ℓ^{+}}

in the scalar product. Both products are invariant under

p_{b} \leftrightarrow p_{\bar{b}}

:

\begin{matrix} ω_{1} & \sim [(p_{ℓ^{-}} \times p_{ℓ^{+}}) \cdot p_{h}] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot p_{h}], \end{matrix}

(29)

\begin{matrix} ω_{2} & \sim [(p_{ℓ^{-}} \times p_{ℓ^{+}}) \cdot p_{h}] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})], \end{matrix}

(30)

\begin{matrix} ω_{3} & \sim [(p_{ℓ^{-}} \times p_{ℓ^{+}}) \cdot p_{h}] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})], \end{matrix}

(31)

\begin{matrix} ω_{4} & \sim [(p_{ℓ^{-}} \times p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot p_{h}], \end{matrix}

(32)

\begin{matrix} ω_{5} & \sim [(p_{ℓ^{-}} \times p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})], \end{matrix}

(33)

\begin{matrix} ω_{6} & \sim [(p_{ℓ^{-}} \times p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})] . \end{matrix}

(34)

The second class involves

p_{b} \times p_{\bar{b}}

and/or

p_{b} - p_{\bar{b}}

in both mixed and scalar products:

\begin{matrix} ω_{7} & \sim [(p_{b} \times p_{\bar{b}}) \cdot p_{h}] [(p_{b} - p_{\bar{b}}) \cdot p_{h}], \end{matrix}

(35)

\begin{matrix} ω_{8} & \sim [(p_{b} \times p_{\bar{b}}) \cdot p_{h}] [(p_{b} - p_{\bar{b}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})], \end{matrix}

(36)

\begin{matrix} ω_{9} & \sim [(p_{b} \times p_{\bar{b}}) \cdot p_{h}] [(p_{b} - p_{\bar{b}}) \cdot (p_{b} + p_{\bar{b}})], \end{matrix}

(37)

\begin{matrix} ω_{10} & \sim [(p_{b} \times p_{\bar{b}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})] [(p_{b} - p_{\bar{b}}) \cdot p_{h}], \end{matrix}

(38)

\begin{matrix} ω_{11} & \sim [(p_{b} \times p_{\bar{b}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})] [(p_{b} - p_{\bar{b}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})], \end{matrix}

(39)

\begin{matrix} ω_{12} & \sim [(p_{b} \times p_{\bar{b}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})] [(p_{b} - p_{\bar{b}}) \cdot (p_{b} + p_{\bar{b}})], \end{matrix}

(40)

\begin{matrix} ω_{13} & \sim [(p_{b} \times p_{\bar{b}}) \cdot (p_{ℓ^{-}} - p_{ℓ^{+}})] [(p_{b} - p_{\bar{b}}) \cdot (p_{ℓ^{-}} - p_{ℓ^{+}})], \end{matrix}

(41)

\begin{matrix} ω_{14} & \sim [(p_{ℓ^{-}} \times p_{ℓ^{+}}) \cdot (p_{b} - p_{\bar{b}})] [(p_{b} - p_{\bar{b}}) \cdot (p_{ℓ^{-}} - p_{ℓ^{+}})] . \end{matrix}

(42)

The third class involves the mixed product of

p_{h}, p_{ℓ^{-}} \pm p_{ℓ^{+}}

, and

p_{b} \pm p_{\bar{b}}

:

\begin{matrix} ω_{15} & \sim [p_{h} \times (p_{ℓ^{-}} + p_{ℓ^{+}}) \cdot (p_{b} - p_{\bar{b}})] [(p_{b} - p_{\bar{b}}) \cdot p_{h}], \end{matrix}

(43)

\begin{matrix} ω_{16} & \sim [p_{h} \times (p_{ℓ^{-}} + p_{ℓ^{+}}) \cdot (p_{b} - p_{\bar{b}})] [(p_{b} - p_{\bar{b}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})], \end{matrix}

(44)

\begin{matrix} ω_{17} & \sim [p_{h} \times (p_{ℓ^{-}} + p_{ℓ^{+}}) \cdot (p_{b} - p_{\bar{b}})] [(p_{b} - p_{\bar{b}}) \cdot (p_{b} + p_{\bar{b}})], \end{matrix}

(45)

\begin{matrix} ω_{18} & \sim [p_{h} \times (p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot p_{h}], \end{matrix}

(46)

\begin{matrix} ω_{19} & \sim [p_{h} \times (p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{ℓ^{-}} + p_{ℓ^{+}})], \end{matrix}

(47)

\begin{matrix} ω_{20} & \sim [p_{h} \times (p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})], \end{matrix}

(48)

\begin{matrix} ω_{21} & \sim [p_{h} \times (p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} - p_{\bar{b}})] [(p_{ℓ^{-}} - p_{ℓ^{+}}) \cdot (p_{b} - p_{\bar{b}})] . \end{matrix}

(49)

There are further possibilities with a mixed product

p_{h} \times (p_{ℓ^{-}} + p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}})

multiplied by one of 8 C-even,

b \leftrightarrow \bar{b}

even scalar products

\{p_{h} \cdot p_{h}, p_{h} \cdot (p_{ℓ^{-}} + p_{ℓ^{+}}), p_{h} \cdot (p_{b} + p_{\bar{b}}), (p_{ℓ^{-}} + p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}}), {(p_{ℓ^{-}} \pm p_{ℓ^{+}})}^{2}, {(p_{b} \pm p_{\bar{b}})}^{2}\}

. Since the mixed product itself already has the desired symmetry properties, these 8 quintuple products carry no new information beyond the mixed (triple) product itself, which is our final observable:

ω_{22} \sim p_{h} \times (p_{ℓ^{-}} + p_{ℓ^{+}}) \cdot (p_{b} + p_{\bar{b}}) .

(50)

Note that there are nonlinear relations between

ω

’s, such as

ω_{1} ω_{6} = ω_{3} ω_{4}

, which we do not exploit to reduce the set further. The reason is that ratios of

ω

’s can contain singularities in the available phase-space and as such would be difficult to reconstruct by a neural network optimizer.

All the

ω

’s are normalized by the lengths of the vectors that enter as factors in the scalar products,

ω_{i} = \frac{[V_{1} \times V_{2} \cdot V_{3}] [V_{4} \cdot V_{5}]}{| V_{1} \times V_{2} | | V_{3} | | V_{4} | | V_{5} |},

(51)

and the upper bound

| ω_{i} | \leq 1

is generally valid. For cases when

ω_{i}

has a vector

a

present both in the mixed and scalar products, e.g.,

(V_{1} \times V_{2} \cdot a) (V_{3} \cdot a)

, and furthermore with

V_{1} \times V_{2} \cdot V_{3} = 0

, a stricter upper bound

| ω_{i} | \leq 1 / 2

applies (for

ω_{1, 6, 7, 11, 13, 14, 15, 16, 20, 21}

).
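The normalization in Equation (51) and the two upper bounds can be illustrated with a short numerical sketch. It evaluates one variable with a shared vector (the structure of ω_1) and one without (the structure of ω_3) on random 3-vectors. The momenta are drawn from an isotropic Gaussian purely for illustration and do not represent physical kinematics; the function names are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(dot(u, u))


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def omega(v1, v2, v3, v4, v5):
    """Normalized quintuple product of Eq. (51)."""
    normal = cross(v1, v2)
    denom = norm(normal) * norm(v3) * norm(v4) * norm(v5)
    return dot(normal, v3) * dot(v4, v5) / denom


rng = random.Random(1)


def random_momentum():
    # Toy input: isotropic Gaussian 3-vector, not a simulated event.
    return tuple(rng.gauss(0.0, 1.0) for _ in range(3))


max_shared = 0.0   # omega_1-like: p_h appears in both factors
max_generic = 0.0  # omega_3-like: no shared vector
for _ in range(20000):
    p_lm, p_lp, p_h, p_b, p_bbar = (random_momentum() for _ in range(5))
    l_diff = sub(p_lm, p_lp)
    b_sum = add(p_b, p_bbar)
    w1 = omega(p_lm, p_lp, p_h, l_diff, p_h)
    w3 = omega(p_lm, p_lp, p_h, l_diff, b_sum)
    max_shared = max(max_shared, abs(w1))
    max_generic = max(max_generic, abs(w3))

print(max_shared, max_generic)
```

For the ω_1-like case the unit vectors along p_{ℓ^-} × p_{ℓ^+} and p_{ℓ^-} - p_{ℓ^+} are orthogonal, so the product of the two projections of the unit vector along p_h cannot exceed 1/2.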

Having constructed manifestly

C P

-odd variables

ω

, we now show how they can be used to extract information on

\tilde{κ}

in an optimal way. The

t \bar{t} h

production cross section can be written as

\frac{d σ}{d x d ω_{i}} = A (x, | ω_{i} |) + \tilde{κ} κ B (x, ω_{i}),

(52)

where

x

are

C P

-even phase space variables, A is a manifestly

C P

-even and B is a manifestly

C P

-odd function of

ω_{i}

:

B (x, ω_{i}) = - B (x, - ω_{i})

. The

κ \tilde{κ}

dependence is due to the interference of scalar and pseudoscalar amplitudes. A simple

C P

-odd observable is an average of a single variable

ω_{i}

O_{i} = \frac{1}{σ} \int d x d ω_{i} \frac{d σ}{d x d ω_{i}} ω_{i} = \frac{1}{N} \sum_{j = 1}^{N} ω_{i}^{(j)},

(53)

where N is the number of experimental events. The standard deviation

σ_{i}

of such an observable is given by

σ_{i}^{2} = \frac{1}{N} [\frac{1}{N} \sum_{j} {(ω_{i}^{(j)})}^{2} - O_{i}^{2}] .

(54)

For large enough N the distribution of

O_{i}

is approximately Gaussian and the corresponding significance of this observable is:

{Sig}_{i} \equiv \frac{O_{i}}{σ_{i}} = \sqrt{N} \frac{O_{i}}{\sqrt{\frac{1}{N} \sum_{j} {(ω_{i}^{(j)})}^{2} - O_{i}^{2}}} .

(55)
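Equations (53)–(55) translate directly into a sample estimator. The sketch below is a minimal illustration using only the Python standard library; the function name `significance` and the toy sample (a uniform distribution with a small offset) are assumptions made for the example and are not taken from the analysis.

```python
import math
import random


def significance(samples):
    """Return (O_i, sigma_i, Sig_i) for a list of per-event omega values."""
    n = len(samples)
    mean = sum(samples) / n                          # O_i, Eq. (53)
    mean_sq = sum(w * w for w in samples) / n
    sigma = math.sqrt((mean_sq - mean * mean) / n)   # sigma_i, Eq. (54)
    return mean, sigma, mean / sigma                 # Sig_i, Eq. (55)


# Toy sample: symmetric noise plus a small CP-odd shift (illustrative only).
rng = random.Random(2)
toy = [rng.uniform(-0.5, 0.5) + 0.01 for _ in range(100_000)]
O, sigma, sig = significance(toy)
print(O, sigma, sig)
```

Because σ_i scales as 1/\sqrt{N} at fixed per-event spread, the significance of a non-zero asymmetry grows as \sqrt{N}.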

Equation (55) holds for

ω_{1}, . . ., ω_{22}

and their linear combinations. By studying the behavior of all 22 such observables

O_{i}

on MC simulations of

p p \to t \bar{t} h

with semi-leptonically decaying top quarks we have found that

O_{6}

and

O_{14}

are the most promising in terms of their significance. Our next aim is to show how one can achieve better sensitivity to

\tilde{κ}

by introducing optimized observables in two directions: the phase-space optimization of a single

ω_{i}

and a construction of an optimal

C P

-odd observable as a combination of all available

ω

’s.

3.2. NN-Based Optimized $C P$ -Odd Observables

Due to the high dimensionality and complexity of the phase-space in the

t \bar{t} h

process with top quarks decaying semileptonically, we rely on neural networks (NN) trained on Monte-Carlo generated samples to efficiently parametrize the weight function of events across the multi-dimensional phase-space in order to maximize the statistical sensitivity to

\tilde{κ}

. We note that existing general-purpose ML inference tools are already able to optimize the sensitivity to a given parameter (see e.g., Ref. [51]). The purpose of our work is to do so in an economical way that manifestly respects the symmetries of the problem.

We implemented the training and evaluation of neural networks using the

TensorFlow

framework [52]. In all cases, we used a sample of

10^{7}

p p \to h t (\to b ℓ^{+} ν) \bar{t} (\to \bar{b} ℓ^{-} \bar{ν})

events generated at LO using

Madgraph 5

[35] together with the Higgs Characterisation UFO model [37,38] with

κ = \tilde{κ} = 1

. We split the sample into separate training (7.5 M) and test (2.5 M) samples. After training at fixed

\tilde{κ} = 1

we also tested the observables at other values of

\tilde{κ}

and

κ

, both ranging from

- 1

to 1. In these tests 1 M events were used. Unless stated otherwise, the results are shown for events in

p p

collisions at 14 TeV. We randomly initialized the neural network weights using the default Glorot uniform initializer and used the Adam optimizer with a custom varying learning rate

l (e) = l (e - 1) / (1 + 0.8^{e})

where e is the current epoch and the initial learning rate is set to

0.1

. We use

relu

for the activation function. We trained all networks using the novel loss function

loss (α) = {(\frac{mean (F (X; α))}{std (F (X; α)) / \sqrt{N}})}^{- 2},

(56)

where the

mean ()

and the standard deviation

std ()

are to be calculated over all events in the sample. The loss corresponds to the inverse of the significance-squared of the observable

mean (F (X; α))

that should be minimized in order to achieve optimal statistical sensitivity. Here N is the size of the event sample,

α

are the free neural network weights and biases and

X

stands for the values of

C P

-even and/or

C P

-odd phase-space variables in the given event. We emphasize that such a choice of the loss function is specific to the problem at hand: we use the optimization procedure to directly maximize the significance of each considered observable. We avoid over-fitting of the training sample by stopping the training once at least 30 epochs have passed and one of the following two criteria is satisfied: either the running average of 20 training losses saturates to within

0.5 %

or the running average of 20 test losses increases for 5 epochs in a row. We keep a model history and in the end choose the best model in terms of test loss. In practice we found that the first condition terminates the training loop in most cases, and the best model is usually the one from the final epoch of training. In order to determine the optimal NN architecture we performed a scan over a set of possible NN configurations with up to 2 hidden layers and up to 9 nodes per NN layer. (We have also considered an automated algorithm for determining the optimal NN architecture, namely Hyperopt [53]; see also Ref. [54] for one of its recent uses. Here we instead present results of manual scans over a set of possible NN configurations in order to have better control over the NN parameters. We found the results of both approaches comparable. We chose this cutoff for presentation purposes; however, we have checked that our results do not change significantly when using larger networks, namely up to three hidden layers of 30 nodes each.)
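The learning-rate schedule and the loss of Equation (56) can be written down in a few lines. The following is a framework-independent sketch in plain Python rather than the TensorFlow code used for the actual training. Two conventions are assumptions of the sketch: the recursion is taken to start from l(0) = 0.1 with the first update at e = 1, and std() is taken to be the population standard deviation. The function names are hypothetical.

```python
def learning_rate(epoch, initial=0.1):
    """l(e) = l(e-1) / (1 + 0.8**e), assuming l(0) = initial."""
    lr = initial
    for e in range(1, epoch + 1):
        lr /= 1.0 + 0.8 ** e
    return lr


def significance_loss(outputs):
    """Eq. (56): inverse squared significance of mean(F) over the sample."""
    n = len(outputs)
    mean = sum(outputs) / n
    var = sum((f - mean) ** 2 for f in outputs) / n  # population variance (assumed)
    # (mean / (std / sqrt(N)))**(-2) == (var / N) / mean**2
    return (var / n) / mean ** 2


schedule = [learning_rate(e) for e in range(6)]
toy_loss = significance_loss([1.0, 2.0, 3.0, 4.0])
print(schedule, toy_loss)
```

Two properties are worth noting. The factor 1 + 0.8^{e} tends to 1, so the learning rate decreases quickly at first and then levels off at a finite value instead of decaying to zero. The loss is invariant under an overall rescaling of the network output, so only the shape of F(X; α) is optimized.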

3.2.1. Phase-Space Optimization of a Single $ω$

First we present the optimization of the

ω_{6}

and

ω_{14}

variables based on phase-space averaging. We do not follow the optimization procedure based on separating A and B in Equation (52) since this would require cumbersome multidimensional binning [33]. We use a vector of easily accessible

C P

-even Mandelstam variables

x

:

x = (\begin{matrix} (p_{ℓ^{+}} + p_{ℓ^{-}}) \cdot p_{h} \\ (p_{ℓ^{+}} + p_{ℓ^{-}}) \cdot (p_{b} + p_{\bar{b}}) \\ (p_{b} + p_{\bar{b}}) \cdot p_{h} \\ p_{ℓ^{+}} \cdot p_{ℓ^{-}} \\ p_{b} \cdot p_{\bar{b}} \end{matrix}) .

(57)

Our goal is to find the optimal

C P

-even weight function

f (x; α)

, which should be used to calculate the weighted average of

ω_{i}

. The function f takes

C P

-even quantities

x

as inputs, therefore we expect its dependence on

\tilde{κ}

to be of the form

f (x; α) = C (x; α) + {\tilde{κ}}^{2} D (x; α) + O ({\tilde{κ}}^{4}) .

(58)

Using (52) we can now express the observable

\begin{matrix} ⟨ f (x; α) ω_{i} ⟩ & = \frac{1}{σ} \int \frac{d σ}{d x d ω_{i}} f (x; α) ω_{i} d x d ω_{i} \\ & = \frac{\tilde{κ} κ}{σ} \int B (x, ω_{i}) C (x; α) ω_{i} d x d ω_{i} + \frac{{\tilde{κ}}^{3} κ}{σ} \int B (x, ω_{i}) D (x; α) ω_{i} d x d ω_{i} + O ({\tilde{κ}}^{5}), \end{matrix}

(59)

with the usual definition of the average (The phase space average of a function is defined as

⟨ # ⟩ \equiv \frac{1}{σ} \int \frac{d σ}{d x d ω} # d x d ω

). The integration region in

x - ω_{i}

space is symmetric with respect to the transformation

ω_{i} \to - ω_{i}

, therefore integrals of the arguments which are anti-symmetric in

ω_{i}

vanish. For this reason all contributions to the expected value of the observable

⟨ f (x; α) ω_{i} ⟩

are proportional to odd powers of

\tilde{κ}

. The large dimensionality of the phase-space suggests the parameterisation of the function

f (x; α)

by means of an appropriate NN. In terms of the loss function (56) we have

F (x, ω_{i}; α) = f (x; α) ω_{i}

.

To understand the impact of different neural network architectures, we performed a manual scan over a set of network configurations. The input layer has 5 nodes (one for each

x

component) and the output layer has one node resulting in a scalar

f (x; α)

. We studied networks with a single hidden layer of 1–9 nodes and networks with two hidden layers of 1–9 nodes each, constraining the number of nodes in the second hidden layer to be at most equal to the number of nodes in the first. The converged test losses for 50 different random weight initializations per configuration are shown in Figure 6 as the purple box plot for the case of

ω_{6}

and the orange box plot for the case of

ω_{14}

. The plain

ω_{6}

-based observable is shown in gray, with the dashed lines denoting its

1 σ

statistical uncertainty, while the same holds true for plain

ω_{14}

in black. We found that the phase-space optimization of

ω_{6}

gives a noticeable improvement over plain

ω_{6}

when using a large enough network. Interestingly,

ω_{14}

seems to be close to optimal on its own, as the phase space optimization does not introduce noticeable improvement. Moreover, the optimized

ω_{6}

gives a similar performance to

ω_{14}

, hinting that we have reached maximal performance achievable with a single

ω

.

Figure 6. A scan in terms of the test loss (sample size 2.5M,

κ = \tilde{κ} = 1

) over neural network configurations with one (upper plot) or two (lower plot) hidden layers for the phase-space optimized

ω_{6}

and

ω_{14}

(59) shown in the purple and orange box plot and the generalized

F (ω; α)

(Section 3.2.2) shown in the blue box plot. The spread in all cases corresponds to 50 different random weight initializations per configuration. For comparison the plain

ω_{6}

and

ω_{14}

are shown in gray and black with the dashed lines showing their

1 σ

statistical uncertainty. The first order approximation of

F (ω; α)

, defined in Equation (60), is shown in red as described in Section 3.2.3 [27].

To test how well the resulting networks generalize to other values of

\tilde{κ}

we used the 50 converged

{9, 9}

models to calculate the dependence of the resulting observable significances with respect to

\tilde{κ}

on the aforementioned fresh sets of 1 M events per

\tilde{κ}

. This is shown in Figure 7, where a consistent improvement over simple

⟨ ω_{6} ⟩

can be seen at all considered

\tilde{κ}

, while again a marginal improvement is confirmed for

ω_{14}

, with the optimized

ω_{6}

hovering around

ω_{14}

.

Figure 7. Comparison of the significances (defined as the mean value divided by the standard deviation) of all the observables considered in this work with respect to

\tilde{κ}

and at fixed

κ = 1

. The results correspond to 1M events per

\tilde{κ}

at 14 TeV. Plain

ω_{6}

in gray,

ω_{14}

in black, phase-space optimized

ω_{6}

and

ω_{14}

(59) in purple and orange, anti-symmetrized neural network

F (ω; α)

(Section 3.2.2) in blue and the first order approximation of the latter

α \cdot ω

in red (Section 3.2.3). See text for details on each observable [27].

As the results of optimizing single

ω_{6}

and

ω_{14}

point to a maximal performance possible using a single

ω

and the chosen set of phase space variables, we now turn to the rest of the

ω

’s. In the next subsection we consider a more general case where the

C P

-odd observable itself is parameterized with a neural network.

3.2.2. Neural Network as a $C P$ -Odd Observable

We consider the case where the output of the neural network is a

C P

-odd quantity that defines our observable. To this end, we build a network with 22 inputs, one for each

ω_{i}

, and one output

F (ω; α)

, which is correctly anti-symmetrized so that

F (ω; α) = - F (- ω; α)

. In terms of the loss function (56) we have

F (X; α) = F (ω; α)

. Note that since we include the complete irreducible set

ω

in this non-linear construction, it effectively also covers the case of a simple phase-space optimization of any (linear combination of)

ω_{i}

, since all relevant

C P

-even phase-space variables can be recovered by taking suitable products of

ω_{i} ω_{i^{'}}

.
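The text does not spell out how the anti-symmetrization is implemented. One common construction, shown here purely as an assumption, is to take an unconstrained network g and define F(ω; α) = [g(ω; α) - g(-ω; α)] / 2, which is odd by construction for any weights. The sketch below uses a tiny one-hidden-layer relu network in plain Python; the layer sizes and random weights are illustrative and the function names are hypothetical.

```python
import random


def mlp(inputs, params):
    """Unconstrained one-hidden-layer relu network g: R^n -> R."""
    w1, b1, w2, b2 = params
    hidden = [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def odd_network(omega, params):
    """Anti-symmetrized output F(omega) = (g(omega) - g(-omega)) / 2."""
    flipped = [-w for w in omega]
    return 0.5 * (mlp(omega, params) - mlp(flipped, params))


rng = random.Random(4)
n_in, n_hidden = 22, 9  # 22 omega inputs; hidden size chosen for illustration
params = (
    [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    [rng.uniform(-1, 1) for _ in range(n_hidden)],
    [rng.uniform(-1, 1) for _ in range(n_hidden)],
    rng.uniform(-1, 1),
)
omega = [rng.uniform(-0.5, 0.5) for _ in range(n_in)]
value = odd_network(omega, params)
mirrored = odd_network([-w for w in omega], params)
print(value, mirrored)
```

With this construction the bias terms and any even part of g cancel identically, so the sample mean of F vanishes in expectation for a CP-symmetric event distribution regardless of the training outcome.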

We again studied how the test sample loss depends on the network size, including the non-negligible uncertainties associated with random weight initializations. We scanned the neural network architecture parameter space in the same way as in the previous case, starting with a single hidden layer of 1–9 nodes and then adding a second hidden layer with at most as many nodes as the first. For each configuration we ran 50 trainings with different random weight initializations. The results are shown in Figure 6 as the blue box plot. We find a considerable improvement over the phase-space optimizations of

ω_{6}

or

ω_{14}

. The improvement is consistent in the entire range of

\tilde{κ} \in [0.1, 1.0]

and is most striking at large

\tilde{κ}

.

Again we can check the generalizing power of the resulting observables to other

\tilde{κ}

by fixing the model configuration to

{9, 9}

and calculating the significance of the resulting observables with respect to

\tilde{κ}

. The results are shown in Figure 7. We find a consistent, noticeable improvement in significance over the previous case across all considered

\tilde{κ}

. In order to better understand the physics underlying the optimization, we next consider this model in the leading order approximation in

ω

.

3.2.3. First Order Approximation of $F ($ $ω$ ; $α$ )

To address the arbitrariness of the neural network architecture choice and to better understand the underlying physics, we finally consider the first order approximation of the function

F (ω; α)

, which can be expanded in a Taylor series

F (ω; α) = \sum_{j} α_{j} ω_{j} + O (ω^{3})

for

j \in \{1, \dots, 22\}

, since for most events the values of the

C P

-odd variables are small:

| ω_{i} | ≪ 1

,

i \in \{1, \dots, 22\}

. In the first order approximation the optimal

C P

-odd observable can be written as

O_{α \cdot ω} = ⟨ \sum_{j} α_{j} ω_{j} ⟩,

(60)

where

α_{j}

are parameters that satisfy a subsidiary condition

{| α |}^{2} \equiv \sum_{j = 1}^{22} α_{j}^{2} = 1

. Their values can be found by maximizing the significance:

\frac{\partial}{\partial α_{k}} \frac{O_{α \cdot ω}}{std (α \cdot ω)} = 0.

(61)

Using

{std (α \cdot ω)}^{2} = ⟨ {(α \cdot ω)}^{2} ⟩ - {⟨ α \cdot ω ⟩}^{2}

, the left side of the expression above can be expanded in the following way:

\begin{matrix} \frac{\partial}{\partial α_{k}} \frac{O_{α \cdot ω}}{std (α \cdot ω)} & = \frac{1}{{std (α \cdot ω)}^{3}} \{⟨ ω_{k} ⟩ ⟨ {(α \cdot ω)}^{2} ⟩ - ⟨ α \cdot ω ⟩ ⟨ (α \cdot ω) ω_{k} ⟩\} \\ & = - \frac{1}{{std (α \cdot ω)}^{3}} \{\sum_{i, j = 1}^{22} α_{i} α_{j} [⟨ ω_{i} ω_{k} ⟩ ⟨ ω_{j} ⟩ - ⟨ ω_{i} ω_{j} ⟩ ⟨ ω_{k} ⟩]\}, \end{matrix}

(62)

where in the last step we take

α

outside the

⟨ - ⟩

, for example

⟨ α \cdot ω ⟩ = \sum_{j} α_{j} ⟨ ω_{j} ⟩

. By assuming

\tilde{κ} \neq 0

we have

O_{α \cdot ω} \neq 0

. From this, together with the condition in Equation (61), it follows that the expression in the curly brackets of Equation (62) must be equal to 0. Hence, we obtain a system of 22 quadratic equations (The problem is equivalent to a single neuron NN with 22 inputs and one output without the activation function and the bias term)

α^{T} M^{(k)} α = 0,

(63)

where

α = {[α_{1}, . . ., α_{22}]}^{T}

and the

22 \times 22

matrices

M^{(k)}

are given by (Notice that by definition

M_{i k}^{(k)} = 0

for all i and k, therefore

det (M^{(k)}) = 0

for all k)

M_{i j}^{(k)} = ⟨ ω_{i} ω_{k} ⟩ ⟨ ω_{j} ⟩ - ⟨ ω_{i} ω_{j} ⟩ ⟨ ω_{k} ⟩ .

(64)

The system of equations in Equation (63) can be solved numerically. The solution vector

α

is determined only up to a real non-zero constant. We normalize it such that

α \cdot α = 1

and choose the solution with mostly positive values.
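One way to solve the system numerically, not necessarily the method used for the results quoted here, is to note that Equation (63) with M^{(k)} from Equation (64) reads (Sα)_k (α · ⟨ω⟩) = (α^T S α) ⟨ω_k⟩ for every k, where S_{ij} = ⟨ω_i ω_j⟩. For invertible S this is satisfied by α ∝ S^{-1} ⟨ω⟩, which reduces the problem to a single linear solve. The sketch below illustrates this on a toy sample with three correlated variables instead of 22; the sample, the helper names and the small Gaussian-elimination routine are illustrative assumptions.

```python
import math
import random


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def moments(events):
    """Sample estimates of <omega_i> and <omega_i omega_j>."""
    n, dim = len(events), len(events[0])
    mu = [sum(ev[i] for ev in events) / n for i in range(dim)]
    second = [[sum(ev[i] * ev[j] for ev in events) / n for j in range(dim)]
              for i in range(dim)]
    return mu, second


def significance(values):
    n = len(values)
    mean = sum(values) / n
    var = sum(v * v for v in values) / n - mean * mean
    return mean / math.sqrt(var / n)


# Toy sample: three correlated variables with small non-zero means.
rng = random.Random(3)
events = []
for _ in range(20000):
    shared = rng.gauss(0.0, 0.1)
    events.append([0.02 + shared + rng.gauss(0.0, 0.2),
                   0.01 + shared + rng.gauss(0.0, 0.1),
                   -0.005 + rng.gauss(0.0, 0.3)])

mu, second = moments(events)
alpha = solve(second, mu)                       # alpha proportional to S^{-1} <omega>
length = math.sqrt(sum(a * a for a in alpha))
alpha = [a / length for a in alpha]             # normalize: alpha . alpha = 1

dim = len(mu)
residuals = [sum(alpha[i] * alpha[j] * (second[i][k] * mu[j] - second[i][j] * mu[k])
                 for i in range(dim) for j in range(dim))
             for k in range(dim)]               # alpha^T M^(k) alpha, Eq. (63)
sig_combined = significance([sum(a * w for a, w in zip(alpha, ev)) for ev in events])
sig_single = [significance([ev[i] for ev in events]) for i in range(dim)]
print(alpha, residuals, sig_combined, sig_single)
```

On the sample used to determine α, the combined significance is by construction at least as large as that of any single variable; the gain on an independent sample has to be checked separately.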

We used this approach to extract the optimal weights

α_{j}

from

10^{7}

events generated with

\tilde{κ} = κ = 1

at 14, 27, and 100 TeV. We also estimated the uncertainty associated with the optimal weights using the following procedure: First we estimate the statistical spread of the significance obtained with optimal

α

. Next we allow a single

α_{j}

to float in the intervals

[α_{j} - σ_{j}, α_{j} + σ_{j}]

, where

σ_{j}

is chosen such that the decrease of the significance due to the change in

α_{j}

corresponds to the statistical spread of the significance. We perform an efficient scan around the optimal vector

α

in its 22-dimensional neighborhood using spherical coordinates to trivially fulfill the normalization constraint

\sum_{j} α_{j}^{2} = 1

. We approximate the significance with a quadratic function around the extremum to find independent, uncorrelated directions in the

α

-space. With this procedure we determine how sharply the optimal

α_{j}

are defined. In practice, we estimated the statistical error of the significance using

10^{7}

events. Clearly the uncertainties

σ_{j}

are larger for smaller sample sizes. The results of this approach are shown in Figure 8, where the upper (lower) panel shows the estimated error (significance) for each

α_{j}

at 14, 27, and 100 TeV. A comparison of the observable

α \cdot ω

to other approaches discussed previously is shown in Figure 7. We reach a similar level of improvement compared to the full

F (ω; α)

network with significantly fewer parameters.

Figure 8. Optimal weights of the linear observable defined in Equation (60). The upper plot shows uncertainties of

α_{j}

, estimated using the expected statistical errors of the observable significances, see text for details. The lower plot shows the significances of

α_{j}

(defined as their central value divided by their estimated uncertainty) [27].

3.3. Bounds in the $(κ, \tilde{κ})$ Plane

We can now derive bounds in the

(κ, \tilde{κ})

plane by varying both

κ

and

\tilde{κ}

at generator level and including showering and hadronization effects using

Pythia 8

and detector effects using

Delphes

with the default ATLAS simulation card. As the

t \bar{t} h

is followed by semileptonic top decays and

h \to b \bar{b}

decay, our signal is defined as 4 b-jets and two oppositely charged leptons ℓ. We included the main irreducible background

p p \to t \bar{t} b \bar{b}

with both tops decaying semileptonically and used the event selection requirements:

$\geq 4$ jets of any flavor with $| η (j) | < 5$ and $p_{T} (j) > 20$ GeV.
$\geq 3$ of the above jets are b-tagged.
2 oppositely charged light leptons with $| η (ℓ) | < 2.5$ and $p_{T} (ℓ) > 10$ GeV.

Furthermore, in order to identify the b-jets from t,

\bar{t}

decays we counted the number

N_{b}

of tagged b-jets and performed the following selections: if

N_{b} \geq 4

, we compute the invariant masses

m_{b b}

of all possible b-jet pairs and select the pair with invariant mass closest to the Higgs mass

m_{h} = 125

GeV. If the selected pair falls inside the Higgs mass window (

| m_{b b} - m_{h} | < 15

GeV) we removed the pair from the list of b-jets and selected from the remaining list the two highest

p_{T}

b-jets as our candidate top quark decay b-jets. However, if

N_{b} = 3

we computed all possible invariant masses

m_{b j}

where j are non-b jets in the event. We selected as the

h \to b \bar{b}

candidate the

b j

pair that minimizes

| m_{h} - m_{b j} |

and falls inside the Higgs mass window

m_{h} \pm 15

GeV. The remaining two b-jets are taken as the candidate top quark decay b-jets.
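The b-jet assignment described above can be sketched as follows. Jets are represented as (E, p_x, p_y, p_z) tuples in GeV, and the function name `assign_bjets` is hypothetical. The text does not state how events without a Higgs candidate inside the mass window are treated; returning `None` for them is an assumption of this sketch.

```python
import itertools
import math

M_H = 125.0     # Higgs mass in GeV
WINDOW = 15.0   # half-width of the Higgs mass window in GeV


def inv_mass(j1, j2):
    """Invariant mass of two jets given as (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(j1, j2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def pt(jet):
    return math.hypot(jet[1], jet[2])


def assign_bjets(bjets, light_jets):
    """Return (higgs_pair, top_bjets), or None if no candidate is in the window."""
    if len(bjets) >= 4:
        pairs = itertools.combinations(range(len(bjets)), 2)
        i, j = min(pairs, key=lambda p: abs(inv_mass(bjets[p[0]], bjets[p[1]]) - M_H))
        if abs(inv_mass(bjets[i], bjets[j]) - M_H) >= WINDOW:
            return None  # assumed treatment, not specified in the text
        rest = [b for k, b in enumerate(bjets) if k not in (i, j)]
        rest.sort(key=pt, reverse=True)
        return (bjets[i], bjets[j]), rest[:2]
    if len(bjets) == 3 and light_jets:
        pairs = itertools.product(range(3), range(len(light_jets)))
        i, j = min(pairs, key=lambda p: abs(inv_mass(bjets[p[0]], light_jets[p[1]]) - M_H))
        if abs(inv_mass(bjets[i], light_jets[j]) - M_H) >= WINDOW:
            return None  # assumed treatment, not specified in the text
        rest = [b for k, b in enumerate(bjets) if k != i]
        return (bjets[i], light_jets[j]), rest
    return None


# Toy jets: A and B are back to back with invariant mass 125 GeV.
A = (62.5, 62.5, 0.0, 0.0)
B = (62.5, -62.5, 0.0, 0.0)
C = (40.0, 0.0, 40.0, 0.0)
D = (30.0, 0.0, 0.0, 30.0)
four_tag = assign_bjets([C, A, D, B], [])
three_tag = assign_bjets([A, C, D], [B])
print(four_tag, three_tag)
```

In both toy cases the pair (A, B) is selected as the Higgs candidate and C and D are kept as the top-decay b-jets.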

We present the bounds for HL-LHC, HE-LHC and FCC-hh using the optimal observable (60) with the weights shown in the upper plot of Figure 8. The bounds that follow from a null result, within the expected statistical uncertainty, for different luminosities at different energies are shown in Figure 9, where we also show the expected sensitivity to

(κ, \tilde{κ})

from the

t \bar{t} h

production cross-section measurements using the projected uncertainties of

δ κ / κ \sim (0.04, 0.02, 0.009)

at HL-LHC, HE-LHC [55] and FCC-hh [56], respectively. For direct comparison with [24] we also show the expected bounds from using a single

ω_{6}

and have checked that using

ω_{14}

does not change the single

ω

bounds significantly.

Figure 9. The

2 σ

exclusion zones in the

κ - \tilde{κ}

plane using the optimized observable

α \cdot ω

by assuming a null result at HL-LHC, HE-LHC and FCC-hh for different luminosities (left: 14 TeV; middle: 27 TeV; right: 100 TeV). The projected sensitivity of

t \bar{t} h

production cross-section measurements is shown in red, see text for details. At 14 TeV

O (1)

exclusion of

\tilde{κ}

can be achieved using

α \cdot ω

with

350 {fb}^{- 1}

which corresponds to the final integrated luminosity of the LHC [27].

A consistent improvement of sensitivity compared to using single

ω_{i}

can be achieved by using the optimized combination of

ω

’s, constraining the parameter space significantly in orthogonal directions compared to the cross-section measurements. Interestingly the significance improvement is consistent between partonic events and after including shower, detector effects, as well as the dominant background and realistic object reconstruction, even though the optimization was performed at parton level only. This robustness is a welcome benefit of the method, since the computationally costly optimization procedure does not appear to be sensitive to modeling of the hadronic final states and detector effects. It is also reassuring that the optimization does not significantly rely on specific phase-space regions particularly affected by the background. We expect the results to be also robust against higher order QCD corrections as in the case of

p p \to t h j

, where we have checked this explicitly.

We show the sensitivity of the optimized observable to the sign of

\tilde{κ}

(and

κ

) in Figure 10 by assuming the measurement of a

2 σ

positive statistical fluctuation of the SM case, which in our estimate corresponds to the measurement of

α \cdot ω = (4.6 \pm 2.3) \times 10^{- 4}

,

α \cdot ω = (0.9 \pm 0.45) \times 10^{- 4}

and

α \cdot ω = (0.2 \pm 0.1) \times 10^{- 4}

for HL-LHC (3 ab

^{- 1}

), HE-LHC (15 ab

^{- 1}

) and FCC-hh (30 ab

^{- 1}

) respectively.

Figure 10. The

2 σ

exclusion regions at HL-LHC (3 ab

^{- 1}

), HE-LHC (15 ab

^{- 1}

) and FCC-hh (30 ab

^{- 1}

) by assuming a measurement of a

2 σ

positive fluctuation in the optimal observable

α \cdot ω

(60) [27].

4. Conclusions

In order to establish, directly and with minimal additional assumptions, the presence of a

C P

-odd component of the top quark Yukawa (

\tilde{κ}

), we have studied manifestly

C P

-odd observables in

t h

and

t \bar{t} h

production at the LHC and its prospective upgrades.

For the

t h j

final states we have relied on the possibility of reconstructing the t quark momentum and accessing the t polarization. We have identified a particular polarization direction which is perpendicular to the

t h

plane, where the top polarization along this direction would undoubtedly point to the presence of the

C P

-odd coupling

\tilde{κ}

. We have presented a method for optimizing the phase space dependent weight and shown its sensitivity at the HL- and HE-LHC for the semileptonic top and

h \to b \bar{b}

mode. The handful of signal events offer discriminating power, sensitive to the sign of

\tilde{κ}

, however the irreducible background due to

t \bar{t}

+jets severely dilutes the sensitivity of the proposed observable.

On the other hand,

t \bar{t} h

production has a considerably larger cross section at LHC energies compared to

t h j

, while suffering more moderately from irreducible backgrounds. Due to the complexity of the final state kinematics with multiple undetected particles we have in this case proposed variables

ω_{i}

that only depend on the lab-frame accessible momenta and are manifestly P- and

C P

-odd. Furthermore, we have studied the prospect of their phase-space optimization, parameterizing the optimal weight functions with neural networks. In particular, we have studied a general

C P

-odd observable, parameterized directly by a

C P

anti-symmetric neural network, which results in better performance compared to any individual

ω_{i}

. Finally, we have studied the first order approximation of this network as a linear combination of the

C P

-odd observables, producing a simpler and more robust observable,

α \cdot ω

. One benefit of using

α \cdot ω

optimized at parton level is that it retains close to optimal sensitivity to the

C P

-odd coupling even after detector simulation, event selection and reconstruction, allowing one to probe

\tilde{κ}

directly at HL-LHC, HE-LHC and FCC-hh. We find that, by the end of Run 3, the LHC should be able to exclude

κ \tilde{κ} \sim 1

with

2 σ

confidence, while FCC-hh will be sensitive to

κ \tilde{κ} \sim 0.01

. Note that these observables represent highly complementary probes of the top Yukawa sector compared to

p p \to t \bar{t} h

cross-section measurements. In particular, in their optimized form they would allow one to break the degeneracy in the

(κ, \tilde{κ})

plane and significantly reduce the allowed parameter space even at modest sensitivities accessible at the (HL)LHC.

Author Contributions

Conceptualization, J.F.K. and N.K.; methodology, D.A.F. and A.S.; software, B.B. and A.S.; validation, D.A.F., J.F.K. and N.K.; formal analysis, N.K. and A.S.; data curation, A.S.; writing—original draft preparation, B.B.; writing—review and editing, J.F.K., N.K. and A.S.; visualization, A.S.; supervision, J.F.K. and N.K.; funding acquisition, J.F.K. and N.K. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the financial support from the Slovenian Research Agency (research core funding No. P1-0035). This article is based upon work from COST Action CA16201 PARTICLEFACE supported by COST (European Cooperation in Science and Technology).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Aguilar-Saavedra, J.A. A Minimal set of top-Higgs anomalous couplings. Nucl. Phys. 2009, B821, 215–227.
2. Ellis, J.; Hwang, D.S.; Sakurai, K.; Takeuchi, M. Disentangling Higgs-Top Couplings in Associated Production. J. High Energy Phys. 2014, 2014, 4.
3. Aad, G.; Abbott, B.; Abdallah, J.; Abdinov, O.; Abeloos, B.; Aben, R.; AbouZeid, O.S.; Abraham, N.L.; Abramowicz, H.; Abreu, H.; et al. Measurements of the Higgs boson production and decay rates and constraints on its couplings from a combined ATLAS and CMS analysis of the LHC pp collision data at $\sqrt{s} = 7$ and 8 TeV. J. High Energy Phys. 2016, 8, 45.
4. Bhattacharyya, G.; Das, D.; Pal, P.B. Modified Higgs couplings and unitarity violation. Phys. Rev. D 2013, 87, 011702.
5. Brod, J.; Haisch, U.; Zupan, J. Constraints on CP-violating Higgs couplings to the third generation. J. High Energy Phys. 2013, 11, 180.
6. Boudjema, F.; Godbole, R.M.; Guadagnoli, D.; Mohan, K.A. Lab-frame observables for probing the top-Higgs interaction. Phys. Rev. D 2015, 92, 015019.
7. Grzadkowski, B.; Gunion, J.F. Using decay angle correlations to detect CP violation in the neutral Higgs sector. Phys. Lett. B 1995, 350, 218–224.
8. Demartin, F.; Maltoni, F.; Mawatari, K.; Page, B.; Zaro, M. Higgs characterisation at NLO in QCD: CP properties of the top-quark Yukawa interaction. Eur. Phys. J. C 2014, 74, 3065.
9. Buckley, M.R.; Goncalves, D. Boosting the Direct CP Measurement of the Higgs-Top Coupling. Phys. Rev. Lett. 2016, 116, 091801.
10. Mileo, N.; Kiers, K.; Szynkman, A.; Crane, D.; Gegner, E. Pseudoscalar top-Higgs coupling: Exploration of CP-odd observables to resolve the sign ambiguity. J. High Energy Phys. 2016, 7, 056.
11. Gritsan, A.V.; Röntsch, R.; Schulze, M.; Xiao, M. Constraining anomalous Higgs boson couplings to the heavy flavor fermions using matrix element techniques. Phys. Rev. D 2016, 94, 055023.
12. Li, J.; Si, Z.g.; Wu, L.; Yue, J. Central-edge asymmetry as a probe of Higgs-top coupling in $t \bar{t} h$ production at the LHC. Phys. Lett. B 2018, 779, 72–76.
13. Dos Santos, S.A.; Fiolhais, M.C.N.; Frederix, R.; Gonçalo, R.; Gouveia, E.; Martins, R.; Onofre, A.; Pease, C.; Peixoto, H.; Reigoto, A.; et al. Probing the CP nature of the Higgs coupling in $t \bar{t} h$ events at the LHC. Phys. Rev. D 2017, 96, 013004.
14. Gonçalves, D.; Kong, K.; Kim, J.H. Probing the top-Higgs Yukawa CP structure in dileptonic $t \bar{t} h$ with M₂-assisted reconstruction. J. High Energy Phys. 2018, 6, 079.
15. Kobakhidze, A.; Wu, L.; Yue, J. Anomalous Top-Higgs Couplings and Top Polarisation in Single Top and Higgs Associated Production at the LHC. J. High Energy Phys. 2014, 10, 100.
16. Yue, J. Enhanced thj signal at the LHC with h→γγ decay and CP-violating top-Higgs coupling. Phys. Lett. B 2015, 744, 131–136.
17. Demartin, F.; Maltoni, F.; Mawatari, K.; Zaro, M. Higgs production in association with a single top quark at the LHC. Eur. Phys. J. C 2015, 75, 267.
18. Barger, V.; Hagiwara, K.; Zheng, Y.J. Probing the Higgs Yukawa coupling to the top quark at the LHC via single top+Higgs production. Phys. Rev. D 2019, 99, 031701.
19. Kraus, M.; Martini, T.; Peitzsch, S.; Uwer, P. Exploring BSM Higgs couplings in single top-quark production. arXiv 2019, arXiv:1908.09100.
20. Bernreuther, W.; Brandenburg, A. Tracing CP violation in the production of top quark pairs by multiple TeV proton proton collisions. Phys. Rev. D 1994, 49, 4481–4492.
21. Barger, V.; Hagiwara, K.; Zheng, Y.J. Probing the top Yukawa coupling at the LHC via associated production of single top and Higgs. arXiv 2019, arXiv:1912.11795.
22. Patrick, R.; Scaffidi, A.; Sharma, P. Top polarisation as a probe of CP-mixing top-Higgs coupling in tjh signals. Phys. Rev. D 2020, 101, 093005.
23. Ferroglia, A.; Fiolhais, M.C.; Gouveia, E.; Onofre, A. Role of the $t \bar{t} h$ rest frame in direct top-quark Yukawa coupling measurements. Phys. Rev. D 2019, 100, 075034.
24. Faroughy, D.A.; Kamenik, J.F.; Košnik, N.; Smolkovič, A. Probing the CP nature of the top quark Yukawa at hadron colliders. J. High Energy Phys. 2020, 2, 085.
25. Fajfer, S.; Kamenik, J.F.; Melic, B. Discerning New Physics in Top-Antitop Production using Top Spin Observables at Hadron Colliders. J. High Energy Phys. 2012, 8, 114.
26. Durieux, G.; Grossman, Y. Probing CP violation systematically in differential distributions. Phys. Rev. D 2015, 92, 076013.
27. Bortolato, B.; Kamenik, J.F.; Košnik, N.; Smolkovič, A. Optimized probes of CP-odd effects in the $t \bar{t} h$ process at hadron colliders. Nucl. Phys. B 2021, 964, 115328.
28. Mangano, M.L.; Zanderighi, G.; Saavedra, J.A.A.; Alekhin, S.; Badger, S.; Bauer, C.W.; Becher, T.; Bertone, V.; Bonvini, M.; Boselli, S.; et al. Physics at a 100 TeV pp Collider: Standard Model Processes. CERN Yellow Rep. 2017, 1–254.
29. Dicus, D.A.; Sudarshan, E.C.G.; Tata, X. Factorization Theorem for Decaying Spinning Particles. Phys. Lett. 1985, 154B, 79–85.
30. Bernreuther, W.; Brandenburg, A.; Si, Z.G.; Uwer, P. Top quark pair production and decay at hadron colliders. Nucl. Phys. 2004, B690, 81–137.
31. Bjorken, J.D.; Drell, S.D. Relativistic Quantum Mechanics; International Series in Pure and Applied Physics; McGraw-Hill: New York, NY, USA, 1965.
32. Atwood, D.; Bar-Shalom, S.; Eilam, G.; Soni, A. CP violation in top physics. Phys. Rept. 2001, 347, 1–222.
33. Atwood, D.; Soni, A. Analysis for magnetic moment and electric dipole moment form-factors of the top quark via $e^{+} e^{-} \to t \bar{t}$. Phys. Rev. D 1992, 45, 2405–2413.
34. Gunion, J.F.; Grzadkowski, B.; He, X.G. Determining the top - anti-top and Z Z couplings of a neutral Higgs boson of arbitrary CP nature at the NLC. Phys. Rev. Lett. 1996, 77, 5172–5175.
35. Alwall, J.; Frederix, R.; Frixione, S.; Hirschi, V.; Maltoni, F.; Mattelaer, O.; Shao, H.S.; Stelzer, T.; Torrielli, P.; Zaro, M. The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. J. High Energy Phys. 2014, 7, 79.
36. Artoisenet, P.; Frederix, R.; Mattelaer, O.; Rietkerk, R. Automatic spin-entangled decays of heavy resonances in Monte Carlo simulations. J. High Energy Phys. 2013, 3, 15.
37. Degrande, C.; Duhr, C.; Fuks, B.; Grellscheid, D.; Mattelaer, O.; Reiter, T. UFO - The Universal FeynRules Output. Comput. Phys. Commun. 2012, 183, 1201–1214.
38. Artoisenet, P.; de Aquino, P.; Demartin, F.; Frederix, R.; Frixione, S.; Maltoni, F.; Mandal, M.K.; Mathews, P.; Mawatari, K.; Ravindran, V.; et al. A framework for Higgs characterisation. J. High Energy Phys. 2013, 11, 43.
39. Broggio, A.; Ferroglia, A.; Frederix, R.; Pagani, D.; Pecjak, B.D.; Tsinikos, I. Top-quark pair hadroproduction in association with a heavy boson at NLO+NNLL including EW corrections. J. High Energy Phys. 2019, 8, 39.
40. Sjostrand, T.; Mrenna, S.; Skands, P.Z. A Brief Introduction to PYTHIA 8.1. Comput. Phys. Commun. 2008, 178, 852–867.
41. Cacciari, M.; Salam, G.P.; Soyez, G. FastJet User Manual. Eur. Phys. J. 2012, C72, 1896.
42. De Favereau, J.; Delaere, C.; Demin, P.; Giammanco, A.; Lemaître, V.; Mertens, A.; Selvaggi, M. DELPHES 3, A modular framework for fast simulation of a generic collider experiment. J. High Energy Phys. 2014, 2, 57.
43. Mangano, M.L.; Moretti, M.; Piccinini, F.; Treccani, M. Matching matrix elements and shower evolution for top-quark production in hadronic collisions. J. High Energy Phys. 2007, 1, 13.
44. Farina, M.; Grojean, C.; Maltoni, F.; Salvioni, E.; Thamm, A. Lifting degeneracies in Higgs couplings using single top production in association with a Higgs boson. J. High Energy Phys. 2013, 5, 22.
45. Aaboud, M.; Aad, G.; Abbott, B.; Abdinov, O.; Abeloos, B.; Abidi, S.; AbouZeid, O.; Abraham, N.; Abramowicz, H.; Abreu, H.; et al. Search for the standard model Higgs boson produced in association with top quarks and decaying into a $b \bar{b}$ pair in pp collisions at $\sqrt{s} = 13$ TeV with the ATLAS detector. Phys. Rev. 2018, D97, 072016.
46. ATLAS Collaboration. Observation of Higgs boson production in association with a top quark pair at the LHC with the ATLAS detector. Phys. Lett. 2018, B784, 173–191.
47. Krohn, D.; Schwartz, M.D.; Lin, T.; Waalewijn, W.J. Jet Charge at the LHC. Phys. Rev. Lett. 2013, 110, 212001.
48. Fraser, K.; Schwartz, M.D. Jet Charge and Machine Learning. J. High Energy Phys. 2018, 10, 93.
49. A New Tagger for the Charge Identification of b-Jets; Technical Report ATL-PHYS-PUB-2015-040; CERN: Geneva, Switzerland, 2015.
50. Measurement of the Jet Vertex Charge Algorithm Performance for Identified b-Jets in $t \bar{t}$ Events in pp Collisions with the ATLAS Detector; Technical Report ATLAS-CONF-2018-022; CERN: Geneva, Switzerland, 2018.
51. Brehmer, J.; Kling, F.; Espejo, I.; Cranmer, K. MadMiner: Machine learning-based inference for particle physics. Comput. Softw. Big Sci. 2020, 4, 3.
52. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: tensorflow.org (accessed on 21 May 2021).
53. Bergstra, J.; Yamins, D.; Cox, D.D. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proceedings of the 30th International Conference on International Conference on Machine Learning (ICML’13)—Volume 28, Atlanta, GA, USA, 16–21 June 2013; pp. 115–123.
54. Clavijo, J.M.; Glaysher, P.; Katzy, J.M. Adversarial domain adaptation to reduce sample bias of a high energy physics classifier. arXiv 2020, arXiv:2005.00568.
55. Cepeda, M.; Gori, S.; Ilten, P.; Kado, M.; Riva, F.; Abdul Khalek, R.; Aboubrahim, A.; Alimena, J.; Alioli, S.; Alves, A.; et al. Report from Working Group 2: Higgs Physics at the HL-LHC and HE-LHC. In Report on the Physics at the HL-LHC, and Perspectives for the HE-LHC; Dainese, A., Mangano, M., Meyer, A.B., Nisati, A., Salam, G., Vesterinen, M.A., Eds.; 2019; Volume 7, pp. 221–584. Available online: https://e-publishing.cern.ch/index.php/CYRM/article/view/952 (accessed on 21 May 2021).
56. Abada, A.; Abbrescia, M.; AbdusSalam, S.S.; Abdyukhanov, I.; Fernandez, J.A.; Abramov, A.; Aburaia, M.; Acar, A.O.; Adzic, P.R.; Agrawal, P.; et al. FCC Physics Opportunities: Future Circular Collider Conceptual Design Report Volume 1. Eur. Phys. J. C 2019, 79, 474.

Figure 1. Tree-level diagrams contributing to

W b \to t h

.

Figure 2. Comparison of the

β (x)

(dashed and dotted) and

\tilde{β} (\tilde{x})

(full line) polarization functions at representative center-of-mass energies

\sqrt{s}

and two values of

κ

. We find that

\tilde{β} (\tilde{x})

is independent of

κ

[24].

Figure 3. Comparison of the optimal weight

B / A

between the

p p \to t h j

and

p p \to \bar{t} h j

processes extracted from MC simulations (left). The right panel shows the comparison between 14 and

27 TeV

proton collision energies for

p p \to t h j

. All plots are obtained using

\tilde{κ} = 1

and with

10^{6}

MC events [24].

Figure 4. Left: comparison of the optimized spin observable (blue dots) with the naïve observable (black dots) extracted from 3000

p p \to t h j, t \to b ℓ ν

MC events at each choice of

\tilde{κ}

. Right: comparison of the significance (defined as the mean value divided by the standard deviation) per

\sqrt{N}

of the two observables, where N is the number of events [24].

Figure 5. Bounds in the

(κ, \tilde{κ})

plane using the optimized observable

{\tilde{O}}_{opt}

for the single-top associated production with a Higgs boson. The blue shaded region corresponds to the

2 σ

(

χ^{2} > 6.18

) exclusion zone assuming the measurement of the SM at the HE-LHC (15 ab

^{- 1}

). The black line and stripes show the

2 σ

excluded region for a

2 σ

positive fluctuation at the HE-LHC (see text for details) [24].
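
The threshold quoted in the caption of Figure 5 can be cross-checked with a short calculation. The sketch below assumes that the test statistic follows a chi-squared distribution with two degrees of freedom (one for each of the two coupling parameters spanning the plane) and that "2 σ" denotes the two-sided Gaussian coverage of about 95.45%; both are assumptions made here for illustration, and the function name is hypothetical.

```python
import math


def chi2_threshold_2dof(n_sigma):
    """Chi-squared value whose 2-dof CDF equals the n_sigma Gaussian coverage."""
    # Two-sided Gaussian coverage, e.g. about 0.9545 for n_sigma = 2.
    coverage = math.erf(n_sigma / math.sqrt(2.0))
    # For two degrees of freedom the CDF is 1 - exp(-x / 2), which inverts in closed form.
    return -2.0 * math.log(1.0 - coverage)


threshold = chi2_threshold_2dof(2.0)
print(f"2-sigma threshold for 2 dof: {threshold:.2f}")
```

Under these assumptions the result rounds to 6.18, in agreement with the value given in the caption.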

Figure 6. A scan in terms of the test loss (sample size 2.5M,

κ, \tilde{κ} = 1

) over neural network configurations with one (upper plot) or two (lower plot) hidden layers for the phase-space optimized

ω_{6}

and

ω_{14}

(59) shown in the purple and orange box plot and the generalized

F (ω; α)

(Section 3.2.2) shown in the blue box plot. The spread in all cases corresponds to 50 different random weight initializations per configuration. For comparison the plain

ω_{6}

and

ω_{14}

are shown in gray and black with the dashed lines showing their

1 σ

statistical uncertainty. The first order approximation of

F (ω; α)

, defined in Equation (60), is shown in red as described in Section 3.2.3 [27].

Figure 7. Comparison of the significances (defined as the mean value divided by the standard deviation) of all the observables considered in this work with respect to

\tilde{κ}

and at fixed

κ = 1

. The results correspond to 1M events per

\tilde{κ}

at 14 TeV. Plain

ω_{6}

in gray,

ω_{14}

in black, phase-space optimized

ω_{6}

and

ω_{14}

(59) in purple and orange, anti-symmetrized neural network

F (ω; α)

(Section 3.2.2) in blue and the first order approximation of the latter

α \cdot ω

in red (Section 3.2.3). See text for details on each observable [27].

Figure 8. Optimal weights of the linear observable defined in Equation (60). The upper plot shows uncertainties of

α_{j}

, estimated using the expected statistical errors of the observable significances, see text for details. The lower plot shows the significances of

α_{j}

(defined as their central value divided by their estimated uncertainty) [27].

Figure 9. The

2 σ

exclusion zones in the

κ - \tilde{κ}

plane using the optimized observable

α \cdot ω

by assuming a null result at HL-LHC, HE-LHC and FCC-hh for different luminosities (left: 14 TeV; middle: 27 TeV; right: 100 TeV). The projected sensitivity of

t \bar{t} h

production cross-section measurements is shown in red, see text for details. At 14 TeV

O (1)

exclusion of

\tilde{κ}

can be achieved using

α \cdot ω

with

350 {fb}^{- 1}

which corresponds to the final integrated luminosity of the LHC [27].

Figure 10. The

2 σ

exclusion regions at HL-LHC (3 ab

^{- 1}

), HE-LHC (15 ab

^{- 1}

) and FCC-hh (30 ab

^{- 1}

) by assuming a measurement of a

2 σ

positive fluctuation in the optimal observable

α \cdot ω

(60) [27].

Table 1. Vector quantities with well-defined C and P eigenvalues in three-dimensional Euclidean space. More complicated objects with well-defined C and P eigenvalues can be constructed using the variables in this table.

	$p_{h}$	$p_{ℓ^{-}} + p_{ℓ^{+}}$	$p_{ℓ^{-}} - p_{ℓ^{+}}$	$p_{b} + p_{\bar{b}}$	$p_{b} - p_{\bar{b}}$
$C$	+	+	−	+	−
$P$	−	−	−	−	−
$C P$	−	−	+	−	+
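
To illustrate how composite objects inherit their transformation properties from the entries of Table 1, the sketch below multiplies the C and P eigenvalues of the vectors entering a scalar product or a triple product, each vector appearing once. The dictionary keys and the helper name are hypothetical labels introduced here; only the signs are taken from the table.

```python
from math import prod

# (C, P) eigenvalues copied from Table 1; keys are plain-text labels for the vectors.
TABLE_1 = {
    "p_h": (+1, -1),
    "p_l- + p_l+": (+1, -1),
    "p_l- - p_l+": (-1, -1),
    "p_b + p_bbar": (+1, -1),
    "p_b - p_bbar": (-1, -1),
}


def product_eigenvalues(*names):
    """Return (C, P, CP) of a product that is linear in each listed vector."""
    c = prod(TABLE_1[name][0] for name in names)
    p = prod(TABLE_1[name][1] for name in names)
    return c, p, c * p


# Scalar product of the two difference vectors: C-even, P-even, hence CP-even.
dot_example = product_eigenvalues("p_l- - p_l+", "p_b - p_bbar")

# Triple product of three vectors from the table: P-odd, and here also CP-odd.
triple_example = product_eigenvalues("p_h", "p_l- + p_l+", "p_b + p_bbar")

print("scalar product (C, P, CP):", dot_example)
print("triple product (C, P, CP):", triple_example)
```

Any product of an odd number of vectors from the table is P-odd, so its CP eigenvalue is fixed by how many C-odd difference vectors it contains.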

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
