Theory of Excitons in Atomically Thin Semiconductors: Tight-Binding Approach

Atomically thin semiconductors from the transition metal dichalcogenide family are materials in which the optical response is dominated by strongly bound excitonic complexes. Here, we present a theory of excitons in two-dimensional semiconductors using a tight-binding model of the electronic structure. In the first part, we review extensive literature on 2D van der Waals materials, with particular focus on their optical response from both experimental and theoretical points of view. In the second part, we discuss our ab initio calculations of the electronic structure of MoS2, representative of a wide class of materials, and review our minimal tight-binding model, which reproduces low-energy physics around the Fermi level and, at the same time, allows for the understanding of their electronic structure. Next, we describe how electron-hole pair excitations from the mean-field-level ground state are constructed. The electron–electron interactions mix the electron-hole pair excitations, resulting in excitonic wave functions and energies obtained by solving the Bethe–Salpeter equation. This is enabled by the efficient computation of the Coulomb matrix elements optimized for two-dimensional crystals. Next, we discuss non-local screening in various geometries usually used in experiments. We conclude with a discussion of the fine structure and excited excitonic spectra. In particular, we discuss the effect of band nesting on the exciton fine structure; Coulomb interactions; and the topology of the wave functions, screening and dielectric environment. Finally, we follow by adding another layer and discuss excitons in heterostructures built from two-dimensional semiconductors.


2D van der Waals Materials
The fact that many crystals have a layered form with strong intralayer and weak interlayer bonds is well known [1][2][3]. Renewed widespread interest in such crystals reignited around 2005 due to the realization that high-quality few-layer crystals can be obtained via a mechanical exfoliation process [4][5][6]. Immediately, it was realized that other materials can be obtained using this technique [7], i.e., insulators, semiconductors, metals, and superconductors. More recently, large-scale data mining approaches established that from about 10 5 experimentally known crystals, approximately 1800 should be exfoliable and thermodynamically stable [8]. Around 40 of them belong to the group called transition metal dichalcogenides (TMDs). In this family, contrary to graphene, a single layer is built from three planes of atoms: two of them are built out of chalcogen X, surrounding a symmetrically metal M plane (X-M-X structure). Usually, they have a trigonal (1T), hexagonal (2H), or rhombohedral (3R) polymorphic structure [9]. Around half of them in bulk form are metallic, and the other half are semiconducting [10], sharing the common theme that when thinned down, the electronic band gap increases. In several cases, the transition from metal to semiconductor is predicted. The semiconducting TMDs that sparked the greatest attention due to their promising optoelectronic properties [11] were compounds with formulas MoS 2 , MoSe 2 , MoTe 2 , WS 2 , and WSe 2 .
What is worth noting, is that the properties of the monolayer crystals depend not only on their chemical composition and structural phase, but also on the choice of the substrate. When the first samples of TMDs were obtained, the standard procedure after their exfoliation was to place them on a properly prepared SiO 2 /Si substrate, which for a certain thickness of the SiO 2 layer enhanced the contrast in optical microscopy [12]. It was also understood that optical quality depends heavily on the surrounding dielectric environment [13,14], which posed the question as to whether SiO 2 is truly the best substrate for optical studies. Currently, it is clear that the substrate on which the subtle optical features of TMDs can be most easily observed is hBN [15,16], for which, e.g., exciton lines approach ∼2 meV widths.

Optical Properties of TMDs and Their Heterostructures
One of the results that sparked widespread attention was the observation that MoS 2 , when thinned down to a single layer, exhibits significantly stronger photoluminescence (PL) than n ≥ 2-layer crystals [17][18][19]. A similar trend was then confirmed in MoSe 2 [20], WS 2 [21], WSe 2 [20,21], and MoTe 2 [22,23] semiconductors with varying magnitudes of differences between samples and with different numbers of layers. This phenomenon was related to the indirect-direct band gap transition, caused by different renormalizations of indirect (Γ − Q) band gaps compared to the much smaller renormalizations of the K − K direct gap when samples are thinned down. This trend was confirmed theoretically by ab initio calculations [24,25] and experimentally by photoemission spectroscopy for valence bands [26].
The first inkling that strong PL is dominated by an excitonic (X) transition comes from the comparison of the PL peak energy position to the known spectral positions of the excitonic transitions in bulk MoS 2 at K point [17,27]. The generation of excitons in a given valley is possible via specific selection rules, which couple circularly polarized light with excitation in a given valley [28][29][30][31][32][33][34][35], (see Figure 1 for schematic picture). The valley coherence of such excitons has been probed in many experiments [36][37][38][39][40]. The spontaneous choice of valley after excitation with unpolarized light has also been observed [41], pointing towards a valley-polarized ground state [42][43][44]. Excited exciton states were measured in WSe 2 [45,46] and WS 2 [47] prior to molybdenum-based TMDs due to the larger separation between A and B excitons (∼0.4 eV), so that the B exciton was not "masking" the excited A exciton states. Complementary PL excitation spectroscopy allowed for the measurement of excited states in MoS 2 , MoSe 2 , WS 2 , and WSe 2 [48][49][50]. The excited B exciton series is more challenging to measure [46,49] due to overlap with the wide PL signal from socalled C excitons. All of those experiments confirmed that the spectrum of s-like states deviates strongly from the 2D hydrogen-like series. Further experiments in magnetic fields reached up to 5s excited states [51][52][53][54][55], addressing also the complicated problem of the diamagnetic shifts of exciton lines resulting from three contributions (see Figure 1), as discussed theoretically in one of our other works [56]. The exciton series consists not only of bright s-series but also of two types of dark states-spin-forbidden and excited excitons states with symmetry different from 's'. The former one can be probed by PL dynamics [57,58], different emission configurations [59][60][61][62][63][64], the different temperature dependence of emissions [65], or by using a tilted magnetic field [60,[66][67][68][69]. The latter ones are usually probed using two-photon spectroscopy [70,71]. Dark p states can also be studied in pump-probe experiments [72][73][74][75][76][77][78], where 1s excitons are generated and transition from 1s to 2p states, which can be measured by a probe terahertz beam [79]. Additionally, excitons with finite center-of-mass momentum, which require phonon-assisted excitation due to momentum conservation can also be probed, revealing the complicated landscape of so-called momentum-indirect excitons [80][81][82][83]. and hole Bloch wave functions. These methods, being conceptually and numerically traceable, allowed for many fascinating advances in understanding exciton properties in TMDs. For the case of the heterostructure, k·p models were also developed, allowing studies of intra-and interlayer exciton properties [77,[274][275][276][277][278][279]. Further simplification of the k·p methodology to the massive Dirac fermion model was used to analyze intra-and interlayer exciton properties [77] and the Bose-Einstein condensate of excitons [277,278,280].
Due to complicated numerical and conceptual challenges inside DFT + GW + BSE theory and the somehow approximate nature of the simplified methods mentioned above, one may turn to a tight-binding methodology, offering an optimal trade-off between numerical tractability and the ability to capture essential physical ingredients in the excitonic problem. Tight-binding studies [79,80,82,244,248,249,251,252,281] allowed, e.g., to track exciton fine-structure evolution dependence on electron doping, the exchange interaction, and magnetic and electric fields. Theoretical results on excitonic landscapes have been shown [83,247], together with results confirming the topological splitting of 2p states in the exciton spectrum due to the effect of the Berry's curvature. In our own work [281], as further described in more detail, we studied the role of band nesting, screening, and wave function topology on exciton fine structure.

Ab Initio Insight into Electronic Structure
In the following subsection, we review our understanding of TMD monolayers based on ab initio techniques that we used in several recent works [24,56,281,282]. Single-layer MX 2 crystals in a 2H phase are arranged in a trigonal prismatic structure. The unit cell consists of three atoms, one metal and two chalcogens, as shown in Figure 2. From the top, the arrangement of the atoms reminds one of graphene. From the side, one can notice that the atoms are actually organized in three planes, one (central) metal plane and two chalcogen planes shifted by ±d ⊥ . The distance between the metal atom and the central position between the two chalcogens is denoted by d . We define the primitive vectors of the real space lattice as a 1 = d (0, √ 3) and a 2 = d (3/2, − √ 3/2). The lattice constant a 0 = d M−M can be written as a 0 = d √ 3 and d ⊥ = d X−X /2. Following the commonly practiced ab initio procedure, described in detail elsewhere [282], the lattice constants calculated using the PBE exchange-correlation functional for MoS 2 are d = 1.8393 d ⊥ = 1.5622 Å. Reciprocal lattice vectors b satisfy the relation e i b· a = 1. This gives a 2D set of four equations b i · a j = 2πδ i,j for i, j ∈ 1, 2. The solution yields b 1 = 2π/d (1/3, 1/ √ 3) and b 2 = 2π/d (2/3, 0). We note that such choice of real and reciprocal space-primitive vectors gives the following position of the hexagonal Brillouin zone (BZ) corner at K 1 = (0, 4π/(3 √ 3d )), which is called the K point, just as in graphene. All other K points are obtained by the successive application of the C 6 rotation. We note that in half of the distance between the K points and the Γ point there are so-called Q points (alternatively Σ [251] or Λ [268] points). Halfway between the two nearest K points lies the M points. Now, we describe the general features of the DFT band structure, focusing on a representative example of MoS 2 . In Figure 3, one can observe 11 bands around the Fermi level (set to 0). The direct band gap at the K point without SOC is equal to ∆ GAP = 1.67 eV. We note that both the valence band (VB) and the conduction band (CB) have secondary extrema, at the Γ point in the VB and at the Q points in the CB. Next, we study the general properties of the DFT Kohn-Sham wave functions. First, we choose three spheres around one Mo and two S 2 atoms, and calculate how much of the wave function is localized inside each sphere. We note that 11 bands around the Fermi level have wave functions that are in general localized on both Mo and S 2 atoms; however, there is a clear trend that VB and bands above are mainly localized on Mo atoms, and VB-1 and lower bands on S 2 atoms. DFT wave functions inside spheres can be next projected onto Slater orbitals. We first confirm that the largest overlap is achieved, in case of MoS 2 , with localized Slater, single-zeta-basis-consistent [283,284] with orbitals with a principal quantum number n = 4 on Mo and n = 3 on S. For other compounds, e.g., WSe 2 , there is a clearer and larger numerical overlap with orbitals with quantum numbers n = 5 for W and n = 4 for Se, confirming that such projection is a reliable tool in studying symmetry and the orbital composition properties of DFT wave functions. In the next step, summarized in Figure 3, for each wave vector k and energy E, the wave functions are projected onto symmetric and anti-symmetric orbitals with respect to the metal plane. One can observe that VB and CB are symmetric across most of the Brillouin zone, with the exception of parts close to the Γ point in the CB. This ordering of bands and their symmetry is general for all MX 2 -family compounds. In the next step, we study the orbital-resolved decomposition of wave functions. The results confirm that orbital composition close to the K point in the VB bands is a combination of 4d +2 and 3p +1 orbitals. Contrary to VB, in the CB, the 4d 0 state coupled to the 3p −1 state dominates. The higher symmetric conduction band (CB + 1) (4d −2 ) is coupled to the 3p 0 state. Coupling between anti-symmetric d and p states is also obtained, in which the 4d −1 state is coupled to anti-symmetric 3p +1 and 4d +1 to both anti-symmetric 3p −1 and 3p 0 states.  Due to the heavy nature of the atoms in the MX 2 family, one can expect that a relativistic spin-orbit interaction may influence the band structure. Here, we discuss briefly the results of the PBE + SOC calculation for MoS 2 . The main effect of SOC is visible in the VB at K point, where bands are spin-split by ∆ SOC VB = 148 meV. It is worth mentioning that this is the smallest splitting in the VB at the K point in MX 2 . Significantly smaller splitting is observed in the CB at K point in MoS 2 (∆ SOC CB = 3 meV). This less-pronounced order-of-magnitude effect comes from the fact that the majority of (80%) d-like orbitals in the CB have the quantum number m d = 0, therefore splitting must come from the admixture of p-orbitals (10%) from chalcogen atoms. What is also important is that there is a significant spin splitting in the second minimum of the CB at Q point, which is larger than at K point, pushing the relative distance between the CB minimum at K at Q point (∆ SOC K−Q ) close to small values, making the CB almost degenerate, e.g., in WSe 2 . Let us also note that by symmetry there is no spin splitting along the Γ − M line in the BZ.

Minimal Tight-Binding Hamiltonian
In the following subsection, we review our construction of a tight-binding model for TMDs monolayers [56,282]. Building on our DFT analysis presented in the previous subsection, we conclude that there are several common features in the band structures of analyzed MX 2 semiconductors. They have a direct band-gap at K points, opposite to their N ≥ 2-layer form, which are indirect-gap semiconductors [24]. These materials also have secondary minima in the conduction band, localized close to the Q points, and secondary maximum in the valence band at the Γ point. Analyzing orbital compositions, we conclude that CB and VB can be described by orbitals, even with respect to the metal plane. This result is consistent with several other works [285][286][287][288][289][290][291][292][293][294][295]; therefore, we can start building a tight-binding model of those materials from the orbitals contributing most to the band structure around the Fermi level.
Our conclusion is that the minimal tight-binding model has to include orbitals that are even with respect to the inversion symmetry about the z plane of the metal atoms (z → −z). Motivated by ab initio results, at least m = 0, ±2 d-orbitals from metals-and m = 0, ±1 symmetric p-orbitals from the top and bottom sulfur atoms must be taken into account. We note only that for the d orbitals of metals the situation is clear, namely orbitals with a given L = 2 and m = −2, 0, +2 are centered around M atoms. However, the orbital construction for two chalcogens must be performed with care. Because we define the so-called dimer orbitals, which are centered around the same plane as metal atoms, we begin with upper (U, R U = (0, 0, +d ⊥ ) with respect to the dimer center) and lower (L, R L = (0, 0, −d ⊥ )) p-orbitals with quantum numbers L = 1, m = ±1 and define dimer orbitals ϕ as a proper combination of those two, as described in detailed elsewhere [56,282].
In the next step, we construct the standard linear combination of the atomic orbitals for all of the orbitals discussed above. First, we note that due to our dimer orbital construction, it makes sense now to talk about the sublattices A (metal positions, τ 1 = (0, 0, 0) inside the unit cell) and B (chalcogen dimer centers, τ 2 = (d , 0, 0)) on the real-space hexagonal lattice. Because we aim to understand the fundamental properties of those materials in terms of graphene physics, we focus on the sublattice A-sublattice B interaction. First, let us try to understand the physics of why the MX 2 materials considered are semiconductors instead of semimetals, such as graphene. To achieve that, let us analyze the tunneling matrix element between the central A atom with the potential V A ( r) and the three nearest neighbor B atoms at positions R B 1 , R B 2 , R B 3 . One can notice, that exactly at the K point, this formula gives where V pd are standard Slater-Koster integrals. For such combinations of m d and m p quantum numbers, that 1 + m p − m d = 0, ±3 tunneling matrix element is non-zero We note that all those couplings explain the couplings detected in DFT. In next step, we move to a discussion of the tight-binding Hamiltonian. The standard procedure outlined by Slater and Koster (SK) [296] is used. All details of this straightforward derivation are discussed elsewhere [56,282]. We note only that because in our basis we are using complex orbitals, SK formulas cannot be used directly and we had to implement them into a more complicated fashion, which is hidden in our notation behind functions V, W, depending on the SK parameters. The next-nearest neighbor Hamiltonian can be written in block form as: where the matrix describing the metal-metal next-nearest neighbor interactions is given by and the corresponding matrix describing the X 2 -X 2 dimer interactions is given by Finally, metal dimer tunneling is described by The functions f and g, depending on the wave vector k, combine the proper sum of the plane wave functions to the nearest and next-nearest neighbors. They are listed in Ref. [56]. As discussed in one of our works [56], we find that the next-nearest neighbor Hamiltonian is a minimal model that correctly describes the band gap across the whole BZ and is able to quantitatively reproduce the orbital compositions of the VB and CB. We note that the last column (except the diagonal element) in the Hamiltonian has a sign opposite to the one used in Ref. [56] due to two different conventions of complex spherical harmonics that could be used (Condon-Shortley phase). Here, complex and real orbitals are related by . After the derivation of our tight-biding Hamiltonian, we turn to the problem of fitting the SK parameters. Our goal is to obtain dispersion for even bands from TB as close as possible to even bands obtained and detected using the ab initio method, as described above. We note that in principle it is possible to calculate SK parameters directly from a first-principles calculation [297]; however, the resulting electronic dispersions are not satisfactory, neither for graphene [298], nor for MoS 2 [289], especially for unoccupied electronic states (CB and higher bands). Those parameters could be in principle treated as a starting point for our analysis; however, we take another route. The general problem with SK parameters is that even if we fix the structural parameters d ⊥ and d , we still end up with a 10-dimensional, highly non-linear optimization problem, depending on the energies We have tested various schemes to choose those parameters to fit the dispersion of such a model to the DFT [282]. At the end of the day, we used the one closest to the Monte Carlo philosophy of random probing of multidimensional parameter space.
The results of two fitting procedures are presented in Figure 4. We note that the fit presented in the left of Figure 4 was found by assuming equal weights for all k points and all even bands; therefore, it is called "best all bands". Additionally to reproducing the overall energies of even bands well, this parametrization reproduces VB very well. To obtain the best transition energy (right panel of Figure 4), a much more complicated procedure was used, first, increasing the weights for VB and CB on the entire Γ − M − K − Γ line, then, in subsequent sweeps, the weights around the K point in the VB and CB and the Q point in the CB were further increased. Thanks to those weighting procedures, both the VB and CB along the K − Γ line are reproduced very well. We note that the former reproduces the VB very well, while the latter one reproduces the CB very well, especially on the K − Γ line.
To include spin-orbit coupling in our tight-binding model, we analyze the matrix elements of the L · S operator. It turns out that only non-zero spin-orbit operator matrix elements are diagonal in our basis and read (for spin-up) The full Hamiltonian with SOC can be written therefore as: The parameters λ Mo and λ S 2 have to be chosen in order to reproduce the spin splitting of the bands. This choice in general changes the dispersion of the bands due to the modification of the diagonal parts of the Hamiltonian and, in principle, requires additional fitting. We checked that setting λ Mo = 0.148/2 eV and λ S 2 = 0.03/2 reproduces the correct splittings in the VB (0.148 eV) and CB (0.003 eV). We note that the order-of-magnitude-higher value of λ S 2 stems from a different contribution of m p = 0 orbitals to CB (order of 20%).
To better understand how SOC affects the band structure across the BZ, let us analyze spin-split bands along the (+K) − Γ − (−K) direction. Our choice of parameters reproduces spin splitting in the VB and CB at K points well and our model catches spin inversion between the +K and −K points. Interestingly, the general feature of the whole MX 2 family is the spin inversion of bands in the CB close to the K point, taking place between K and Q points. This feature is better visible when the lowest spin-split band is shown throughout the BZ, as shown in Figure 5a,b. For example, in the +K point, the bottom of the CB has a spin orientation the same as the top of the VB. However, the region of the BZ where this property holds is very small (red region around +K in Figure 5b), and quickly, the other spin becomes lower in energy. This situation changes again approximately half-way between the K and Q points, around which the spin is again oriented in the same way as at the +K point in the VB. Interestingly, in both Mo and W-based TMDs, the spin at Q point is always oriented the same way as in the VB at K point, irrespective of the spin ordering change in W-based materials. The same spin orientation between the VB at K and Q at the CB means that all momentum-indirect excitons with momentum |Q − K| will have a spin-"bright" configuration and, when activated, e.g., by phonons, they should be optically detectable, in contrast to the spin-forbidden lowest excitons that have a small momentum around K points in tungsten-based TMDs.
In the next step, we discuss the low-energy Hamiltonian around K point. We begin by noticing that at K point, the top of the VB is built almost solely from the m d = +2 orbital, while the bottom of the CB is built from a combination of m d = 0 and m p = −1 orbitals. However, assuming the low-energy basis as m d = 0 and m d = +2, and expanding g 0 and g 2 functions around K point ( k = K + q), one can immediately end up with a massive Dirac fermion (mDF) Hamiltonian [299]: The best parametrization for dispersion including up to 1/4 distance between the K and Q point we found is ∆ = 1.6848 eV, a = 3.193 Å, t = 1.4677 eV. The simple mDF model defined by Equation (8) gives very similar results of dispersion compared to models with further corrections, such as trigonal warping, and when applied to excitonic calculations there is no qualitative and very little quantitative difference. We note that mDF can be even further reduced to the parabolic (effective mass) model. This can be performed by noting that eigenergies of mDF are: Keeping only the first order of ε, we end up with The effective electron mass is given as m * = ∆h 2 /(2a 2 t 2 ). We note that the choice of parameters ∆, t, m * depends heavily on how large a portion of the BZ we want to fit as closely as possible. One can also note that the top of the VB and the bottom of the CB are  As a summary, in Figure 6, we present the ab initio result and different low-energy models of the VB and CB: the tight-binding, massive Dirac fermion, and parabolic models. One can observe that all of them describe well the neighborhood of K point; however, by only using the TB model it is possible to obtain a second minimum at the Q point in the CB and a correct second maximum of the VB at the Γ point. Approximately 10% of the K − Γ line from the K point is properly described by both mDF and parabolic models, with the mDF model being better for the CB description. We note also that both mDF and parabolic models can be extended to include spin; however, in both cases it is necessary to find a different parametrization for two spin species (e.g., two effective electron and hole masses and two gap parameters ∆ for the parabolic model). Such spin-dependent low-energy model parameters for MX 2 semiconductors are also available in the literature [300].

Bethe-Salpeter Equation
The simplest picture of light absorption by semiconducting material can be understood in terms of the transitions of carriers from the valence to the conduction band due to photon excitation with energy E exc. , changing the angular momentum by ±1 when circularly polarized light is used. In MX 2 semiconductors, the situation becomes more complicated, because dipole transitions between d orbitals (m d = ±2 in the VB and m d = 0 in the CB) and negligible p-d transitions cannot explain that circularly polarized light excites carriers within one valley. The solution to this problem comes from a realization that the velocity matrix elements inside the transition matrix elements in general have two contributions [301]: dipole transitions between localized orbitals and terms related to electron hopping between lattice sites. This hopping contribution sets the phase of the velocity matrix element between the VB and CB Bloch wave functions and defines the optical selection rules in the gapped "chiral" fermion systems [302,303].
Because it is possible to selectively excite carriers from the band edge of one valley (e.g., +K), in theoretical investigations it is useful to think about two non-equivalent parts of the hexagonal BZ. A unique association of k points that belong to one or the other valley has to be performed, as shown in Figure 7. Starting from cutting "wedges" around three equivalent (related to each other by reciprocal lattice vector G translations) K points, as shown in Figure 7a, one can move respective wedges to one neighborhood of the K point, creating a triangle around it, as in Figure 7b. We note that a similarly constructed triangle for the −K valley has to be rotated by C 3 symmetry, and both triangles for +K and −K valleys placed next to each other create a rhomboidal BZ equivalent to a full hexagonal BZ.
Exciton properties can be calculated from the configuration-interaction approach to interacting electrons [304,305], particularly useful in studies of nanostructures, e.g., quantum dots [306][307][308][309][310][311][312][313][314][315][316][317][318][319][320][321][322] or electron gas in quantum wells in strong magnetic fields [323,324]. This approach has only recently been utilized to study excitons in reciprocal space [230,281].  Figure 8a. We note that the complete derivation is presented in detail in Refs. [281,282]. We assume that we have only one valence and one conduction band. Next, to construct the ground state, we fill all states in the valence band, as shown in Figure 8b. Then, the exciton state is formed as a linear combination of excitations with coefficients A Q CM n being complex electron-hole amplitudes. Exciton states can be calculated using the standard eigenvalue problemĤ X |X n = E n |X n , whereĤ X is an interacting excitonic Hamiltonian. After calculating the matrix elements of this Hamiltonian using the Wick theorem, neglecting ground state energy correction due to interactions and incorporating electron and hole self-energies into dispersion energies ε c,k and ε v,k , we obtain the Bethe-Salpeter equation for the exciton (for center-of-mass momentum Q CM = 0) In this equation, the summation over the k states is understood as over all vectors in the first BZ, the same as the number of unit cells in the crystal. Electron-hole interaction matrix elements are generally defined as: We note that in our Equation (11), we expand the exciton wave function only in electron-hole pair excitations from the VB to CB and not reverse (just as Equation (16) in Ref. [200]). Such an expansion is called the Tamm-Dancoff approximation. If the interaction matrix elements are not screened, Equation (11) is called time-dependent Hartree-Fock. In our work, we always used screened versions of matrix elements (see discussion below); therefore, in our case, formally, we should be using the name simplified Bethe-Salpeter (simplified in the sense of using an approximate, not self-consistent, screening). In our approach, based on the configuration interaction picture, it is natural to use the same screening for both direct and exchange electron-hole terms, producing results that are consistent with other similar works [230]. We note that in general the issue of the screening of the exchange electron-hole interaction is a long-standing problem, with several works discussing its various aspects (see, e.g., references in Ref. [230]). It is understood from a formal point of view that the iterative solution for the two-particle Green's function G = G 0 + G 0 · K · G (G 0 is the non-interacting Green's function describing the independent propagation of electron and hole, K is interaction kernel) does not allow the screening of the exchange electron-hole part of K, as discussed in Ref. [325]. However, there are reasons to screen it just as one direct stemming, e.g., from the finite Hilbert space of the problem when the Bethe-Salpeter equation is solved in practice. The screening of exchange in DFT + GW + Bethe-Salpeter has recently been discussed in the context of excitons in MoS 2 monolayers in Refs. [326,327]. In an alternative approach to this problem, in paper [328], some of us expanded the systematically excited states of a graphene quantum dot in multi-electron-hole pair excitations. Figure 7 shows the evolution of ground and excited states as a function of the number of excited pairs. Figure 9 shows the singlet-triplet splitting. The main result is obtained by including an extra two pairs which do the screening, i.e., produce an effective interaction for singlet and triplet excitations.

Coulomb Matrix Elements
Let us identify the type of matrix elements within the summation over k in Equation (11). Note that they are written in electron-only language. As shown in Figure 9, the first element (with the minus sign) describes the process in which electrons in conduction bands in state k and electrons in valence bands in state k scatter via the Coulomb interaction to electrons in state k in conduction bands and electrons in state k in valence bands. This description is equivalent to the electron-hole pair scattering from state k to k . The second process in Equation (11) describes electrons starting as previously in the k state in the conduction band and the k state in the valence band and scattering to the same k and k , but changing band indices to valence and conduction, respectively. We identify the first process as an attractive direct electron-hole interaction, and the second process as a repulsive-exchange electron-hole interaction.
Next, we analyze direct electron-hole Coulomb matrix elements. Substituting the Bloch form of the wave functions and utilizing the fact that functions of coordinates in plane can be analyzed in reciprocal space (interaction and co-densities), we use their Fourier components as a general strategy to re-group six-dimensional integrals. The final expression for the direct matrix element (with coefficient S/(2π) 2 resulting from sum to integral transition) is where γ = e 2 / 8π 2 ε 0 and interaction form factor F D is given by: Pair densities can be evaluated numerically for every coordinate z by using the explicit form of the Bloch wave functions, constructed using localized Slater orbitals ϕ, as described before, as We note that further discussions of the details of the properties of the direct Coulomb interaction, e.g., the behavior of pair densities, convergence issues, matrix elements symmetries, complex phase properties, and descriptions of their inter-valley behavior, are described elsewhere [281,282]. Analogously, the exchange Coulomb matrix elements are expressed by with form factors There are several differences between direct and exchange electron-hole interactions. As can be seen from Equation (11), direct interaction comes with a negative sing (electronhole attraction) and an exchange one comes with a positive sign (electron-hole repulsion). Contrary to direct matrix elements, there is no 1/|k − k | overall dependence of magnitude of those elements; therefore, potentially exchange interaction can be large whenever direct matrix elements are small due to the large |k − k | distance. We note also that interaction form factors F D consist of a pair of densities diagonal in band indices (ρ vv/cc ) and offdiagonal with wave vector indices (ρ kk ), while for exchange-interaction form factors, the situation is opposite, i.e., there is off-diagonal dependence on band indices (ρ vc/cv ) and diagonal wave vector dependence (ρ kk/kk ). Due to those properties, we found that direct matrix elements are generally complex numbers, while exchange matrix elements are real. Additionally, due to dependence only on the diagonal wave vector, the exchange matrix element can be computed much faster than the direct matrix elements. At this point, we note also that at first glance there is G = 0 singularity in V X . This singularity is generally problematic for 3D crystals and different methods of dealing with it have been discussed in the literature [329]. However, for a 2D crystal, the singular term is equal to 0 and can be excluded from the summation over G vectors in Equation (16). We note that the Bethe-Salpeter equation defined in Equation (11) is gauge-invariant, i.e., it does not depend on the gauge choice of wave functions, which can be created arbitrarily for every band and k point. On the other hand, the numerical phase affects the phase of the exciton coefficients. This arbitrariness of phase choice may influence the apparent symmetry of the excitonic ground state [253]. One of the solutions to this problem is to study only observable quantities, such as the imaginary part of the dielectric function ε 2 , in which complex exciton amplitudes are multiplied by velocity operator matrix elements, which are also constructed from wave functions. Keeping track of the phase in both should be enough to obtain a gauge-independent answer. However, sometimes one is interested in studying excitonic wave functions themselves. To obtain ground excitonic states with 1s symmetry, various numerical approaches are used; however, their details are rarely discussed in the literature [200,253]. In our procedure, we follow the idea introduced by Rohlfing and Louie [200], in which the global phase of tight-binding wave functions is chosen in such a way that the sum of imaginary parts is 0, i.e., ∑ 2 α=1 ∑ 3 µ=1 Imv (n) αµ = 0. In the next step, we rotate the global phase to obtain phase 0 in the m d = 0 orbital, which means that the phase of the second TB coefficient is set to zero (Im v 12 = 0). The second step breaks the first property, although we found that it is necessary to numerically obtain excitons in the +K and −K valleys, the wave functions of which have A n (−k) = A * n (k) properties, as expected from time-reversal symmetry arguments.
Just as for the exchange Coulomb interaction discussed above, it is easy to understand that the diagonal term of the direct electron-hole interaction in BSE given by Equation (11) is singular at k = k and G = 0. Renormalization due to this singularity has to be included in simulations on the finite lattice, otherwise numerical results give vastly different results when compared with theoretical predictions. Dealing with this singularity is connected with the discretization of the BZ associated with a single valley. Below, we discuss the rectangular discretization of the lattice. Confining our discussion to the case of G = 0 for a moment, one of the methods that allow to integrate out the singularity is to note that form factor F D (k = k , G = 0) = 1 and make an approximation that the exciton wave function inside the box centered around the point (k x ,k y ) takes a constant value. This approximation is naturally more and more exact with an increasing number of points (and a decreasing area associated with each k point), into which the BZ is discretized. This allows us to approximate the diagonal of the BSE interaction kernel as k x +δk/2 k x −δk/2 k y +δk/2 We note that this result assumes a constant, static screening of the electron-hole interaction and in principle should be recalculated when the k-dependent screening model is used. Additionally, the G = 0 term is the leading one; however, summation over G vectors introduces further corrections into the singular term. We checked numerically that both effects introduce contributions that are at least two orders of magnitude smaller than the expression given in Equation (18) and are neglected in further studies. We note that the constant V sin. ≈ 3.5255 is calculated for the rectangular lattice with an area of BZ around the single point given by (δk) 2 , and it should be changed to V sin. ≈ 3.2325 for a rhomboidal lattice with an analogous rhombus area given by √ 3(δk) 2 /2 under the same approximations as for the rectangular discretization scheme.
In the next step, we build a systematic theory of approximation for the direct electronhole matrix elements. The calculation of the interaction form factors F D is a major bottleneck of both ab initio and TB calculations due to the necessity of calculating them for all combinations of k and k , and the additional summation over reciprocal lattice vectors G. The easiest solution is to assume all form factors to be one (their highest possible value, exact for k = k and G = 0) and note that the highest value-entering sum over G vectors comes from Gs minimizing |k − k − G| distance. We checked that this approximation is useful around the K point, for which the neighborhood-form factor absolute value deviates from one rather slowly. The approximation described above allows us to reproduce numerically analytical solutions for a simple model of the exciton with an electron/hole dispersion in a parabolic approximation. Its main deficiency is that it does not include any effects related to the orbital composition of the bands (no Bloch function effect) and is purely real, which always gives degenerate states in the exciton spectrum with the same exciton angular momentum quantum numbers. To motivate a way around this problem, let us first discuss which parts of the full, complex form factor influence mainly its value. For concreteness, let us discuss, for example, the difference between a form factor describing the scattering between one chosen k point to some k + ∆k point (∆k is assumed to be small). One can quickly deduce that the effect of ∆ k on the matrix element V D is complicated: it affects (1) the denominator 1/|k − k − G| ; (2) tight-binding coefficients inside ρ vv/cc ; (3) exponent values depending on z, z ; and (4) details of the in-plane integration of Slater orbitals and the summation over unit cells. We checked numerically, implementing in code the selective turning on/off of all the above contributions so that, actually, the first two corrections (denominator and TB coefficient) yield a very good approximation to the matrix elements and the last two (exponent and details of integration) do not contribute too much. This motivated us to extract TB coefficients from form factors, giving an expression formally equivalent to Equation (14) with an implicit summation over sublattices (α, β) and orbitals (µ, ν), which can be written as and the quantity we call "orbital form factor" F D αβ... is given by Those orbital form factors depend on analogues of product densities, now in the form that does not depend on TB coefficients. This method, being equivalent to Equation (14), is not faster; however, it helps to realize that the C TB coefficient can be calculated very fast from the TB model and we can take only the orbital form factor at the k − k − G = 0 limit. This approximation is conceptually equivalent to treating the long-range Coulomb interaction as not dependent on the details of the orbital structure and taking the maximal absolute value of the orbital form factor (=1). Interestingly, taking this limit also produces a phase φ = G · (−τ α + τ α ), related to non-zero G and the position of atoms inside unit cell: The final equation for the direct form factor is, therefore, simplified to This expression is an extension of results presented in Refs. [245,247,250,330] to include summation over non-zero G vectors. We note that the phase rotation presented in Equation (22) is either 1 or C 3 rotation equal to exp(±2πi/3). As is known from the literature and from our own experience, the computation of the excitonic spectrum is a numerically challenging task. To lower computational complexity even further, let us discuss now how excitonic fine-structure calculations are performed in practice. First, we discretize Equation (11), neglecting the electron-hole exchange interaction, since it is much weaker than the direct electron-hole interaction. Additionally, we choose in summation over k wave vectors only those associated with one valley, as shown in Figure 7. Remembering about singular terms as discussed previously, the Bethe-Salpeter-like equation for one valley takes the form This equation represents a dense, Hermitian matrix that is diagonalized numerically. The primary convergence parameter is the number of k vectors into which a single valley was discretized. Details of computations and convergence studies are presented in our works [281,282]. A further step in fine-structure calculations is to add spin splitting to both valence and conduction bands. Spin-splitting modifies the electron-hole energy difference ∆E k = ε σ CB k − ε σ VB k . To calculate matrix elements, we choose to use spinless wave functions due to their negligible dependence on spin. By this method, we are able to obtain a fine structure in one valley, i.e., both bright and dark A and B excitonic series. The calculation of fine structure in one valley automatically gives fine structure in the other one due to the symmetry of energies E n (+K) = E n (−K) for spin-flipped configurations of excitons. We also found that with a proper gauge of matrix elements, the V − k, − k = V k , k property of the matrix elements between valleys is satisfied. Implementing this symmetry in BSE, one can formally prove that exciton wave functions have to be related to each other as A * n − k = A n k . We checked numerically, performing separate, full calculations in +K and −K valleys that our implementation satisfies these properties, giving within numerical precision the same excitonic spectrum and phases of excitonic states related by the mirror symmetry of k vector and complex conjugation.

Screening of Coulomb Interactions
An additional complication in the realistic description of the Coulomb electron-hole interaction stems from the screening of the interaction by other carriers. As noted early in studies of graphene [331], electron-electron interaction screening in 2D crystals behaves differently than in 3D crystals. Therefore, it is necessary to use a more involved screening model. The most well known model that captures the major linear dependence of screening in k-space was derived in the physics of thin dielectric slabs and is called the Rytova-Keldysh (R.-K.) model [272,273,331]. This approximation has been used in numerous works [53,260,265,[269][270][271]331,332]. In this theory, the bare Coulomb matrix elements V D/X are divided by a dielectric function Counterintuitively, the static screening part ε r depends not on the material itself, but on the surrounding material's dielectric properties ε R-K r = (ε 1 + ε 3 )/2. This expression comes from the theory of a dielectric slab with a finite width d surrounded by two semi-infinite dielectrics with electric permeability ε 1 , ε 3 . In the numerical results for MX 2 discussed in the next sections, we consider two specific cases of the dielectric environment. In the first step, we analyze MoS 2 (with "effective" thickness d = 6.14 Å) on top of the bulk SiO 2 (ε 3 = 4) crystal [97,333], see Figure 10a for reference. We assume that from the top, the monolayer is surrounded by a vacuum; therefore, we take ε 3 = 1. Next, we consider the case of both MoS 2 and MoSe 2 encapsulated by hBN (as shown in Figure 10b), where we take ε 1 = ε 3 = 4.5 [54]. As is known from the literature [334], more advanced models of screening do not significantly affect exciton spectra. This is related to the property that the R.-K. model describes well-screening from ab initio and only for large k-space distance (when matrix elements are small due to 1/|k − k | dependence) there is substantial difference between "correct" and approximated screening models.
The realistic description of the Coulomb electron-hole interaction becomes even more complicated in the case of heterostructures [278,280,[335][336][337]. Furthermore, the complex form of the system leads to different formulas of dielectric functions for intra-and interlayer electron-electron interactions [335]. The Coulomb interaction potential for MX 2 bi-layers can be written in the form analogous to Equation (24) as Following Ref. [335], analysing the result of Poisson's equation in the limit of qd << 1, we can rewrite ε l i ,l j as ε l 1 ,l 1 (q) ≈ ε l 2 ,l 2 (q) = ε R-K and In the above, R.-K. screening formulas α intra R-K and α inter R-K can be treated as an effective polarizability estimated from experimentally known 1s-state energies of both intra-and interlayer excitons. One can expect that the interlayer screening model leads to the renormalization of the exciton spectrum, similarly to the monolayer case [281]. Interestingly, the interlayer R.-K. model gives a more precise dielectric screening description compared to the monolayer case [335]. We note that when we discuss the numerical results for the MoSe 2 /WSe 2 heterostructure encapsulated by hBN, we use ε 1 = ε 3 = 4.5 [54].

Mechanisms of the Renormalization of the X Spectrum
In the following section, several mechanisms affecting excitonic fine structures are briefly discussed, with in-depth discussion already presented elsewhere [281]. Importantly, we note that our main motivation to construct the presented theory was understanding the effects of the multivalley (K and Q points) band structure of MX 2 and the topology of the wave functions on the excitonic fine structure. Instead of focusing on agreement with ab initio + GW + Bethe-Salpeter, we note that our approach offers greater tunability, allowing to understand various physical mechanisms (e.g., role of dispersion, effective mass, strength of interactions, topology of wave functions, and screening mechanisms, etc.) contributing to the complicated problem of the excitonic spectrum. The understanding of these aspects is necessary to build a theory of optical response, e.g., in quantum dots, twisted bi-layer systems, or in the presence of magnetic fields, problems that are usually not easy to solve in the ab initio + GW + Bethe-Salpeter framework. We start with the effect of the band structure on excitonic levels. How different approximations to conduction and valence band energies affect the electron-hole pair energy is summarized in Figure 6. Direct transition energies enter as diagonal to the Bethe-Salpeter equation. In the parabolic-effective mass approximation with simplified bare Coulomb interactions and static screening, the excitonic spectrum gives the binding energy of the first 1s state equal to −4 Ry µ and the degenerate second shell of three states: 2s, 2p x , 2p y , whose energies are E n=2−4 = −4/9 Ry µ . When we change the dispersion model to a massive Dirac fermion, keeping interaction and screening as previously, we notice that the binding energy of the 1s state lowers to ≈−5.5 Ry µ . This behavior can be understood as an increased "average" effective mass, i.e., carriers in the massive Dirac fermion model on average taken over the BZ are described better by a higher effective mass than in the parabolic model. This naturally leads to stronger binding, corresponding to a lower energy of the 1s state. We also observe a larger separation between the first and second shells and a small breaking of degeneracy between 2s and 2p x , 2p y states within the second shell. Finally, when a tight-binding dispersion model is used, the effects described for the massive DF model become even more pronounced. The energy of the 1s state lowers as much as to −10 Ry µ , there is large renormalization of the 1s-2s states' energy differences, and states with exciton angular momentum L = 0 (2s) and |L| = 1 are no longer degenerate. A similar effect is observed for higher shells. In addition to the "average lowering" of the effective mass process, there is a pronounced contribution coming from the existence of Q points that can be observed in excitonic wave functions. The breaking of the degeneracy of the 3s and 4s state for L = 0 is also observed, together with the breaking of the degeneracy of states with different L. We conclude that due to existence of three Q points around K point in the single valley, the full rotational symmetry of s-like states is broken and therefore those states react more strongly than others to changes in electron-hole energy dispersion from the parabolic to the TB.
As discussed in the previous section, constant screening does not faithfully describe the actual screening of interactions in 2D semiconductors. The simplest correction, capturing the non-locality of screening via its |q − q | dependence can be modeled by Rytova-Keldysh theory and parametrized by polarizability α. The excitonic spectrum is heavily renormalized for TB model dispersion and the 1/|q − q | interaction when static screening is switched from the homogeneous one to the R.-K. model. Non-local screening has an opposite effect than changing dispersion from parabolic to TB, and 1s state energy rises from −10 Ry µ back to approximately −4 Ry µ . Additionally, split L = 0 and L = 0 excited exciton states drastically change their energies, reversing the order of arrangement of excitonic states.
In addition to the effects of dispersion and screening that renormalizes the 2D exciton spectrum, the direct electron-hole interaction form factors F D have to be taken into account. Focusing on complex values of the form factors, the result of a calculation on a 7000 k-point grid (the largest we were able to study with the full effect of wave functions) that allows for a reliable discussion of the first and second excitonic shell is presented in Figure 11. Generally speaking, because its value is ≤1, its averaged effect translates into lowering the value of binding energy and pushing excitonic shells toward each other, as shown in Figure 11a. Due to the strong effect of tight-binding wave functions, it is understood that with realistic form factors, different parameters of screening have to be chosen, for example, as shown for static screening in Figure 11b. We conclude that the collective effect of the renormalization of the 2D Rydberg series by dispersion, non-local interaction screening, and carrier wave functions is to make s-like series look like a "more than 3D" exciton. It means that even though the exciton is physically confined to the 2D plane of MX 2 , its excited state series resembles more the 3D excitonic Rydberg ladder of states. We note that non-hydrogenic Rydberg series is usually explained as an effect of non-local screening [47]. We find that it is not only related to screening, but also depends heavily on dispersion, especially secondary minima in the CB at Q points, and wave function contributions affecting the electron-hole interaction. As mentioned before, taking into account electron and hole wave functions affects the exciton fine structure by contributing to the renormalization of s-like states. In addition to that, another interesting effect occurs. Because the direct electron-hole interaction depends on wave functions, its value is in general complex and phases of matrix elements V D ( k, k ) cannot be set to zero by any gauge transformation. The role of a complex interaction phase manifests itself in two effects: first, it slightly renormalizes positions of s-like states, however, this effect is small. On the other hand, a clearly visible effect is that in the second shell, normally degenerate p-like states become split in energy and mix, forming 2p ±1 = 2p x ± i2p y states. More generally, due to complex interactions, all states with a non-zero exciton momentum L also mix and split (e.g., 3p ± , 3d ± , etc.).
The splitting of states with non-zero L reminds of the situation when a magnetic field is applied. In our case there is no magnetic field but instead there is a "geometric" field resulting from the topology of wave functions. This orbital moment, described by the gauge invariant Berry connection, has to come from the properties of wave functions. How to trace this Berry's connection effect on the electron-hole interaction has been shown elsewhere [282,338,339]. Now, let us focus for a moment on the effect of spin splitting on the exciton fine structure. The largest effect, related to the splitting of bands in the VB, is A-B exciton splitting. For ∆ SOC VB = 148 meV we obtain A-B 1s excitonic states split by ≈125 meV for α = 0.5 and the full TB model. A much more subtle effect, introducing the splitting of the A exciton state to spin bright (same spin arrangement in the VB and CB) and spin dark (opposite spins) is related to interplay between spin splitting in the CB and the different dispersion of bands, which can be understood approximately as different effective masses for carriers with different spins. Even though the single-particle arrangements of bands point to a "bright smaller than dark" arrangement, different effective masses cause an inversion of excitonic states and the spin-dark state becomes a ground excitonic state. We conclude that this situation happens in MoS 2 , which has very small ∆ SOC CB ≈ 3 meV. This conclusion is consistent with recent GW-BSE calculations [219,230] and some experiments [67].
Up to this point, we neglected generally small electron-hole exchange interactions. This interaction, however, controls the splitting of dark and bright states, because it affects only excitons built from an electron and hole with the same spin. It increases the energy of the exciton as a result of quantum mechanical electron-hole repulsion, in opposite to direct interaction. Starting with spin-degenerate states to make our analysis more transparent, when no exchange interaction V X is present, dark and bright states are degenerate. When an unscreened electron-hole interaction is turned on, the bright-state energy increases (the binding energy is lowered) up to ≈20 meV, depending on the choice and details of the screening model. When homogenous screening or R.-K. models are applied also to exchange interactions, this value is significantly lowered. We conclude that in all cases, the trend is such that the dark excitonic state in MoS 2 has the lowest energy, adding up to a similar conclusion from the spin splitting discussion above. We note that in the DFT + GW + Bethe-Salpeter approach [219], the electron-hole exchange interaction is always taken as the bare, unscreened one. In our approach, derived from the CI approach, both direct and exchange interactions are treated on an equal footing and should be screened in the same way. This issue has been discussed in the literature [230,325], but in our view further studies are necessary to understand the source of this discrepancy. On the other hand, irrespective to details of screening, we conclude that in MoS 2 , the lowest in the energy excitonic state should be dark due to spin and exchange effects.
In next step, we analyze the excitonic spectrum of the MoS 2 monolayer using the combination of the minimal tight-binding model, Bethe-Salpeter equation, and simplified Coulomb interactions theory, with form factors given by Equation (22) and the Rytova-Keldysh screening model. The exciton fine structure has been determined in two cases, namely the MoS 2 monolayer on the SiO 2 substrate and the hBN encapsulated MoS 2 , respectively. We restrict ourselves to the first three shells (n = 1, 2, 3). For MoS 2 on the SiO 2 and MoS 2 encapsulated in hBN we set the α Keldysh polarizability to correctly determine the experimentally known exciton ground state [54,247]. The 1s-state has been taken as E SiO 2 n=1 = −0.335 eV and E hBN n=1 = −0.231 eV, so that we obtained α Keldysh = 2.0 for the SiO 2 substrate and α Keldysh = 0.75 for the hBN surrounding. Figure 12 shows the excitonic spectrum determined for the MoS 2 monolayer. We restricted ourselves to n = 1, 2, 3 (first three shells) only. For both types of surroundings, the 2s state is characterized by a smaller binding energy than the 2p states. Moreover, we deal with generic topological effects manifested in the exciton spectrum resulting from the properties of electron and hole Bloch wave functions, especially the Berry curvature. Inclusion of the effect of wave functions on the exciton spectrum results in the topological splitting of 2p, 3p, and 3d states.
The excitonic spectrum for the MoS 2 monolayer on the SiO 2 substrate, shown in Figure 12a, is characterized by the exciton binding energy E b = −335 meV. The energy splitting between the 1s and 2s states has been determined to be E 1s−2s = 226 meV. In the hBN-encapsulated MoS 2 case presented in Figure 12b, both the excitonic binding energy E b = −223 meV and the determined 1s-2s energy splitting E 1s−2s = 176 meV are smaller than for the MoS 2 on SiO 2 .
We now turn to the effect of the second atomic layer on the exciton spectrum. Using the combination of effective mass approximation and the Bethe-Salpeter equation, including the interlayer Rytova-Keldysh screening [335], we now analyze the interlayer exciton spectrum for MoSe 2 /WSe 2 encapsulated in hBN [340]. For various spin-split combinations of bands, we study the fine structure for excitons with zero total momentum Q X = 0 (at K point). Excitonic states built from electrons and holes from distinct layers present a rich spectrum of different types of optical transitions. In Figure 13 we consider interlayer A/B/Ã/B exciton series of MoSe 2 /WSe 2 , where A denotes the transition between WSe 2 and MoSe 2 spin-up, B transition MoSe 2 -WSe 2 spin-down,Ã transition MoSe 2 -WSe 2 spin-up, andB transition WSe 2 -MoSe 2 spin-down, respectively. We restrict ourselves to 1s, 2p, and 2s states. The parameter α inter RK was set in order to correctly determine the experimentally known 1s-state interlayer exciton energies [142]. We take the ground state binding energy of the interlayer exciton as E B n=1 = −150 meV [142] and obtain α inter RK = 2.30. In Figure 13, we show the excitonic spectrum of the MoSe 2 /WSe 2 heterostructure, restricted to 1s, 2p, and 2s states only. Considering all interlayer exciton types, namely A/B/Ã/B excitons, we can analyze both the A-B /Ã-B and 1s-2s splittings. Interestingly, the values for all interlayer 1s-2s splittings have been determined as ∆(1s − 2s) ≈ 10 meV, while ∆(A − B) ≈ 50 meV and ∆(Ã −B) ≈ 30 meV. Including in our studies various spin-split combinations of bands, considering not only the energetically lowest optical transitions (A excitons), we predict rich interlayer exciton series resulting with additional peaks due to the presence of all A/B/Ã/B exciton types. More work is needed to understand intra-and interlayer excitons in 2D heterostructures.

Summary
In summary, in this review we describe the excitonic problem in transition metal dichalcogenide semiconductors. We discussed building an ab inito-based tight-binding model that captures all the important features of the electronic structures of these materials. Then, we describe our theory of exciton, focusing on issues related to the evaluation of the interaction matrix elements. Then, we presented how such a theory can be used to understand the physics of the excitonic spectrum in both mono-and bi-layer heterostructures. We discussed the effect of band nesting on the exciton fine structure; Coulomb interactions; and the topology of wave functions, screening, and the dielectric environment. Finally, we followed by adding another layer and discussed excitons in heterostructures built from two-dimensional semiconductors. We hope that this review will be helpful to people who are interested in entering the fascinating field of the optical properties of low-dimensional semiconductors.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: