The Impact of Stellar Surface Magnetoconvection and Oscillations on the Detection of Temperate, Earth-Mass Planets Around Sun-Like Stars

Detecting and confirming terrestrial planets is incredibly difficult due to their tiny size and mass relative to Sun-like host stars. However, recent instrumental advancements are making the detection of Earth-like exoplanets technologically feasible. For example, Kepler and TESS photometric precision means we can identify Earth-sized candidates (and PLATO in the future will add many long-period candidates to the list), while spectrographs such as ESPRESSO and EXPRES (which aim for a radial velocity [RV] precision near 10 cm/s) mean we will soon reach the instrumental precision required to confirm Earth-mass planets in the habitable zones of Sun-like stars. However, many astrophysical phenomena on the surfaces of these host stars can imprint signatures on the stellar absorption lines used to detect the Doppler wobble induced by planetary companions. The result is stellar-induced spurious RV shifts that can mask or mimic planet signals. This review provides a brief overview of how stellar surface magnetoconvection and oscillations can impact low-mass planet confirmation and the best-tested strategies to overcome this astrophysical noise. These noise reduction strategies originate from a combination of empirical motivation and a theoretical understanding of the underlying physics. The most recent predictions indicate that stellar oscillations for Sun-like stars may be averaged out with tailored exposure times, while granulation may need to be disentangled by inspecting its imprint on the stellar line profile shapes. Overall, the literature suggests that Earth-analog detection should be possible, with the correct observing strategy and sufficient data collection.


Introduction
The confirmation and detailed characterization of exoplanets necessitate a measurement of the planetary mass. At present, the most tried and tested technique to obtain such a mass measurement comes from analyzing the planetary-induced Doppler wobble of the host star (arising because the pair share a common center of mass). For a true Earth-analog, the planet-induced radial velocity (RV) on the host star is a minuscule ∼9 cm s⁻¹, which makes this feat extremely challenging and is why such a confirmation still eludes us to date. However, recent advancements in instrumentation promise to finally make this challenge technologically feasible. For example, both the ESPRESSO [1,2] and EXPRES [3,4] spectrographs have recently had first light, with ESO open time on ESPRESSO commencing autumn 2018 (ESPRESSO: Echelle SPectrograph for Rocky Exoplanets and Stable Spectroscopic Observations; EXPRES: EXtreme PREcision Spectrometer). ESPRESSO is aiming to achieve ∼10 cm s⁻¹ precision for (bright) targets in the Southern Hemisphere, while EXPRES hopes to approach this regime for similar targets in the North. However, determining the RV of the host star can often be difficult due to inhomogeneities on the stellar surfaces themselves. For stabilized spectrographs, these RVs are determined by observing the stellar absorption lines in a relatively large passband, usually several thousand angstroms wide, and cross-correlating them with a template mask. The presence of a planetary companion induces wholesale Doppler shifts of the individual lines, where the net influence can be determined from the center of the cross-correlation function (CCF) [5]. However, if the stellar lines change shape, this will also be reflected in the CCF.
For instance, a dark starspot (or bright plage/facular region) creates an emission bump (absorption dip) in the stellar lines, and this asymmetry alters the center-of-light for both the individual line profiles and the overall CCF; as a result, spurious RV shifts are measured that can be mistaken for, or mask, the shifts induced by a planetary companion [6]. For a magnetically active star, the spots or plage/faculae can induce RV shifts on the 1-100 m s⁻¹ level [6]; hence, the very active stars are often avoided when searching for Earth-analog companions. Nonetheless, even very magnetically quiet stars will still suffer anomalies from inhomogeneities arising from both stellar surface magnetoconvection and oscillations on the 0.1-1 m s⁻¹ level [7]. Consequently, it is essential that we also understand these low-amplitude signals, as they are still large enough to completely swamp the signal from an Earth-twin and other low-mass, long-period planetary companions. There are numerous works dedicated to disentangling and removing the larger amplitude astrophysical signals from high precision RVs (e.g., see [6,8-26], and references therein). The focus of this paper is to briefly discuss the physics driving convection and oscillation-induced RV variability (Section 2), review the recent works in the literature that aim to reduce these stellar noise sources in high precision RVs (Section 3), and comment on the prospects for the future confirmation of habitable worlds (Section 4).
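The ∼9 cm s⁻¹ value quoted above for an Earth-analog can be recovered from the standard two-body expression for the stellar reflex semi-amplitude, K = (2πG/P)^(1/3) m_p sin i / [(M_* + m_p)^(2/3) (1 − e²)^(1/2)]. The short Python sketch below evaluates it for an Earth-mass planet on a one-year circular orbit around a solar-mass star; the function name and the rounded physical constants are illustrative choices, not taken from the works cited here.

```python
import math

# Rounded physical constants (SI units); values are approximate.
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30         # solar mass, kg
M_EARTH = 5.972e24       # Earth mass, kg
YEAR = 365.25 * 86400.0  # one year, s


def rv_semi_amplitude(m_planet, m_star, period, ecc=0.0, sin_i=1.0):
    """Stellar reflex RV semi-amplitude in m/s for SI inputs."""
    return ((2.0 * math.pi * G / period) ** (1.0 / 3.0)
            * m_planet * sin_i
            / (m_star + m_planet) ** (2.0 / 3.0)
            / math.sqrt(1.0 - ecc ** 2))


k_earth = rv_semi_amplitude(M_EARTH, M_SUN, YEAR)
print(f"Earth-analog semi-amplitude: {100.0 * k_earth:.1f} cm/s")
```

The result is roughly 9 cm s⁻¹ for an edge-on orbit (sin i = 1); any inclination away from edge-on lowers the measured amplitude further.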
Convection creates time-variable asymmetries in the stellar absorption lines. Upward flowing, bright, hot, blueshifted bubbles of plasma (∼1 Mm in diameter), known as granules, eventually cool, darken, and fall back down, redshifted, into the surrounding regions known as intergranular lanes. Since the granules are brighter and cover more surface area, the two contributions do not completely cancel; the intergranular lane contribution acts to depress the redward wing of the combined line profile (see Figure 1). Moreover, the larger contribution from the granules means the center-of-light for most absorption lines in a Sun-like star has an overall net blueshift (for the Sun this value is near 350 m s⁻¹ [27]). Magnetic fields can inhibit the convection, and over a magnetic activity cycle the net convective blueshift varies on the ∼10 m s⁻¹ level [17,20,24]. See [28,29] for how this varies with spectral type, age, and magnetic activity, and see [20,26,30] for how we may correct for this type of large-amplitude stellar noise; this review will focus on the shorter timescale variability.
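The origin of the net convective blueshift can be illustrated with a toy two-component line profile: a blueshifted granule component with a larger weight (area times brightness) added to a redshifted intergranular lane component with a smaller weight. The Python sketch below is only a cartoon; every number in it (weights, velocities, line depth, and width) is an assumed illustrative value and is not taken from [27] or any other cited work.

```python
import numpy as np

# Velocity grid in m/s, wide enough that the line wings are negligible at the edges.
v = np.linspace(-15000.0, 15000.0, 3001)


def absorption(v, center, depth, width):
    """Gaussian absorption depth profile centered at `center` (m/s)."""
    return depth * np.exp(-0.5 * ((v - center) / width) ** 2)


# Assumed illustrative weights: (fractional area) x (relative brightness).
w_granule = 0.7 * 1.1
w_lane = 0.3 * 0.8

# Assumed illustrative flows: granules rise (blueshift), lanes sink (redshift).
granule = absorption(v, center=-1000.0, depth=0.6, width=3000.0)
lane = absorption(v, center=+2000.0, depth=0.6, width=3000.0)

combined = (w_granule * granule + w_lane * lane) / (w_granule + w_lane)

# Center-of-light (first moment) of the combined absorption profile.
net_shift = np.sum(v * combined) / np.sum(combined)
print(f"net shift of the combined line: {net_shift:.0f} m/s")
```

With these assumed inputs the combined line is blueshifted by a few hundred m s⁻¹ even though the downflows are faster than the upflows, because the granules carry more weight in the disc-integrated light.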

Stellar Surface Magnetoconvection and Oscillations
As individual granules evolve, with lifetimes of ∼5-6 min on the Sun, the ratio of granules to intergranular lane is constantly changing, which means the stellar line asymmetries, and therefore net RV shift, are also constantly changing. Typical velocities for individual plasma flows are 1-4 km s⁻¹, but over the stellar disc much of the upflows and downflows cancel out, leaving net variability on the 0.1-1 m s⁻¹ level for Sun-like stars [7]. The presence of granulation on a spherical host star makes the surface corrugated; an apt analogy is to think of the granules as 'hills' and the intergranular lanes as 'valleys' [31]: standing at the top of a hill looking out to other hills provides a vastly different vantage point than a bird's-eye view from above. Near the limb it is impossible to see to the very bottoms of the intergranular lanes, granular walls become visible and only the near edge of their tops can be seen, and granules in the forefront can obstruct components in the background. Moreover, the plasma flows in a variety of directions, and flows that were once orthogonal to the line of sight at disc center have a line-of-sight component near the limb (see Figure 2). This means the average line profile shapes and positions also change from center to limb. In fact, near the limb the net convective blueshift seen at disc center can disappear completely, and can even become a redshift [32]. Part of this effect is due to a smaller projected area near the limb, but the main driver is that velocity flows moving away from the observer are more often seen in front of the hotter plasma above the intergranular lanes, and those moving towards the observer become increasingly blocked by granules in the forefront [33,34]. The amplitude and exact nature of these effects will depend on the line properties and magnetic field strength; e.g., shallower lines, experiencing stronger magnetic fields, will have some of the least center-to-limb variations [35].
If this limb-dependent convective blueshift is ignored, it may significantly impact transit observations, such as the Rossiter-McLaughlin effect [36,37] and potentially transmission spectroscopy [38]; hence, it could be important for planet characterization, e.g., to further assess habitability.
Individual granules cluster together in large groups or cells known as supergranules, with diameters of ∼35 Mm. Plasma flows radially away from the centers of these cells with velocities of a few hundred m s⁻¹, much slower than the km s⁻¹ rates of the smaller, individual granular cells. The origin of this phenomenon remains a puzzle, and it has only been studied in detail on the Sun, but it is thought to be buoyantly driven and may be related to thermal convection; see the reviews by [39,40] for more details. Observations and numerical simulations show that families of individual granules splitting can diffuse and advect magnetic flux towards the boundaries of the supergranular cells; as these families of granules interact, they can generate horizontal flows, which may contribute to the organization of the supergranules ([39-42], and references therein). The slower flow velocities, combined with projection effects, mean the overall RV variability from supergranulation is lower in amplitude than that from the small-scale granulation. However, the supergranule cells have much longer lifetimes, around 1.6-1.8 days for the Sun. Even though supergranulation may be lower in amplitude than small-scale granulation, its longer lifetime means it may still pose issues when searching for low-mass, long-period planets.
The motions of the plasma flows in a given stellar convective envelope excite acoustic waves that constructively and destructively interfere with one another to create stochastic standing waves, with periods near 5 min for Sun-like stars. The most significant restoring force in the near-surface layers arises from pressure gradients and the resulting resonant modes are aptly termed p-modes (with the 'p' standing for pressure [43]). The influence of the p-modes on the observed line profiles depends on the position across the stellar disc; observations near the limb see physically higher layers in the solar atmosphere, as well as contributions from velocity flows that are both vertical and horizontal (with respect to disc center). Solar observations shown in Figure 3 display how the RV amplitudes from a small patch on a stellar disc can change from 100-200 m s⁻¹ at the center to ∼50 m s⁻¹ or less near the limb, and how the oscillation-induced RV variability dominates over the granulation in local patches. Over the whole stellar disc the various oscillations average down to the m s⁻¹ level for Sun-like stars; this can be seen in the recent solar observations in Figure 4, where the well-known '5-min' p-modes can be seen with an amplitude near 0.5 m s⁻¹ and a root-mean-square (rms) just over 1 m s⁻¹. Part of the scatter in Figure 4 is due to the stochastic nature of the oscillations (i.e., modes constructively and destructively interfering with one another), but part of it is also due to the contribution from granulation. The exact details of how (super-)granulation amplitudes and timescales compare with oscillation amplitudes/timescales across the HR diagram remain an open question. However, it is generally agreed that the amplitude of the oscillations in Doppler velocities tends to be higher; moreover, the variability from both (super-)granulation and oscillations increases as stars evolve since the granules become larger, with longer turnover timescales.
Nonetheless, it is clear that signals from a combination of convection and oscillations can completely swamp the 9 cm s⁻¹ amplitude from an Earth-analog. Moreover, even though these stellar phenomena occur on very different timescales to the orbit of an Earth-analog, their frequency structure will not be finely sampled by typical RV observations of exoplanets; hence, it would likely be difficult to try to separate them in the power spectrum or even to include them as a correlated noise term when modelling the RVs [20]. It is therefore important to try to remove and/or disentangle these sources of stellar variability.
[Figure caption: Sun-as-a-star observations from the PEPSI spectrograph, phased on the peak solar p-mode oscillations (subset of Figure 7 from [45]).]

Noise Reduction Strategies
Stellar surface convection and oscillations have been studied intensely for many years; see [39,46-49] for comprehensive reviews. However, it was not until the dawn of the exoplanet era in the 1990s and the realm of high precision RVs, nearly 20 years later, that these very interesting signals have been viewed as a nuisance that must be removed from the data. Historically, empirical (exoplanet-focused) studies have typically rolled these signals into a single 'jitter' term that often also includes variations induced by starspots and plage/faculae, and even instrumental noise (e.g., see [50-56]). These works may be useful to determine the optimal candidates for RV follow-up, but they do not explicitly focus on removing or disentangling these stellar sources from the Doppler-reflex motion induced by planetary companions. Additionally, as the astrophysical signals from oscillations and granulation are much lower in amplitude than those from the magnetically active regions, there have been relatively few attempts to remove their signatures. In particular, the current generation of instruments (HARPS [High Accuracy Radial Velocity Planet Searcher], HARPS-N, HIRES [High Resolution Echelle Spectrometer], etc.) have an instrumental precision around 50 cm s⁻¹, which is near the boundary of detection for convection and oscillations; this means most stellar 'noise' removal strategies have only started to scratch the surface of these lower amplitude effects. Nonetheless, it is clear that the next generation of spectrographs will demand a proper treatment of astrophysical phenomena down to the cm s⁻¹ level to reach their full potential.

Empirically Driven Strategies
The first strategies to tackle convection and oscillations as a noise source in exoplanet RV data were empirically driven. Foremost, and perhaps most well-known and used to date, Ref. [57] worked to optimize the observational strategy to simply average out or 'beat down' the convection and oscillation noise. To determine the optimal observing strategy, Ref. [57] created magnetically quiet model stars (i.e., with only convection and oscillations as noise sources, no spots, or plage/facular regions) based on an asteroseismic data set from the HARPS spectrograph. The asteroseismic data span only 5-8 days, but this should be sufficient to characterize the convection and oscillations as their timescales are on the order of minutes for their target sample of G and K dwarfs (up to 1-2 days if also considering supergranulation effects). The observed RVs were transformed into Fourier space and the velocity power spectral density (VPSD) was calculated for each target. Then the authors performed a fit to the p-mode oscillations and granulation (they considered both granulation and supergranulation, as well as the debated mesogranulation; note, there is a long-running debate in the literature about whether or not mesogranular flows exist as a distinct scale of convection; see [39,47], and references therein). The granulation model fits were governed by empirical solar laws derived by [58-60], which correspond to an exponentially decaying function. For the p-modes, Ref. [57] was only interested in the single hump of excess power in the VPSD and not the individual modes; this hump is well described by a Gaussian with full-width half-maximum equal to four times the large separation of the p-modes. Convolving such a Gaussian with the VPSD then allowed [57] to fit the p-modes with a Lorentzian function following [61,62].
Finally, randomizing the phase of each frequency in the inverse Fourier transform, and then returning to RV space meant the authors could calculate/predict oscillation and granulation-induced RVs (with timescales of a few days) for any given period in time. It is important to keep in mind, the granulation components are constructed to fit the background in the VPSD, and it is difficult to ascribe a physical meaning to them. For example, the mesogranulation component may be attributed to an artefact introduced by averaging procedures and is now largely considered to not be a distinct convection scale [39,40]. Moreover, supergranulation has only been concretely confirmed on the Sun and very little is known about its manifestation on other stars. In fact, it is possible that this background component could also be due to the occurrence of magnetic bright points, the presence of faculae, or from variations in the small-scale granulation properties (see [63], and references therein). On top of this, the combination of finite mode lifetimes and variations due to magnetic field fluctuations means there is an inherent variability that is not captured by the empirical dataset used by [57]; based on Kepler observations, this may mean an extra ∼15% uncertainty or more [63]. Nonetheless, this approach still provides a reasonable starting point for exploring granulation and oscillations on Sun-like stars.
[Figure text excerpt: For each case, two curves are shown, the lower one obtained with an exposure time twice as long as the upper one; the difference between them is small, so doubling the exposure time does not strongly improve the results, although it doubles the total measurement cost (neglecting overheads). Taking only the shortest exposure times leaves three strategies: three 10 min measurements per night with 2 h of spacing between them (3N), two 15 min measurements per night separated by 5 h (2N), and one 15 min measurement per night (1N). For a similar observing time per night (30 min), splitting the measurements into 2 or 3 blocks significantly improves the averaging of the stellar noise; the best of the considered strategies is 3N.]
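The general idea of generating synthetic RVs from a fitted power spectrum with randomized phases can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [57]: the granulation background is represented by a single Harvey-like term with an assumed 300 s timescale, the p-mode hump by a Gaussian envelope centered near 3.1 mHz (roughly the solar value) with an assumed width, and the series is simply rescaled to a chosen rms rather than carrying a calibrated normalization. The names `model_psd` and `synth_rv` are hypothetical.

```python
import numpy as np


def model_psd(freq, tau=300.0, numax=3.1e-3, width=0.6e-3, pmode_height=0.5):
    """Toy velocity PSD (arbitrary units): background term plus p-mode hump.

    All parameter values are assumed for illustration only.
    """
    background = 1.0 / (1.0 + (2.0 * np.pi * freq * tau) ** 2)
    pmodes = pmode_height * np.exp(-0.5 * ((freq - numax) / width) ** 2)
    return background + pmodes


def synth_rv(n, dt, psd_func, target_rms, rng):
    """Draw one RV time series whose power spectrum follows `psd_func`."""
    freq = np.fft.rfftfreq(n, d=dt)                # Hz, for dt in seconds
    amplitude = np.sqrt(psd_func(freq))
    amplitude[0] = 0.0                             # no power at zero frequency, so zero mean
    phases = rng.uniform(0.0, 2.0 * np.pi, freq.size)
    series = np.fft.irfft(amplitude * np.exp(1j * phases), n=n)
    return series * (target_rms / series.std())    # rescale to the requested rms


rng = np.random.default_rng(1)
rv = synth_rv(n=4096, dt=60.0, psd_func=model_psd, target_rms=1.0, rng=rng)
print(rv[:5])
```

Each call with a different random state yields a different realization with the same underlying spectrum, which is what allows an observing strategy to be tested against many statistically equivalent noise series.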
Armed with these empirical fits, Ref. [57] created artificially noisy, yet magnetically quiet, model stars representative of G and K stars; the authors assumed an instrumental error equivalent to HARPS and that 256 nights would be available over the course of four years, on par with expectations for an intensive survey with a HARPS-like instrument (keeping in mind that most stars are visible for only part of the year and that ∼20% of the time can be expected to be lost due to bad weather). To begin, the authors tested the previous HARPS-GTO (Guaranteed Time Observations) strategy of 15 min exposures on ∼10 consecutive nights per month over the 4 years. Next, the authors simply tried doubling and quadrupling the total exposure time, and/or doubling and tripling the total number of observations in a given night; in other words they tried: 1 observation per night with a 15 min exposure (previous standard), 1 observation per night of 30 min, 2 observations per night of 15 min, 2 observations per night of 30 min, 3 observations per night of 10 min, and 3 observations per night of 20 min; in addition, each strategy was tested with further binning of the data over 1-10 consecutive nights. See Figure 5 for an example of the results for a subgiant and dwarf G star, as well as a K dwarf.
Ref. [57] found that doubling the exposure time had a very small effect on the overall noise reduction, while taking multiple observations in a given night did help to reduce the stellar noise more significantly. In particular, the authors claim they can remove the p-mode signatures with 15 min exposures and the granulation signature with a total exposure time of 30 min, at least to the ∼50 cm s⁻¹ level of their instrumental precision. For meso- and supergranulation, they argue separating the observations in a given night by 2 and 5 h, respectively, can help to average out these noise sources. Overall, Ref. [57] concluded that the best strategy for most Sun-like stars was 3 observations a night with 10 min exposure times, separated by 2 h, for 10 consecutive days each month that the star is visible, as shown in Figure 5. One advantage of this strategy is the ease with which it can be applied universally to all similar stars; however, it is important to keep in mind that once we can routinely reach a 10 cm s⁻¹ instrumental precision this strategy may need to be adapted. Nonetheless, Ref. [57] predicted they could reach an overall RV precision of ∼30 cm s⁻¹ for a Sun-like star and possibly down to ∼20 cm s⁻¹ for a K dwarf with just 10 days of binning. It is important to note that these predicted precisions are only applicable to the brightest targets, and the requirement for such intense monitoring would shrink the available target list even further.
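Why exposures of 10-15 min suppress the '5-min' oscillations can be seen from an idealized case: a single, coherent sinusoid of period P averaged over an exposure of length T is attenuated by the factor |sin(πT/P)/(πT/P)|. The Python sketch below evaluates this factor; it is a simplification, since real p-modes are stochastic and spread over many frequencies, so the cancellation in practice is incomplete. The function name is hypothetical.

```python
import numpy as np


def residual_fraction(exposure, period):
    """Fraction of a coherent sinusoid's amplitude left after exposure averaging.

    np.sinc(x) is the normalized sinc, sin(pi * x) / (pi * x).
    """
    return abs(np.sinc(exposure / period))


period = 300.0  # s, a nominal 5 min oscillation
for exposure_min in (1.0, 2.5, 5.0, 7.5, 10.0, 15.0):
    frac = residual_fraction(60.0 * exposure_min, period)
    print(f"{exposure_min:5.1f} min exposure -> residual fraction {frac:.3f}")
```

Exposures equal to an integer number of periods cancel the idealized signal completely, while intermediate exposure lengths leave a residual that shrinks as the exposure grows.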
Based on a bootstrapping technique, Ref. [57] calculated the false alarm probability (FAP) for a variety of detection limits [64,65]. They predicted that increasing the number of observations per night to 2 or 3 should push their detection limits down to planets ∼2-3 times as massive as the Earth. In particular, since the K dwarfs have naturally lower amplitudes in convection and oscillations, they argue that for these targets they may be able to detect habitable-zone planets with a mass twice that of the Earth. However, the true habitability and/or structure of planets in this mass regime is not yet clear. Moreover, it is apparent that the elusive Earth-analog would remain beyond the realm of realistic possibilities, at least in the 4 years of observational time considered here.
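A resampling-based FAP can be estimated by comparing the highest periodogram peak of the data with the highest peaks found after scrambling the RVs while keeping the observation times fixed. The Python sketch below is a generic illustration and not the procedure of [57,64,65]: it uses random permutations of the RVs (one common variant of the bootstrap idea), a simple least-squares sinusoid periodogram written out by hand, and entirely synthetic data with an assumed 12.3 d signal. All function names are hypothetical.

```python
import numpy as np


def periodogram(t, y, freqs):
    """Fractional chi-square reduction of a sinusoid fit at each trial frequency."""
    y = y - y.mean()
    chi2_ref = np.sum(y ** 2)
    power = np.empty(freqs.size)
    for k, f in enumerate(freqs):
        arg = 2.0 * np.pi * f * t
        design = np.column_stack([np.cos(arg), np.sin(arg), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        power[k] = 1.0 - np.sum((y - design @ coef) ** 2) / chi2_ref
    return power


def permutation_fap(t, y, freqs, n_trials, rng):
    """Fraction of shuffled data sets whose highest peak matches or exceeds the observed one."""
    observed = periodogram(t, y, freqs).max()
    hits = 0
    for _ in range(n_trials):
        if periodogram(t, rng.permutation(y), freqs).max() >= observed:
            hits += 1
    return hits / n_trials


# Synthetic example: 80 epochs over 100 days, assumed 12.3 d signal plus white noise.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 80))
y = 3.0 * np.sin(2.0 * np.pi * t / 12.3) + rng.normal(0.0, 1.0, t.size)
freqs = np.linspace(0.01, 0.5, 150)  # cycles per day

fap = permutation_fap(t, y, freqs, n_trials=100, rng=rng)
print(f"estimated FAP: {fap:.2f}")
```

With only 100 trials the estimate cannot resolve FAP values below 1%; real analyses use many more resamplings, and correlated stellar noise violates the independence assumed when shuffling.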
More recently, Ref. [66] tested similar noise reduction strategies, but with a different underlying model setup based on the Sun, focusing specifically on solar granulation and supergranulation effects. In this case, the model star is constructed by populating a stellar disc with uprising granules and down-falling intergranular lanes, each with individual velocities. The granule size distributions, lifetimes, birth/death, splitting, and merging were all governed by empirical laws derived from the Sun [67-69]. The velocities of each granule/intergranular lane were based on distributions in the hydrodynamical (HD) solar simulation from [70], which spanned roughly an hour of physical time, with a ∼20 s cadence and had a physical box size of 30 × 30 Mm² in the horizontal direction (and 3.2 Mm in the vertical direction); note that the variations from the p-modes were filtered from the HD simulation so that the noisy model stars only contain RV variations from granulation phenomena. It is also important to note that the HD simulation was constructed only at disc center, but the authors did use the distributions of the vertical and horizontal velocity flows to extrapolate the impact of projection effects across the stellar disc; however, they are likely missing some effects from the corrugated nature of granulation (e.g., granules will obstruct other granules near the limb and some velocity flows will be visible underneath the smaller granules). Additionally, as this work is based on the Sun, the laws describing the granulation properties may be difficult to extrapolate to other spectral types, where the plasma properties differ (e.g., see the simulations in [71-74], wherein hotter stars show longer lived, faster flowing granules, with greater granule-intergranular lane contrasts).
Nonetheless, an advantage of this approach is that it should not be hindered by the current level of instrumental precision in a typical exoplanet-hunting spectrograph.
Unfortunately, despite their large horizontal size the HD simulations did not include supergranulation; this might have been due to the shallow simulation box size, the lack of magnetic fields, or the time-series may have been too short for the supergranules to form [70]. As such, Ref. [66] used empirical relations from the Sun to govern the supergranular properties [75,76]; note, this could prove difficult to extrapolate to other stars, where we have very little knowledge of their supergranulation properties. Once the model star was populated with the granules and/or the supergranules, it was evolved forward in time for 12.5 years (to cover a full magnetic activity cycle) with a cadence of 30 s. For a 30 × 30 Mm² region at disc center, the granulation-induced RVs were on the order of ±∼20 m s⁻¹ (in line with the original HD simulation; note, the authors used 10 realizations of a 69 day period for this analysis, rather than the full 12.5 years simulated, but this should not impact the results since it is still much greater than the granulation timescale); this reduced by an order of magnitude when the full disc integrations were considered, with an rms over the 12.5 years of 0.8 m s⁻¹, an order of magnitude larger than the signal from an Earth-analog.
Using these model stars, Ref. [66] explored the impact of various observational strategies to maximize binning out the solar granulation and/or supergranulation RV variability. They varied the total observational time in a given night from ∼5 min up to just over 8 h, and tested the impact of spreading this time equally over a given 10 h night in 1-5 intervals (e.g., an observing time of 2 h with 4 intervals would mean 30 min exposure times separated by 2 h each in a given night); in this way they examined the impact of a wide range of exposure times. For each combination of observing time and number of observations per night, the authors also tested the impact of binning together 1-10 consecutive nights.
For granulation alone, as seen in Figure 6, Ref. [66] found that even after smoothing the data for 1 h the RV rms was still ∼0.4 m s⁻¹, contradicting the claim from [57] that a 30 min duration is sufficient to bin out this noise source, and highlighting the importance of a modelling technique that does not rely on the previous generation of instrumental precision (and that can concretely discern between different activity sources). This agrees with solar observations that show the granulation structure is still clearly visible even after binning together 1 h of observations, as seen in Figure 7. The physical driver for this is likely that the granules tend to appear and disappear in the same locations. Although the granular velocities in [66] were derived from a purely hydrodynamical simulation, the underlying laws describing the birth, death, and evolution of the granules were derived from empirical solar observations that would naturally include some level of magnetic field; hence the behavior of the magnetic flux could still be governing some of the granule behaviors in the end-state model observations. Hence, the granulation results in a pink-noise-like signature that is not easily averaged out. In agreement with this, Ref. [66] argue that it would require more than an entire night of observational time to bin the granulation noise to the sub-10 cm s⁻¹ level (also shown in Figure 6). In the case of an observing time of 26 min spread over 5 separate measurements in a given night, these authors argue it would require 6-10 consecutive nights of binning to reach the 10 cm s⁻¹ level for granulation alone. Moreover, when injecting a 1 Earth-mass planet into 12 years of data, only planets with orbital periods shorter than 50 days could be found above the FAP, if sampling once every 8 days when the star is visible. Nonetheless, there were times when the long-period planet signals (e.g., 300 and 480 d) were distinctly visible, albeit below the FAP; as such, Ref. [66] argued the LPA (local power analysis) method was better suited to determine detection limits (since this technique compares the strength of the periodogram peak to its surrounding peaks and not to all the peaks in the periodogram [77]), and claim that it should allow long-period, Earth-mass planets to be detectable if the data are sampled anywhere from every 1-20 days over a ∼12 year period.
[Figure text excerpt: For a given timescale, the time series is smoothed using a running mean, and the rms RV of the smoothed series is computed, along with the rms RV of the residuals (RV time series minus the smoothed time series), which shows the signal at periods smaller than the considered scale. With a 1 h smoothing, the rms RV for granulation is still ∼0.4 m/s (a much slower decrease than would be observed for white noise): this is comparable to the noise of current instruments or slightly below, but significantly higher than the noise of future instruments (0.1 m/s or less). It would require more than one night to reach a level of 0.1 m/s. The remainder of the excerpt is truncated in the source.]
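The contrast between correlated noise and white noise under smoothing can be illustrated with a short Python sketch. It compares the rms of a white-noise series and of a correlated series after applying a running mean of increasing width. The correlated series is a first-order autoregressive process, used here purely as an assumed stand-in for time-correlated granulation noise; it is not the granulation model of [66], and the correlation coefficient and window sizes are arbitrary illustrative choices.

```python
import numpy as np


def running_mean(x, window):
    """Boxcar running mean; only fully overlapping windows are kept."""
    return np.convolve(x, np.ones(window) / window, mode="valid")


def ar1_series(n, phi, rng):
    """First-order autoregressive series, normalized to unit rms."""
    eps = rng.normal(size=n)
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + eps[i]
    return x / x.std()


rng = np.random.default_rng(42)
n = 20000
white = rng.normal(size=n)
white /= white.std()
red = ar1_series(n, phi=0.99, rng=rng)  # correlation time of roughly 100 samples

for window in (1, 10, 60):
    print(window,
          round(running_mean(white, window).std(), 3),
          round(running_mean(red, window).std(), 3))

rms_white_60 = running_mean(white, 60).std()
rms_red_60 = running_mean(red, 60).std()
```

The white-noise rms falls roughly as one over the square root of the window length, whereas the correlated series loses little of its rms until the window becomes comparable to its correlation time, which is the qualitative behavior described above for granulation.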
On the other hand, Ref. [66] found supergranulation on its own contributed to an overall rms of ∼0.3-1 m s −1 . Moreover, they found that smoothing the data over a given night, regardless of exposure time or number of observations, did not alter the rms beyond a few cm s −1 ; several days would need to be binned together to significantly beat down this stellar noise. Similar to the small-scale granulation, long-period (>12 d) Earth-mass planets could not be detected above the FAP; this time with even less distinction in the periodogram and therefore they further conclude that, even with the LPA technique, long-period, Earth-mass planets may be undetectable.
When considering the combined signal of both granulation and supergranulation, Ref. [66] argue that the best strategy is to sample each night up to 4 times, but that even after binning together 10 consecutive days of data the rms could still be as high as 0.5 m s −1 (depending on the strength of the supergranulation). They claim the detection limits for Earth-mass planets, regardless of the sampling, were still limited to short orbital periods up to ∼40 days. Hence, even excellent data sampling and over a decade of observational time may still preclude the detection of Earth-mass planets in the habitable zone of a Sun-like star, even for the most magnetically quiet stars. On top of this, intensive monitoring of this caliber, combined with the need for bright targets and a desire for magnetically quiet stars, means there will be strict constraints on the potential observable targets.
[...] that this precision may be overestimated as the current observing strategies have not yet been analysed for their effectiveness in averaging out noise due to the inhibition of convection around magnetically active regions nor the effect of bright active regions.
Although the granular lifetime is 5-10 min, prominent underlying structure is clearly still evident, in both the G-band and continuum images, after averaging over a 1 hr period. It is also interesting to note that the Ca II K image still displays cellular structure from the presence of granulation, and that the regions associated with the magnetic bright points (visible in the G-band) display the most evident structure (though this may just be a byproduct of the fact that magnetic structures are bright in the Ca II K, and therefore easier to discern).
Perhaps bridging the gap between empirically motivated and physically motivated strategies, Ref. [26] argue that it may be possible to isolate stellar lines of varying depths to mitigate convection effects in the observed RVs. The guiding principle in this approach is that stellar line depth is correlated with the net convective blueshift [79,80], with shallower lines having greater blueshift. Ref. [26] argue that there is a linear relationship between the RVs observed with two different sets of absorption lines, with varying depths, and that this relationship can be used to ascertain the contribution of the inhibition of convection from magnetic patches over an activity cycle. Moreover, they argue that this can be used to correct the RVs for the net convective blueshift variation and that doing so also corrects for some of the underlying shorter timescale convection/granulation noise. With this correction, they claim long-period, sub-Earth-mass planets may be detectable with future instrumentation. However, this approach hinges on analyzing the behavior of convection in a moderate to active star over the course of its magnetic activity cycle, wherein the inhibition of convection in plage/facular regions changes significantly; it is not designed to work on very quiet stars or short-term observations. There is also the added complexity that both the net convective blueshift and the overall granulation behavior are linked to the line depth. Deeper lines are primarily formed higher in the photosphere, so they experience convection differently than shallower lines due to a combination of different contrasts and velocity fields, with the deepest lines formed above the convection [81]. Along with this, a line profile corresponding to a granule will be deeper than the same profile formed within an intergranular lane; thus, the total line depth is dependent on the apparent granulation pattern and therefore linked to the granulation-induced RV shifts.
To a lesser extent, the technique from Ref. [26] also relies on the assumption that the contribution from the photometric effect (i.e., dark spots or bright faculae/plage) remains constant among all stellar lines, and that the 'quiet' photosphere does not change over a magnetic cycle-both of which may not necessarily be valid at the cm s −1 level.
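The idea of exploiting two line sets can be illustrated with a deliberately simple toy model. This is not the procedure of Ref. [26]; it only assumes that both line sets carry the same planetary term while the convective term in the shallow lines is a known multiple `k` of that in the deep lines. The function name, the value of `k`, and the input RVs are all hypothetical.

```python
def split_planet_and_convection(rv_shallow, rv_deep, k):
    """Solve, epoch by epoch, the toy system
        rv_deep    = planet + convection
        rv_shallow = planet + k * convection
    for the planet and convection terms (k is an assumed, known ratio)."""
    if k == 1.0:
        raise ValueError("k must differ from 1; otherwise the two line sets are degenerate")
    planet, convection = [], []
    for shallow, deep in zip(rv_shallow, rv_deep):
        conv = (shallow - deep) / (k - 1.0)  # the difference isolates the convective term
        convection.append(conv)
        planet.append(deep - conv)
    return planet, convection


# Hypothetical, noise-free input in m/s.
true_planet = [0.09, -0.05, 0.02]
true_convection = [-1.0, -0.5, -2.0]
k = 1.5  # assumed: shallow lines respond 1.5x more strongly than deep lines
rv_deep = [p + c for p, c in zip(true_planet, true_convection)]
rv_shallow = [p + k * c for p, c in zip(true_planet, true_convection)]
planet, convection = split_planet_and_convection(rv_shallow, rv_deep, k)
```

In real data the ratio would have to be estimated, and the caveats in the text (photometric effects that differ between lines, a changing 'quiet' photosphere) would break the clean separation shown here.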

Physically Motivated Strategies
The previous section shows how incredibly difficult it may be to average out the combination of stellar oscillations and granulation to a level sufficient for Earth-analog detection. If even a decade of observations may not be sufficient to bin out these noise sources, then it is clear we need to focus our approach. As such, an exploration of the underlying physics of each noise source may be key to discerning a strategy with potential to reach the 10 cm s −1 level necessary for an Earth-twin scenario. Consequently, this section is subdivided into oscillation and convection noise reduction techniques.

Oscillations
In a recent work, Ref. [82] isolate the impact of stellar surface oscillations and explore how fine-tuning the exposure time to the host star parameters may help significantly reduce this signal in high-precision RVs. The motivation for treating the oscillation impact separately from the convection lies in their inherent frequency structure. For example, most of the power for granulation phenomena resides in a much lower frequency regime and therefore a simple low-pass filter (e.g., from a finite exposure duration) will not be sufficient to fully remove this noise source. This is one of the underlying physical drivers for why Refs. [57,66] found increasing the exposure time had little impact on reducing the RV variability in their simulations (which included granulation). Moreover, this is fundamentally different to stellar oscillations, which give rise to modes in a relatively narrow passband at slightly higher frequencies, making them amenable to low-pass filtering, even for a variety of spectral types and evolutionary states.
To simulate their model observations, Ref. [82] constructed p-mode oscillation spectra to match the Sun and other solar-type stars. In particular, they calculated radial and non-radial mode amplitudes and relative visibilities as would be expected from typical exoplanet-hunting spectrographs (this is in marked contrast to the solar, Sun-as-a-star, instruments BISON and GOLF that observe in narrow passbands; [83-85]). Stars show many detectable overtones of solar-like oscillations, with the most prominent modes centered on ν max .
Once each simulated oscillation spectrum is calculated, Ref. [82] multiply this in the frequency domain by the transfer function representative of a given exposure length and examine the remaining/residual amplitudes. For an exposure of a finite duration, the transfer function can be represented by the well-known boxcar filter ([86], and references therein). Figure 8 shows an example of the residual amplitudes after a variety of exposure times have been applied to solar oscillations.
Two main conclusions can be drawn from the behavior observed in Figure 8. First, despite their larger amplitude (in comparison with granulation), for a Sun-like star, the oscillations should be easily averaged out to the 10 cm s −1 level or better. Second, the behavior of the residual amplitudes is tied to the stellar parameters at a level significant for the next generation of spectrographs; simply lengthening the exposure duration does not always result in a significant noise reduction, and sometimes may even increase the noise level compared to shorter durations. From the solar example in Figure 8, we can see that an exposure of ∼5.4 min (equal to 1/ν max ) would result in oscillation noise on the 10 cm s −1 level, but increasing the exposure to ∼8 min actually doubles that noise level, and further increasing to ∼16.5 min only reduces the noise by ∼1 cm s −1 .
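The non-monotonic behavior described above follows from the shape of the boxcar transfer function, which has nulls at exposure lengths equal to integer multiples of the oscillation period. The sketch below is a toy illustration, not the calculation of Ref. [82]: it assumes equally spaced modes of arbitrary amplitude under a Gaussian envelope centered on ν max, combined in quadrature. The mode spacing, envelope width, and function names are placeholders; only the value of 1/ν max (∼5.4 min) is taken from the text.

```python
import math


def boxcar_transfer(freq_hz, exposure_s):
    """Amplitude response of a finite exposure (boxcar) at one frequency:
    |sin(pi * nu * T) / (pi * nu * T)|."""
    x = math.pi * freq_hz * exposure_s
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


def toy_mode_spectrum(nu_max_hz, spacing_hz=67.5e-6, envelope_fwhm_hz=1000e-6, n_side=15):
    """Return (frequency, amplitude) pairs for equally spaced modes under a
    Gaussian envelope centered on nu_max. All defaults are placeholders."""
    sigma = envelope_fwhm_hz / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    modes = []
    for i in range(-n_side, n_side + 1):
        offset = i * spacing_hz
        modes.append((nu_max_hz + offset, math.exp(-0.5 * (offset / sigma) ** 2)))
    return modes


def residual_amplitude(modes, exposure_s):
    """Quadrature sum of the mode amplitudes that survive the exposure."""
    return math.sqrt(sum((amp * boxcar_transfer(nu, exposure_s)) ** 2 for nu, amp in modes))


NU_MAX_SUN_HZ = 1.0 / (5.4 * 60.0)  # from the ~5.4 min (1/nu_max) quoted in the text
modes = toy_mode_spectrum(NU_MAX_SUN_HZ)
unfiltered = residual_amplitude(modes, 0.0)
for minutes in (5.4, 8.0, 16.5):
    fraction = residual_amplitude(modes, minutes * 60.0) / unfiltered
    print(f"{minutes:5.1f} min exposure: residual fraction {fraction:.3f}")
```

In this toy setup the residual at ∼8 min is larger than at ∼5.4 min, qualitatively echoing the solar example; the absolute values carry no physical meaning because the amplitudes are arbitrary.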
As stated above, since the stellar oscillations are tied to the host star parameters, so too are the exposure lengths necessary to bin them down to levels suitable for hunting low-mass, long-period planets. Figure 9 shows the exposure times that [82] argue are required to reduce the oscillation-induced noise to a level sufficient for the detection/confirmation of an Earth-mass planet in the habitable zone of stars with varying effective temperatures, surface gravities, and luminosities. These results can be replicated and/or fine-tuned to particular stars using the publicly available python code OscFilter, which can be downloaded from: https://github.com/grd349/ChaplinFilter. Stars with lower mass and cooler effective temperatures, such as K dwarfs, may only require a few minutes to sufficiently reduce their oscillation noise, while hotter and/or more evolved stars may need integration times much greater than 100 min to reach detection levels for Earth-mass planets in the habitable zone. For evolved stars that require long integration times, it may be more efficient to finely tune the observing strategy such that multiple observations in a given night are optimally spaced to beat down the oscillation noise [87]. As the next generation of spectrographs continues to come online, it is clear that exposure times should be tailored to the host star parameters to achieve optimal RV precisions without wasting precious telescope time. Moreover, if treated properly, stellar oscillations should not be the limiting factor in the future search for Earth-twins.
Figure 5 from [82], showing the exposure lengths required to reduce oscillation noise to a level sufficient for the detection of an Earth-mass planet in the habitable zone of stars along various evolutionary tracks, from 0.7-1.5 M⊙.
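To get a feel for how the oscillation timescale shifts with stellar parameters, one can use the widely used asteroseismic scaling relation ν max ∝ g / √T eff. The sketch below is not the OscFilter code and does not reproduce the exposure times of Ref. [82]; it only estimates 1/ν max, the first null of the boxcar filter at the envelope center. The solar reference values and the example stars are assumptions.

```python
# Assumed solar reference values (commonly adopted; adjust as needed).
NU_MAX_SUN_UHZ = 3090.0
TEFF_SUN_K = 5777.0
LOGG_SUN_CGS = 4.44


def nu_max_uhz(logg_cgs, teff_k):
    """Scaling relation: nu_max proportional to g / sqrt(Teff), in microhertz."""
    g_ratio = 10.0 ** (logg_cgs - LOGG_SUN_CGS)
    return NU_MAX_SUN_UHZ * g_ratio * (teff_k / TEFF_SUN_K) ** -0.5


def first_null_exposure_min(logg_cgs, teff_k):
    """Exposure length equal to 1/nu_max, in minutes."""
    return 1.0 / (nu_max_uhz(logg_cgs, teff_k) * 1e-6) / 60.0


# Hypothetical example stars: (label, log g [cgs], Teff [K]).
for label, logg, teff in [("Sun", 4.44, 5777.0),
                          ("K dwarf", 4.60, 4800.0),
                          ("subgiant", 3.90, 5800.0)]:
    print(f"{label:9s} 1/nu_max = {first_null_exposure_min(logg, teff):5.1f} min")
```

The ordering (K dwarf shortest, subgiant longest) matches the qualitative trend described in the text, but the exposure needed to reach a given noise level also depends on the mode amplitudes, so these numbers should be read only as characteristic timescales.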

Magnetoconvection
In contrast to [82], the series by [35,88-90] focuses on the impact of magnetoconvection alone; their hypothesis is that the convection-induced line profile asymmetries responsible for this noise source can also be used to diagnose and disentangle it from planetary signals. State-of-the-art three-dimensional magnetohydrodynamic (MHD) simulations form the backbone of the model Sun-as-a-star observations in [90]; similar to [66], the simulation box size is too small to permit supergranulation, so the authors focus on the small-scale granulation. These 3D MHD simulations are coupled with 1D radiative transport to synthesize an absorption line profile for each granulation snapshot. To overcome the computational demands required to create numerous observations, tiled with realistic, independent granulation patterns, the simulation output was parameterized; this was done both at disc center [88] and across the stellar disc [35].
The underlying 3D MHD simulation was produced using the MURaM code [91]; it spans ∼100 min of physical time, with a cadence close to 30 s. The physical box size was 12 × 12 Mm 2 (see Figure 2), with a depth of 2 Mm. The diagnostics from the MHD simulation were fed into the radiative transport code NICOLE [92,93] to synthesize line profiles of the Fe I 6302 Å line. The magnetoconvection in the 3D simulation naturally excites oscillations that these authors remove when parameterizing the signal from the granulation. The average magnetic field strength is 200 G, chosen to be sufficient to properly characterize the magnetic components of granulation, without strongly altering the convection characteristics (e.g., as seen in starspots).
Each pixel in a given snapshot has an individual absorption line profile, with a variety of photospheric plasma parameters. Since granules make up most of the surface area, and are naturally both bright and non-magnetic, these authors use cuts in magnetic field and continuum intensity to separate the different physical components of granulation. This creates four categories: granules (non-magnetic and bright), (dark) non-magnetic and magnetic intergranular lanes, and magnetic bright points (MBPs; bright because the intense magnetic field has evacuated the flux tubes and allows the observer to see deeper into the hotter, brighter photosphere). Since the granulation makes the stellar surface corrugated (as discussed in Section 2, and shown in Figures 2 and 7), this parameterization is performed at multiple center-to-limb angles [35]. All profiles of a given category are binned together to create four time-average components for each limb angle; note, it is creating these time-averages that kills the oscillation signature. Then the probability distributions of the component filling factors can be used in conjunction with the four time-average component profiles to generate new line profiles, with the same fundamental convection characteristics as the computationally intensive 3D radiative MHD simulation [89,90].
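The reconstruction step can be sketched as a weighted sum of four component profiles, with the weights drawn from filling-factor distributions. The example below is a schematic illustration only: the Gaussian component profiles, their parameters, the mean filling factors, and the way the filling factors are drawn are all invented placeholders, not values or distributions from [89,90].

```python
import math
import random

VELOCITIES_KMS = [-10.0 + 0.1 * i for i in range(201)]

# Hypothetical component parameters:
# (continuum intensity, fractional line depth, Doppler shift [km/s], Gaussian width [km/s])
COMPONENTS = {
    "granule": (1.05, 0.60, -0.5, 2.0),
    "nonmagnetic_lane": (0.85, 0.50, 1.0, 2.5),
    "magnetic_lane": (0.80, 0.40, 0.8, 2.5),
    "mbp": (1.10, 0.30, 0.3, 3.0),
}
# Hypothetical mean filling factors (they sum to 1).
MEAN_FILLING = {"granule": 0.70, "nonmagnetic_lane": 0.20, "magnetic_lane": 0.08, "mbp": 0.02}


def component_profile(name):
    """Time-averaged profile of one component (here simply a Gaussian absorption line)."""
    continuum, depth, shift, width = COMPONENTS[name]
    return [continuum * (1.0 - depth * math.exp(-0.5 * ((v - shift) / width) ** 2))
            for v in VELOCITIES_KMS]


def draw_filling_factors(rng, scatter=0.02):
    """Perturb the mean filling factors and renormalize so they sum to 1."""
    raw = {name: max(0.0, mean + rng.gauss(0.0, scatter)) for name, mean in MEAN_FILLING.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def synthesize_profile(filling):
    """Filling-factor-weighted sum of the component profiles."""
    profile = [0.0] * len(VELOCITIES_KMS)
    for name, fraction in filling.items():
        for i, intensity in enumerate(component_profile(name)):
            profile[i] += fraction * intensity
    return profile


def line_centroid_kms(profile):
    """Depth-weighted mean velocity, used here as a crude RV proxy."""
    continuum = max(profile)
    depths = [continuum - value for value in profile]
    return sum(v * d for v, d in zip(VELOCITIES_KMS, depths)) / sum(depths)


rng = random.Random(0)
for _ in range(3):
    filling = draw_filling_factors(rng)
    centroid = line_centroid_kms(synthesize_profile(filling))
    print(f"granule fraction {filling['granule']:.3f} -> centroid {centroid:+.3f} km/s")
```

Each call to `draw_filling_factors` stands in for one independent granulation realization; the expensive MHD and radiative-transfer steps are only needed once, to build the component profiles and filling-factor statistics.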
Figure 10. The (mean-subtracted) bisector curvature (left) and normalized brightness proxy (right) as a function of granulation-induced RV for the Fe I 6302 Å line from the model star simulations of [90]; the brightness was approximated by integrating the area under the line profile. A strong linear correlation is clearly discernible but would require the next generation of spectrographs and space-based photometric missions to be empirically observed.
Using this parameterization, Ref. [90] can create a stellar grid with each tile having an independent realization of granulation and integrate over the disc to mimic stellar observations. As such, Ref. [90] created 1000 Sun-as-a-star model observations, and searched for correlations between the line profile shape and the net RV. Each model observation was assumed to have been separated by at least one granulation turnover, such that each instance is independent. Each observation also represented an instantaneous moment in time; this ignores any averaging over a finite exposure length, but since granulation patterns are known to be clearly discernible over long exposures (e.g., see Figure 7) this is unlikely to severely impact the results. They found many diagnostics derived from the line profile shape correlated strongly with the convective-induced RV shifts. It is important to note that the total granulation-induced RV rms for these model stars is only ∼10 cm s −1 , which is likely 3-4 times lower than the expected variations from solar observations [94,95]. This lower variability could originate from several sources: the average magnetic field strength is slightly higher than the quiet Sun, each tile is independent, each observation is instantaneous and independent, the time-series from the MHD could be under-sampled, etc. For these reasons, Ref. [90] quote the fractional reduction in the RV rms as this should give an idea of how much telescope time could be saved by not needing to average as heavily and should scale with the true RV rms if the fundamental physics is correct. For example, a 50% reduction in the RV rms could mean four times less telescope time would be required to reach the same precision level; hence, even moderate noise reductions can have a significant impact.
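The factor of four in the 50% example follows if the remaining noise averages down as 1/√N over N independent observations, so that the number of observations needed scales with the square of the per-observation rms. A minimal sketch under that white-noise assumption (which, as discussed earlier, granulation only approximately satisfies); the function name is hypothetical:

```python
def telescope_time_factor(fractional_rms_reduction):
    """Factor by which the required number of independent observations drops
    when the per-observation rms is reduced by the given fraction, assuming
    the noise bins down as 1/sqrt(N)."""
    if not 0.0 <= fractional_rms_reduction < 1.0:
        raise ValueError("fractional_rms_reduction must lie in [0, 1)")
    return 1.0 / (1.0 - fractional_rms_reduction) ** 2


for reduction in (0.50, 0.55, 0.60):
    print(f"{reduction:.0%} rms reduction -> {telescope_time_factor(reduction):.2f}x less telescope time")
```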
In particular, Ref. [90] found measurements of the bisector shape [96-98] varied linearly with the induced RV, see Figure 10, and that removing this correlation could reduce the RV rms by ∼50% or more; note, if dividing the bisector into two or three segments (as done in the bisector span and bisector curvature) these ranges had to be fine-tuned to the particular 'C'-shape of the observed bisector for this stellar line. Diagnostics that used information from the entire line profile had some of the strongest correlations (enabling noise reduction of ∼55-60%) and were the most robust against the impact of instrumental resolution (a decrease in resolution smooths out the asymmetries in the observed line profile); these included the equivalent width and V asy , which compares the profile gradient of the red wing to the blue wing [99,100]. In addition, they integrated the area under the line profile and used this as a proxy for photometric brightness; in contrast to [66], they found a strong correlation, with the largest blueshifts occurring when the model star was brightest (also shown in Figure 10). On one hand, such a correlation is expected as a larger granule filling factor should lead to a brighter star, with a larger net blueshift. On the other hand, Ref. [66] argue that such a correlation is broken by the stochastic nature of the granular evolution and the noisy relationship between observed granule size, velocity, and photometric properties. However, Ref. [90] argue that the empirical relationships used to create the model stars in [66] could potentially appear noisier than reality due to instrumental errors that are avoided when using a pure MHD-based background; that said, further analysis is required to determine the true nature of this relationship. If confirmed, then space-based simultaneous photometry could provide a very promising avenue to disentangle granulation-induced RV variability.
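Removing a linear correlation between a line-shape diagnostic and the RV amounts to fitting a straight line and subtracting it. The sketch below uses synthetic data rather than the simulations of [90]: the noise levels, the random seed, and the generic "diagnostic" are assumptions chosen only to illustrate the bookkeeping.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def rms(values):
    """Root-mean-square scatter about the mean."""
    centre = mean(values)
    return math.sqrt(sum((v - centre) ** 2 for v in values) / len(values))


def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def decorrelate(rv, diagnostic):
    """Subtract the least-squares straight-line fit of rv against the diagnostic."""
    md, mr = mean(diagnostic), mean(rv)
    slope = (sum((d - md) * (r - mr) for d, r in zip(diagnostic, rv))
             / sum((d - md) ** 2 for d in diagnostic))
    return [r - mr - slope * (d - md) for r, d in zip(rv, diagnostic)]


# Synthetic stand-in data: 1000 independent "observations".
rng = random.Random(42)
rv = [rng.gauss(0.0, 0.10) for _ in range(1000)]            # m/s, assumed granulation scatter
diagnostic = [value + rng.gauss(0.0, 0.06) for value in rv]  # assumed noisy linear tracer of the RV

residuals = decorrelate(rv, diagnostic)
reduction = 1.0 - rms(residuals) / rms(rv)
r = pearson_r(rv, diagnostic)
print(f"Pearson r = {r:.2f}, fractional rms reduction = {reduction:.0%}")
```

For a purely linear relation the fractional rms reduction equals 1 − √(1 − r²), so a 50% reduction corresponds to a correlation coefficient of |r| ≈ 0.87.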
Of course, it is important to keep in mind that [90] only examined the effects on one stellar line profile, for one spectral type, assumed a constant average magnetic field for each tile on their model star, and ignored other effects such as supergranulation and oscillations. Regardless, this shows a clear indication of the potential power of the information contained in the spectral lines, and how we can use this to diagnose and disentangle stellar variability in an efficient manner.

Towards the Future
It is clear that magnetically active stars produce astrophysical noise much greater than the 0.5 m s −1 precision offered by the current world-leading spectrographs, and therefore that this noise must be removed for these instruments to reach their true potential. However, even for magnetically active stars, an understanding of magnetoconvection is key to disentangling active region signatures from planet-induced shifts, since it is the inhibition of convection that drives these noise sources. Moreover, even the most magnetically quiet Sun-like stars will still have a convective envelope and therefore exhibit RV shifts on the 0.1-1 m s −1 level from the granulation and p-mode oscillations. As a result, the next generation of spectrographs coming online now (e.g., ESPRESSO), promising precisions of 10 cm s −1 , will demand an understanding of these low-amplitude noise sources to deliver their full capability. This is particularly critical for the search for life in the universe, as the only place we know for sure that can harbor life is the Earth, which induces a signal with a mere ∼9 cm s −1 amplitude.
Previous studies have shown us that even the most optimal observing strategies may not be able to find such Earth-analogs in a typical survey duration of ∼4 years [57]. In fact, it could take more than a decade of observations to reach such detection thresholds with current strategies [66], and even then, such confirmations may be debatable if the planet signal falls below the FAP. However, a detailed understanding of the underlying stellar surface phenomena may allow us to tease out such minute planetary signals from among the larger stellar signals. In particular, fine-tuning the exposure lengths to the stellar parameters (i.e., surface gravity, luminosity, and effective temperature) may allow us to remove the oscillation-induced noise to levels sufficient for Earth-mass planet detections in the habitable zones of Sun-like stars [82]. Moreover, even though it may not be possible to efficiently bin out the granulation noise, Ref. [90] have shown that it may be possible to disentangle it by analyzing its imprint on the stellar absorption line shapes; even if such techniques cannot remove all of the granulation noise, they may help significantly reduce the overall observation time necessary to confirm Earth-like planets-making such confirmations more observationally feasible. Nonetheless, it is important to note that while the results of [90] promise great potential for granulation noise mitigation, they need to be expanded to a variety of stellar lines, magnetic field strengths, and spectral types etc. Additionally, if the convection cannot be completely disentangled in the RVs, it may be important for detection techniques to try to differentiate this colored-noise from planetary signals-e.g., see [101,102] for how we may be able to use HD/MHD simulations to train and standardize periodograms to counteract the convection noise and appropriately attribute false alarm and planet detection probabilities amid this noise.
Naturally, such theoretically motivated noise reduction approaches need to be validated empirically with both solar and next generation stellar spectrographs. Consequently, it is a combination of empirical and physically motivated strategies that is likely the key to overcoming the barriers of astrophysical noise. This last aspect may be particularly pertinent at the 10 cm s −1 regime, which will open the door to many other low-amplitude noise sources (e.g., meridional flows [103,104] or variable gravitational redshift [105]), and may bring with it a host of unforeseen stellar noise sources. Thus, it is clear from the current literature that an understanding of the stellar hosts is crucial to the future confirmation and characterization of long-period (habitable-zone), Earth-mass planets.

Funding:
The author acknowledges financial support from the National Centre for Competence in Research (NCCR) PlanetS, supported by the Swiss National Science Foundation (SNSF).