Article

Policy Shocks and Public Attention to Digital Tax in Greece: Event-Study and Nowcasting with Google Trends Time Series

by
Stefanos Balaskas
eGovernment & eCommerce Lab (Innovation & Entrepreneurship), Department of Business Administration, University of Patras, 26504 Patras, Greece
Account. Audit. 2026, 2(2), 6; https://doi.org/10.3390/accountaudit2020006
Submission received: 1 December 2025 / Revised: 13 March 2026 / Accepted: 26 March 2026 / Published: 2 April 2026

Abstract

Digital tax reforms are implemented through staged, publicly announced milestones, yet policymakers rarely have timely indicators of whether these signals mobilize information-seeking and whether such demand can be anticipated for operational planning. We analyze monthly Google Trends series for Greece’s myDATA/e-invoicing rollout (2016–present) using preregistered event study models that separate step changes from post-event trend shifts with HAC-robust inference, and we evaluate 1–3-month predictive performance via rolling-origin cross-validation against a seasonal-naïve benchmark. Search-based attention shifts appeared most clearly in application-related queries: invoicing app terms spike around visible rollout phases (≈+34 to +38 index points over six months) and decline around VAT–myDATA alignment (≈−34 to −43). Ecosystem attention (the “Electronic invoicing” topic) exhibits large, opposite-signed movements (≈−53 around public-sector expansion; ≈+46 around VAT alignment), whereas platform terms show smaller and less regular responses; a back-office milestone produces no detectable change. In out-of-sample tests, event-aware regressions improve short-horizon accuracy for platform terms (≈40–50% MAE reduction at one month; ≈18–32% at two to three months), with series- and horizon-dependent results elsewhere. Overall, the evidence supports using search activity as an intermediate planning signal—informative about when and where guidance demand concentrates but not evidence of compliance.

1. Introduction

Digitalization of tax administration promises efficiency and transparency, but success depends on widespread user take-up and sustained use [1,2,3]. Governments are therefore expanding e-invoicing and electronic book-keeping; in Greece, the Independent Authority for Public Revenue (AADE) rolled out myDATA as national digital tax-reporting infrastructure. For accounting practice, such systems do not merely digitize communication with the tax authority: they reconfigure routine processes of invoice issuance, transaction recording, and periodic reporting, with downstream implications for reporting timeliness, error rates, and the evidentiary trail that supports audit and enforcement. Implementation is often constrained by information gaps, uneven readiness across firms, and operational strain around deadlines. Evidence from the UK’s Making Tax Digital suggests that early awareness can be limited—surveys found that many small firms initially did not know about new digital record-keeping obligations [4,5,6,7]. When knowledge and training lag behind rollout schedules, agencies face late rush-to-compliance dynamics and spikes in support demand, conditions under which misconfiguration and reporting errors become more likely [4,5,6,7,8].
Public attention is therefore not just a by-product of implementation but a mechanism shaping adoption and compliance outcomes. Behavioral public administration and applied economics show that salience and timely communications can alter citizen responses and uptake, including via low-cost informational interventions [9,10,11,12]. Field evidence suggests that reminders and peer-use cues can substantially increase adoption among initially reluctant users [13,14,15,16,17,18]. However, showing that “announcements matter” is not sufficient: for practice and theory, the key questions are which rollout milestones shift attention, whether changes are abrupt versus gradual, and whether attention differs across parts of a digital compliance ecosystem [18,19,20].
Digital trace data provides a practical way to observe these dynamics at scale. Google Trends has been widely used as a proxy for public awareness and issue salience [20,21], and search activity often responds strongly to pre-announced deadlines and policy shocks [21,22,23,24]. In the myDATA context, the goal is not merely to document spikes but to interpret attention as a measurable intermediate signal in the accounting workflow: increased searching plausibly reflects information acquisition and task initiation (e.g., registration, software choice, configuration, issuance rules, submission procedures) that precede—though do not guarantee—changes in actual reporting behavior and compliance quality. This perspective matters for accounting and auditing because attention peaks can coincide with periods of heightened process change and learning, when the risk of late reporting, corrections, or inconsistent record-keeping may increase.
A second substantive question is what users attend to when interest rises. myDATA is an ecosystem comprising the AADE portal, the “Timologio” application, third-party solutions, and broader e-invoicing compliance concepts. We therefore expect distinct “families” of search terms—platform/authority, application/tool, and ecosystem/compliance—to respond differently to the same milestone, revealing where information needs and friction concentrate [25,26].
This paper targets a gap at the intersection of digital tax infrastructure, accounting process change, and forecasting with digital traces. Prior e-government work largely studies adoption drivers and interventions, while time series research shows that Google Trends can track—and sometimes help predict—collective information demand. These strands are rarely integrated in a staged national rollout where bookkeeping routines, invoicing workflows, and compliance documentation are being reconfigured in real time.
We contribute a preregistered, fully reproducible event study of Greece’s myDATA/e-invoicing rollout that (i) estimates attention shifts around prespecified milestones using an interpretable step/ramp coding with HAC-robust inference and false-discovery control; (ii) compares responses across query families that map to distinct interfaces of the ecosystem (platform, app, and broader e-invoicing); and (iii) evaluates whether policy calendar features improve 1–3-month nowcasts relative to seasonality-only benchmarks, with bounded operational relevance for timing guidance and sizing support capacity. Crucially, search attention is treated as an intermediate signal of information frictions, not a policy success outcome: we do not observe filing behavior, platform usage, or audit findings. Instead, we use attention dynamics to indicate when and where implementation pressures are most likely to surface, motivating linkage to administrative metrics in future work.

2. Literature Review and Related Work

The first question is why search attention is worth studying. When governments introduce technical compliance requirements, affected stakeholders often seek guidance immediately (via accountants, vendors, professional networks, and the web), making search activity a timely proxy for information demand [14,15]. Although searches are not compliance, they can function as an operational leading signal: spikes in queries about deadlines or procedures often precede surges in helpdesk contacts, onboarding activity, and “last-minute” scrambling that appear later in administrative data. Because attention is typically brief and volatile, monitoring searches offers a practical way to assess whether milestone communications are reaching the public during the narrow window when guidance is most likely to be absorbed [23].
This attention perspective aligns with agenda-setting and issue-attention theories. Political communication research emphasizes that media and institutional cues shape what people attend to—even if they do not determine what people believe [27,28,29,30]. Yet salience rarely persists: Downs’ “issue-attention cycle” proposes that public interest rises sharply and then fades as novelty dissipates and costs of sustained engagement emerge [31]. In digital government contexts, announcements, deadlines, and mandate phases can play a similar role to “news events,” briefly redirecting limited attention toward action-proximal questions. Our study adopts this lens by treating prespecified myDATA milestones (e.g., go-live, phased mandates, harmonization steps) as salient cues and testing whether they generate short-run bursts and medium-run shifts in information-seeking [32].
A large cross-disciplinary literature supports the use of search data as an economic and behavioral indicator, particularly for short-horizon monitoring and prediction [33,34,35]. Incorporating Google Trends has improved nowcasts of diverse outcomes, and classic work shows that informative search frequencies can enhance prediction of contemporaneous indicators beyond models using only lagged official data [21]. Search activity can also lead behavior: web queries have been shown to forecast consumer demand in several settings, consistent with “revealed interest” preceding action [24,25,26,27,28]. At the same time, well-known failures such as Google Flu Trends highlight the risks of over-interpretation: media amplification, shifting search behavior, and model instability can generate spurious signals or exaggerated effects [13]. For this reason, our design emphasizes preregistration, prespecified event timing and functional forms, and triangulation across multiple query families to reduce sensitivity to idiosyncratic spikes [36].
Our substantive context overlaps with the growing literature on VAT digitization and mandatory e-invoicing systems (e.g., Italy’s SdI, SAF-T implementations, the UK’s Making Tax Digital), which primarily evaluates compliance, revenue, and productivity effects [18]. This work often documents positive fiscal impacts and improved record-keeping, alongside transition costs. However, it typically provides limited evidence on the pre-compliance stage—public awareness, information gaps, and communication dynamics—despite their importance for successful adoption. We contribute by focusing on this earlier mechanism, i.e., whether and when target populations exhibit measurable information demand around rollout milestones, as a complement to downstream outcome evaluations.
Methodologically, we draw on interrupted time series and event study approaches, where interventions are modeled as level shifts and/or changes in slope [26]. This distinction parallels “pulse” versus “carryover” reasoning in applied settings: some events plausibly induce abrupt jumps in attention (e.g., go-live announcements), whereas others alter trajectories more gradually (e.g., harmonization steps or phased expansion). Our contribution is to apply this logic in a preregistered manner—coding the timing and shape of interventions ex ante rather than searching for breaks post hoc—thereby limiting researcher degrees of freedom and supporting interpretable estimates of medium-run impacts [37].
A practical motivation is whether these event-conditioned signals add value for short-horizon planning. Forecasting research emphasizes that simple seasonal baselines are difficult to beat at 1–3-month horizons and that improvements are often modest unless structure is well specified [7,23]. We therefore prioritize interpretability: we compare a seasonal-naïve benchmark to transparent regression-based models that incorporate prespecified policy timing as step/ramp indicators, yielding directly communicable predictions in Google Trends units [4]. Auxiliary automated methods are treated as robustness checks and are separated from the main analysis.
Overall, this study contributes to research at the intersection of public attention and digital tax infrastructure in four ways. First, it provides a preregistered event study of search attention around a national e-invoicing rollout (Greece’s myDATA) using a prespecified set of milestones. Second, it quantifies effects in interpretable units (SVI points) and contrasts patterns across platform, app, and ecosystem query families. Third, it evaluates whether incorporating policy calendars improves short-horizon nowcasts relative to a seasonal baseline, addressing an operational question relevant to support planning. Fourth, it emphasizes transparency and reproducibility by releasing code, queries, and analysis outputs to facilitate replication and extension.

Background: myDATA and E-Invoicing in Greece

Greece’s myDATA (“my Digital Accounting & Tax Application”) is AADE’s digital tax-reporting infrastructure that operationalizes electronic bookkeeping (“e-books”) and supports e-invoicing. The reform aims to standardize and automate reporting, increase transparency, and reduce evasion by providing businesses and intermediaries (accountants, software providers) with a unified reporting interface. Because adoption unfolded through staged mandates and technical harmonization, information demand is expected to arrive in waves—rising around visible deadlines and mandate expansions and receding as workflows routinize.
We therefore preregister six rollout milestones that plausibly shift information demand and/or use:
(1) myDATA production go-live (2021-10) [8];
(2) B2G Phase 1 (2023-09), initiating compulsory e-invoicing for central government bodies [15];
(3) Central administration full coverage (2024-01), modeled as a step-change [26];
(4) VAT–myDATA alignment (2024-01), modeled as a slope-change reflecting gradual workflow convergence [25];
(5) B2G extension to the rest of the public sector (2024-06) [25];
(6) EU authorization for a domestic B2B mandate (2025-03) [16].
These are the public “beats” of implementation (press releases, circulars, deadlines) that should generate detectable shifts in information-seeking if they are salient.
To track attention, we use Google search interest and group queries into three families that map onto distinct points of interaction with the reform: platform terms (A; AADE/myDATA access points, e.g., aade, ααδε), application terms (C; invoicing app queries, e.g., timologio/τιμολόγιο), and ecosystem terms (D; broader e-invoicing topics/standards, e.g., the “Electronic invoicing” topic). This structure allows us to test not only whether attention moves at milestones but where it concentrates—on the official platform, the front-end invoicing tool, or the wider compliance ecosystem. Guided by this context, we ask the following:
RQ1: Do pre-specified myDATA- and e-invoicing-related events (such as announcements, deadlines, or system updates) cause discernible shifts in public attention as measured by Google search trends?
RQ2: Can an event-aware structural model (one that includes features for these communications or policy events) improve short-horizon nowcasts of public interest compared to baseline models that capture only regular seasonal patterns?
RQ3: Which families of search terms show the most significant movements in response to the events? For example, are people searching more about the official platform and its use, the companion app, or the broader ecosystem and compliance requirements?
Figure 1 summarizes the implementation timeline from 2016 to the latest month and marks the six preregistered milestones; the two January 2024 events are shown as distinct step versus slope interventions to avoid conflation and to provide a consistent reference for the results figures.

3. Data and Variable Construction

3.1. Data Source, Scope, Search Terms and Families

We use the monthly Google Trends (GT) search volume index (SVI) for Greece (geo = GR), which reports relative search interest on a 0–100 scale. The main analysis window is 2016-01 through the latest available month, ensuring a long pre-period before the first milestone and a balanced post-period across subsequent events; the full 2004+ history is used only for prespecified robustness checks (e.g., alternative seasonality and placebo tests). Following GT conventions, “<1” values are recoded to 0.5 to retain low-level variation and avoid undefined log transforms in robustness analyses. We retain the native 0–100 scale (rather than standardizing) because effects are directly interpretable in GT points and the bounded scale limits extreme influence [4,9]. For long-horizon downloads, we apply a standard stitching procedure: overlapping monthly windows are rescaled using median overlap ratios and then concatenated to preserve within-country relative levels. All code and query specifications are preregistered and released with the replication package. We include “Electronic invoicing (topic)” (D1) because it captures broad ecosystem-level attention beyond any single spelling/term. Since topic series can behave differently than term series, we report a sensitivity re-estimation using the closest term-based alternatives (e.g., “e-invoicing”) and show whether D-family conclusions are stable.
  • A (platform): brand/institutional terms linked to AADE/myDATA (incl. aade, ααδε, mydata).
  • C (app): application/“timologio” terms reflecting invoicing app searches (incl. τιμολόγιο, timologio, ηλεκτρονικό τιμολόγιο).
  • D (ecosystem): broad/e-invoicing topics and standards (incl. Electronic invoicing (topic), e-invoicing, peppol).
Greek diacritics and script variants are kept constant by term; near-duplicates (e.g., monotonic vs. polytonic variants) are addressed by choosing a single canonical query. Five priority outcomes, chosen for their salience, are used for reporting and forecasting: aade (A2), ααδε (A3), τιμολόγιο (C1), timologio (C2), and Electronic invoicing (topic) (D1). The remaining terms feed the robustness analyses and a composite index.
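The “<1” recoding and the median-overlap stitching described in this subsection can be illustrated with a minimal sketch. It is not the released replication code: the function names, the dictionary layout keyed by 'YYYY-MM', and the toy values are assumptions made for illustration.

```python
from statistics import median


def recode_lt1(values):
    """Map Google Trends '<1' entries to 0.5; convert everything else to float."""
    return [0.5 if v == "<1" else float(v) for v in values]


def stitch(base, nxt):
    """Rescale window `nxt` onto the scale of `base` and append its new months.

    Both arguments are dicts mapping 'YYYY-MM' to SVI values (hypothetical layout).
    The scaling factor is the median of base/nxt ratios over the shared months.
    """
    shared = [m for m in base if m in nxt and nxt[m] > 0]
    if not shared:
        raise ValueError("windows do not overlap on any non-zero month")
    ratio = median(base[m] / nxt[m] for m in shared)
    stitched = dict(base)
    for month, value in nxt.items():
        if month not in stitched:  # keep the base values on the overlap
            stitched[month] = value * ratio
    return stitched


# Toy windows (illustrative numbers, not real Google Trends data).
w1 = {"2019-10": 40.0, "2019-11": 50.0, "2019-12": 60.0}
w2 = {"2019-11": 25.0, "2019-12": 30.0, "2020-01": 35.0}
series = stitch(w1, w2)
```

Using the median rather than the mean of the overlap ratios keeps a single noisy month from distorting the rescaling.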

3.2. Construction of Outcomes, Design Matrix and Event Indicators

All series are indexed at a monthly frequency (MS) on a shared calendar [33]. GT partial months are truncated to avoid look-ahead until finalized. Missing values are rare; when they occur, we treat them as the reported SVI (with the 0.5 convention for “<1”). The baseline specification does not log-transform outcomes, so effects remain interpretable in GT points; log-scale models (percent effects) are reported as robustness. Baseline controls comprise (i) a centered linear trend for secular drift; (ii) month-of-year fixed effects (February–December; January omitted) for deterministic seasonality; and (iii) two COVID pulses (2020-03 and 2020-04) to absorb the abrupt pandemic shock without imposing a permanent break. We exclude a quarter_end indicator from the baseline because it is redundant with month fixed effects and adds little incremental explanatory power in preliminary diagnostics; it is retained in robustness. This parsimonious specification is preregistered and applied uniformly across series [4,8,22].
Each policy milestone is encoded ex ante as a step and/or slope component. For event $e$ at date $T_e$, the level shift is $S_t^{(e)} = \mathbf{1}\{t \ge T_e\}$, and the post-event ramp is $\mathrm{POST}_t^{(e)} = \max(0,\, t - T_e)$ in months.
We prespecify six milestones: myDATA go-live (2021-10; step + slope), B2G Phase 1 (2023-09; step + slope; robustness allows lag +1), central administration full (2024-01; step-only), VAT–myDATA alignment (2024-01; slope-only), B2G rest-of-public (2024-06; step + slope), and EU B2B authorization (2025-03; step + slope). The January 2024 split (central administration as step-only; VAT–myDATA alignment as slope-only) reflects distinct implementation logic, i.e., coverage completion versus phased process harmonization.
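As a minimal sketch of this coding, the step and ramp columns can be generated directly from the milestone calendar. The dates and the step/slope assignments come from the text above; the event labels, the `EVENTS` layout, and the helper names are hypothetical.

```python
def month_index(ym):
    """Convert 'YYYY-MM' to a running month count so dates can be subtracted."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month) - 1


def step(t, event_date):
    """Level-shift indicator: 1 from the event month onward, else 0."""
    return 1 if month_index(t) >= month_index(event_date) else 0


def ramp(t, event_date):
    """Post-event ramp: months elapsed since the event, 0 before it."""
    return max(0, month_index(t) - month_index(event_date))


# Milestone calendar; the keys are illustrative labels, not names from the paper.
EVENTS = {
    "go_live": ("2021-10", {"step", "slope"}),
    "b2g_phase1": ("2023-09", {"step", "slope"}),
    "central_admin_full": ("2024-01", {"step"}),
    "vat_alignment": ("2024-01", {"slope"}),
    "b2g_rest_public": ("2024-06", {"step", "slope"}),
    "eu_b2b_authorization": ("2025-03", {"step", "slope"}),
}


def design_row(t):
    """Event columns of the design matrix for month t ('YYYY-MM')."""
    row = {}
    for name, (date, parts) in EVENTS.items():
        if "step" in parts:
            row[f"S_{name}"] = step(t, date)
        if "slope" in parts:
            row[f"POST_{name}"] = ramp(t, date)
    return row


row = design_row("2024-03")
```

Because the two January 2024 events contribute different column types (one step, one ramp), they remain separately identified despite sharing a date.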

3.3. Identification and Preregistration

3.3.1. Identification Strategy

We exploit publicly announced myDATA/e-invoicing milestones and encode them ex ante as step and/or post-event slope terms [4,9]. Identification is the within-series, event-timed association around those dates, conditional on a centered trend, deterministic seasonality, and short COVID pulses [4,9]. Milestone timing is largely predetermined by administrative/regulatory sequencing (go-live, phased mandates, harmonization steps), which motivates treating the calendar as fixed from the analyst’s perspective; however, the design cannot exclude time-varying confounds that may co-move with these milestones (e.g., media cycles, vendor campaigns, enforcement signals, or concurrent guidance). We therefore interpret coefficients as shifts in search attention coincident with milestones, not causal effects on compliance or adoption outcomes.
For each outcome series $y_t$ (monthly SVI), we estimate:
$$y_t = \alpha + \gamma t_c + \sum_{m=2}^{12} \delta_m \mathbf{1}\{\text{month} = m\} + \sum_{e} \left( \beta_{S,e} S_t^{(e)} + \beta_{P,e} \mathrm{POST}_t^{(e)} \right) + \sum_{p \in \{\text{2020-03},\, \text{2020-04}\}} \pi_p \mathbf{1}\{t = p\} + u_t$$
Primary estimation uses OLS with Newey–West HAC (6) standard errors. We summarize each event’s medium-run association at horizon $T$ months as $\Delta_T = \beta_{S,e} + T \beta_{P,e}$ and treat $T = 6$ as the preregistered main estimand. To avoid anchoring interpretation on a single horizon, we report $\Delta_3$, $\Delta_6$, and $\Delta_{12}$ side-by-side. p-values are adjusted using Benjamini–Hochberg’s FDR within each series across events; comparisons across series/families are presented as descriptive unless additionally adjusted [4,9].
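The within-series Benjamini–Hochberg adjustment can be sketched as the standard step-up procedure below. The input p-values are invented for illustration and are not results from this study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for six events within one series.
p_events = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
q_events = bh_adjust(p_events)
significant = [q <= 0.05 for q in q_events]
```

In this toy input, the third and fourth events have raw p-values below 0.05 but adjusted values above it, which is the kind of case the adjustment is meant to screen out.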

3.3.2. Robustness and Preregistration

The study was preregistered to document the event calendar, coding rules (including the January 2024 split), estimands, and planned sensitivity checks. Preregistration reduces researcher discretion and supports reproducibility, but it is not a substitute for model evaluation; we therefore treat robustness and diagnostic checks as central to interpretation.
We implement placebo tests by shifting the full event calendar backward by −12, −18, −24, and −30 months and re-estimating $\Delta_6$ under the identical specification. Placebo performance is summarized by (i) the share of placebo $\Delta_6$ estimates significant after BH–FDR within series and (ii) the empirical distribution of placebo $\Delta_6$ relative to the true-timing $\Delta_6$ for each event and series. Non-trivial placebo significance is treated as a threat (consistent with residual structured variation not captured by baseline controls) rather than as confirmatory evidence [32,33]. Where placebo false positives persist, we add a sensitivity specification that strengthens seasonal structure using harmonic (Fourier) terms and report whether the main-timing estimates remain separated from placebo distributions (Appendix A and Table 1).
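The placebo calendars are simple month shifts of the true event dates. The sketch below builds them and computes the share of placebo estimates that remain significant; the event labels and the q-values passed to the helper are hypothetical, and the re-estimation step itself is omitted.

```python
def shift_month(ym, k):
    """Shift a 'YYYY-MM' date by k months (k may be negative)."""
    year, month = map(int, ym.split("-"))
    idx = year * 12 + (month - 1) + k
    return f"{idx // 12:04d}-{idx % 12 + 1:02d}"


# True milestone dates from the text; the keys are illustrative labels.
CALENDAR = {
    "go_live": "2021-10",
    "b2g_phase1": "2023-09",
    "central_admin_full": "2024-01",
    "vat_alignment": "2024-01",
    "b2g_rest_public": "2024-06",
    "eu_b2b_authorization": "2025-03",
}
PLACEBO_SHIFTS = [-12, -18, -24, -30]

placebo_calendars = {
    k: {name: shift_month(date, k) for name, date in CALENDAR.items()}
    for k in PLACEBO_SHIFTS
}


def placebo_share_significant(q_values, alpha=0.05):
    """Share of placebo estimates whose BH-adjusted p-value is at or below alpha."""
    return sum(q <= alpha for q in q_values) / len(q_values)
```

Each shifted calendar would then be passed through the same estimation pipeline as the true calendar, so any difference in results reflects timing alone.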

4. Methods

4.1. RQ1: Event-Study OLS

We estimate event-driven changes in monthly Google Trends attention using a transparent linear specification with calendar controls and preregistered policy dummies. For each outcome series $y_t$ (SVI on the native 0–100 scale) indexed by month $t$, we fit:
$$y_t = \alpha + \gamma t_c + \sum_{m=2}^{12} \delta_m \mathbf{1}\{\text{month} = m\} + \sum_{e} \left( \beta_{S,e} S_t^{(e)} + \beta_{P,e} \mathrm{POST}_t^{(e)} \right) + \sum_{p \in \{\text{2020-03},\, \text{2020-04}\}} \pi_p \mathbf{1}\{t = p\} + u_t$$
The centered linear trend $t_c$ captures secular drift; month fixed effects (February–December; January omitted) capture seasonality, and two one-month pulses (2020-03 and 2020-04) absorb the abrupt COVID shock. Policy milestones are encoded ex ante as step indicators $S_t^{(e)} = \mathbf{1}\{t \ge T_e\}$ and/or post-event ramps $\mathrm{POST}_t^{(e)} = \max(0,\, t - T_e)$ in months. The baseline excludes quarter_end to avoid redundancy with month fixed effects.
Primary estimation uses OLS with Newey–West HAC (6) standard errors (preregistered). For interpretation, we summarize each event’s impact at horizon $T \in \{3, 6, 12\}$ months via the linear combination:
$$\Delta_T = \beta_{S,e} + T \beta_{P,e}$$
and report HAC-robust 95% confidence intervals computed from the robust covariance of $(\beta_{S,e}, \beta_{P,e})$. To address multiplicity within each outcome series, we apply Benjamini–Hochberg’s FDR adjustment to event-level p-values. Our primary estimand is $\Delta_6$ because it provides an operationally meaningful medium-run window while remaining close to the policy period.
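To make the interval computation concrete, the sketch below derives the estimate and confidence interval for the linear combination from the two coefficients and their 2×2 covariance block. The numeric inputs are invented, and the normal critical value 1.96 is an assumption; the paper does not state which reference distribution it uses.

```python
from math import sqrt


def delta_T(beta_s, beta_p, var_s, var_p, cov_sp, T=6, z=1.96):
    """Point estimate and CI for beta_S + T * beta_P.

    var_s, var_p and cov_sp are the entries of the (robust) covariance
    block of the step and slope coefficients for one event.
    """
    estimate = beta_s + T * beta_p
    variance = var_s + T**2 * var_p + 2 * T * cov_sp
    se = sqrt(variance)
    return estimate, (estimate - z * se, estimate + z * se)


# Hypothetical coefficients and covariance entries, not estimates from the study.
est, (lo, hi) = delta_T(beta_s=10.0, beta_p=2.0, var_s=4.0, var_p=0.25, cov_sp=-0.5)
```

The covariance term matters: step and slope estimates for the same event are usually negatively correlated, so ignoring it would tend to overstate the interval width.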
Serial dependence is assessed using Ljung–Box and ACF/PACF diagnostics; when residual autocorrelation is substantial, we report an AR(1) error variant (SARIMAX with identical exogenous regressors) as a robustness check in Appendix A. All top-line inferences are based on the preregistered OLS + HAC specification for transparency and comparability across series. Our preregistered confirmatory claims center on the five priority outcomes × six events. For these “main-claims” tests, we report both (i) BH–FDR within series across events and (ii) a pooled BH–FDR across the full priority family (5 × 6) to align inference with cross-series narrative comparisons. Results outside the priority family are treated as secondary/exploratory and are described without strong inferential language. January 2024 is encoded as two distinct interventions because they represent conceptually different mechanisms: a back-office coverage completion (step-only) versus a workflow harmonization process expected to accumulate gradually (slope-only). This split was preregistered to avoid post hoc tailoring.

4.2. RQ2: Nowcasting Design

We evaluate short-horizon predictive value using blocked rolling-origin cross-validation that mimics real-time deployment. Origins start in 2018-01 and advance in 6-month increments; at each origin, models are trained on up to the previous 48 months and forecast $h \in \{1, 2, 3\}$ months ahead [4,9,16]. All features are deterministic functions of calendar time and prespecified policy dates, ensuring strict no-leakage: design matrices for $t+1, \dots, t+h$ are constructed using only trend continuation, month indicators, COVID pulses, and event step/ramp rules fixed ex ante. We compare three transparent forecasters:
  • SNAIVE (12): $\hat{y}_{t+h} = y_{t+h-12}$.
  • OLS + events: the structural regression with trend, month fixed effects, COVID pulses, and prespecified event indicators.
  • OLS + events + AR(1): the same exogenous specification estimated with AR(1) errors (SARIMAX) to capture residual autocorrelation.
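The rolling-origin design and the seasonal-naïve benchmark can be sketched as follows. The sketch assumes that each origin is the last observed month and that the training window ends at the origin; the text does not spell out these conventions, and the function names and toy series are illustrative.

```python
def snaive_forecast(history, h, season=12):
    """Seasonal-naive forecast: the value observed `season` months before the target."""
    source = len(history) - 1 + h - season  # index of y_{t+h-12}
    if not 0 <= source < len(history):
        raise ValueError("need a full season of history and h <= season")
    return history[source]


def rolling_origins(n_obs, first_origin, step=6, max_train=48, horizons=(1, 2, 3)):
    """Yield (train_start, origin, target_indices) for blocked rolling-origin CV."""
    origin = first_origin
    while origin + max(horizons) < n_obs:
        train_start = max(0, origin - max_train + 1)  # cap window at max_train months
        yield train_start, origin, [origin + h for h in horizons]
        origin += step


# Toy series: 40 months starting 2016-01, so index 24 corresponds to 2018-01.
y = [float(i % 12) for i in range(40)]
splits = list(rolling_origins(len(y), first_origin=24))
forecasts = [
    [snaive_forecast(y[: origin + 1], h) for h in (1, 2, 3)]
    for _, origin, _ in splits
]
```

Because the toy series is perfectly periodic, the seasonal-naïve forecasts reproduce the targets exactly; on real data the same loop would collect forecast errors per origin and horizon.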
Forecast accuracy is summarized by MAE and RMSE (primary), with sMAPE/MASE reported for completeness; we also report percentage MAE improvement relative to SNAIVE (12). Statistical comparisons against SNAIVE (12) are conducted using Diebold–Mariano tests with absolute error loss (series × horizon), reported concisely.
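The primary accuracy metrics and the percentage improvement over the benchmark follow directly from their definitions. The sketch below uses invented forecasts purely to show the arithmetic.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pct_mae_improvement(mae_model, mae_benchmark):
    """Percentage MAE reduction relative to the benchmark (positive = model is better)."""
    return 100.0 * (1.0 - mae_model / mae_benchmark)


# Hypothetical values, not results from the study.
actual = [10.0, 20.0, 30.0]
model = [12.0, 18.0, 33.0]       # stand-in for an event-aware forecast
benchmark = [14.0, 25.0, 24.0]   # stand-in for the seasonal-naive forecast

gain = pct_mae_improvement(mae(actual, model), mae(actual, benchmark))
```

A negative value of `gain` would indicate that the benchmark wins for that series and horizon.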
Prediction intervals (80% and 95%) are constructed from out-of-sample residual dispersion (Gaussian bands) with a seasonal block bootstrap used as a robustness option to respect monthly dependence. Additional baselines, blends, and extended diagnostic plots are reported in Appendix A to preserve readability.
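A minimal sketch of Gaussian bands built from out-of-sample residuals, together with an empirical coverage check, is given below. It assumes the dispersion is the sample standard deviation of the residuals and that the bands are symmetric around the point forecast; these details are not stated in the text.

```python
from statistics import NormalDist, stdev


def gaussian_pi(point, oos_residuals, level=0.95):
    """Symmetric Gaussian prediction interval around a point forecast."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    spread = z * stdev(oos_residuals)
    return point - spread, point + spread


def empirical_coverage(actuals, intervals):
    """Share of actual values that fall inside their prediction intervals."""
    hits = sum(lo <= a <= hi for a, (lo, hi) in zip(actuals, intervals))
    return hits / len(actuals)


# Hypothetical out-of-sample residuals and point forecast.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
pi80 = gaussian_pi(50.0, residuals, level=0.80)
pi95 = gaussian_pi(50.0, residuals, level=0.95)
```

Comparing `empirical_coverage` against the nominal 80% and 95% levels indicates whether the Gaussian assumption is adequate for a given series.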
Auxiliary machine learning (ML) models are treated as exploratory robustness and are reported in Appendix A. Specifically, we test low-capacity global residual learners that predict $r_{t+h} = y_{t+h} - y_{t+h-12}$ using deterministic calendar/event features (trend, month/quarter harmonics, event step/ramp indicators, COVID pulses) and then add predicted residuals back to SNAIVE (12). These models are intentionally regularized (shallow trees/small MLP with early stopping) to limit overfitting in short panels; we only highlight ML results when they match or exceed the best structural model consistently across rolling splits for a given (series, horizon). The structural models remain the default due to interpretability and replicability. Forecast value is evaluated series-by-series and horizon-by-horizon; we report instances where the event model underperforms SNAIVE (12) as failures, not exceptions. We quantify uncertainty in MAE differences using a paired bootstrap across forecast origins, and we report empirical PI coverage for nominal 80%/95% intervals.

4.3. RQ3: Which Families Move?

RQ3 summarizes heterogeneous attention responses across three query families, platform (A), app (C), and ecosystem (D), using the RQ1 event study estimand. Family anchors are the five priority outcomes: platform = $\{A2, A3\}$, app = $\{C1, C2\}$, ecosystem = $\{D1\}$. We classify “movement” for each (event, series) using the sign and BH–FDR significance of the primary medium-run estimand $\Delta_6$:
  • ▲ if $\Delta_6 > 0$ and BH–FDR $p \le 0.05$;
  • ▼ if $\Delta_6 < 0$ and BH–FDR $p \le 0.05$;
  • ○ otherwise.
To distinguish abrupt versus gradual responses, we append “(S)” when either the step component $\beta_{S,e}$ or slope component $\beta_{P,e}$ is significant after BH–FDR adjustment even if $\Delta_6$ is marginal, indicating the dominant driver (level vs. ramp). Family-level summaries aggregate these classifications across the relevant anchors (A2–A3, C1–C2, and D1).
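The movement rule can be written as a small classifier. In the sketch below, the 0.05 threshold is read as inclusive, the family mapping follows the anchors listed above, and the example inputs are hypothetical rather than estimates from the study; the “(S)” suffix is left out.

```python
FAMILIES = {"platform": ["A2", "A3"], "app": ["C1", "C2"], "ecosystem": ["D1"]}


def classify(delta6, q, alpha=0.05):
    """Return the movement symbol for one (event, series) pair."""
    if q <= alpha and delta6 > 0:
        return "▲"
    if q <= alpha and delta6 < 0:
        return "▼"
    return "○"


def family_summary(results):
    """Collect symbols per family from {series: (delta6, q)} for a single event."""
    return {
        family: [classify(*results[s]) for s in anchors if s in results]
        for family, anchors in FAMILIES.items()
    }


# Hypothetical (delta6, q) pairs for one event.
example = {
    "A2": (3.0, 0.40),
    "A3": (12.0, 0.03),
    "C1": (-20.0, 0.01),
    "C2": (-15.0, 0.20),
    "D1": (40.0, 0.004),
}
summary = family_summary(example)
```

Keeping the per-series symbols in a list, rather than collapsing them to one symbol per family, preserves disagreements between anchors within the same family.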
As a family-agnostic summary, we construct a composite attention index as the mean of z-scored priority series (A2, A3, C1, C2, D1); PCA-1 is used as a robustness alternative. We re-estimate the same event study model on the composite and apply the same $\Delta_6$ and BH–FDR decision rules. Planned sensitivity checks mirror the preregistered robustness set: HAC (12), event lags (+1/+2), log-scale outcomes, STL-deseasoned outcomes, and placebo events shifted −24 months; where AR dependence is strong, the AR(1) error variant is reported as a stability check in Appendix A.
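The composite index is a month-by-month average of standardized series. The sketch below assumes full-sample z-scores with the population standard deviation, which the text does not specify, and uses toy inputs.

```python
from statistics import mean, pstdev


def zscore(series):
    """Standardize a series to mean 0 and unit (population) standard deviation."""
    mu, sd = mean(series), pstdev(series)
    if sd == 0:
        raise ValueError("constant series cannot be z-scored")
    return [(x - mu) / sd for x in series]


def composite_index(series_by_name):
    """Mean of the z-scored series at each month."""
    standardized = [zscore(s) for s in series_by_name.values()]
    return [mean(column) for column in zip(*standardized)]


# Toy inputs on different scales (illustrative, not real SVI data).
toy = {"A2": [1.0, 2.0, 3.0], "C1": [10.0, 20.0, 30.0], "D1": [5.0, 5.0, 8.0]}
index = composite_index(toy)
```

Standardizing first prevents series with a high baseline SVI from dominating the average.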

5. Data Analysis and Results

5.1. RQ1—Event Impacts

Several preregistered milestones coincide with substantial medium-run shifts in search attention, with multiple effects surviving BH–FDR adjustment (Table 2). We focus on $\Delta_6$ in SVI points (0–100) as the primary effect scale; percent-of-baseline values (Table 3) are provided only to contextualize magnitude across series with different baselines and can be mechanically large when baseline SVI is low.
Two milestones show the clearest and most systematic signatures. First, the B2G rest-of-public expansion is associated with a sharp decline in ecosystem attention and a simultaneous rise in an app query: Electronic invoicing (D1) decreases by $\Delta_6 = -52.7$ SVI (q < 0.001), while τιμολόγιο (C1) increases by $\Delta_6 = +29.7$ (q < 0.001). Second, VAT–myDATA alignment shows a strong reallocation across families: D1 increases by $\Delta_6 = +46.4$ (q < 0.01), while app queries decline (C2 $\Delta_6 = -34.3$, q < 0.01; C1 $\Delta_6 = -43.1$, q < 0.05).
Earlier rollout milestones primarily affect app terms. myDATA go-live is linked to a large increase in timologio (C2) ($\Delta_6 = +38.4$, q < 0.001), with a smaller increase in ααδε (A3) ($\Delta_6 = +15.1$, q < 0.05). B2G Phase 1 is associated with increases in both app series (C1 $\Delta_6 = +33.6$, q < 0.01; C2 $\Delta_6 = +23.8$, q < 0.01). In contrast, the back-office “central administration full” milestone shows no detectable shifts across series, consistent with low public salience. EU B2B authorization yields small and mostly non-significant changes, with a modest positive effect for C1 ($\Delta_6 = +8.2$, q < 0.05) (Table 2).
Taken together, the pattern is heterogeneous by query family and rollout phase. The ecosystem topic (D1) exhibits the largest opposite-signed movements (down after B2G rest-of-public; up at VAT–myDATA alignment), while app queries show the clearest “rollout spike” signature at launch-type milestones (go-live; Phase 1) followed by declines at harmonization (VAT–myDATA alignment). Platform terms (A2/A3) shift more modestly and less consistently, suggesting broader and more diffuse search intent.
Figure 2 and Figure 3 visualize these results with HAC-robust 95% confidence intervals; estimates whose intervals exclude zero correspond to entries significant after BH–FDR adjustment in Table 2. Supplementary Tables S1–S10 report the underlying step and slope components and confirm that the headline $\Delta_6$ findings are driven by persistent post-event dynamics (not isolated one-month spikes) for the main ecosystem effects.

Robustness and Sensitivity Checks

The headline RQ1 conclusions are stable across preregistered robustness dimensions. Using HAC (12) instead of HAC (6) leaves signs and key inferences unchanged: D1 remains strongly negative after B2G rest-of-public and strongly positive at VAT–myDATA alignment; app terms remain positive at go-live/Phase 1 and negative at VAT alignment; platform responses remain smaller and less regular (Table 4). Shifting event timing forward by +1/+2 months strengthens fit for some milestones (notably ecosystem responses around VAT alignment and B2G expansion), consistent with short implementation/awareness lags. Log-scale estimates preserve the same directional patterns but can imply very large percentage changes for low-baseline periods; we therefore treat percentages as contextual only (Table 5). STL deseasoning produces the same marquee signs, and placebo events shifted −24 months yield no consistent family-by-event pattern, supporting interpretation as event-timed shifts rather than generic seasonality or drift (Table 6).
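The HAC (6) versus HAC (12) comparison changes only the lag truncation used in the Newey–West variance. As a rough sketch of the building block, the function below computes a Bartlett-weighted long-run variance for a single series; in a regression the same weights are applied to the score cross-products. The function name and the toy data are illustrative assumptions, not the study's code.

```python
def newey_west_lrv(x, lags):
    """Bartlett-kernel (Newey-West) long-run variance of a series.

    lags=0 reduces to the ordinary (population) variance; larger values
    add down-weighted autocovariances up to the chosen truncation lag.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]

    def gamma(j):
        # Sample autocovariance at lag j (divisor n).
        return sum(dev[t] * dev[t - j] for t in range(j, n)) / n

    lrv = gamma(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1)  # Bartlett weight
        lrv += 2.0 * weight * gamma(j)
    return lrv


# Toy data only: compare no lag correction with a one-lag correction.
toy = [1.0, 2.0, 3.0, 4.0]
lrv0 = newey_west_lrv(toy, lags=0)
lrv1 = newey_west_lrv(toy, lags=1)
```

With positively autocorrelated data, a longer truncation lag typically widens standard errors, so unchanged signs and inferences under HAC (12) indicate that the conclusions do not hinge on the shorter window.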

5.2. RQ2—Nowcasting Skill

We assess out-of-sample nowcasting with blocked rolling-origin cross-validation (origins every 6 months from 2018; max training window 48 months; horizons h = 1, 2, 3). Performance is summarized by MAE (Figure 4) and percentage MAE improvement in OLS + events over a seasonal-naïve benchmark, SNAIVE (12) (Figure 5).
Two patterns emerge. First, platform queries (A2/A3) benefit most from adding trend, seasonality, and preregistered event indicators. OLS + events reduces MAE by about 40–50% at h = 1 and 18–32% at h = 2–3 relative to SNAIVE (12), and it is the lowest-MAE model for both platform series across horizons (Table 7). Diebold–Mariano tests indicate a clear advantage for A3 at h = 1 (t = −2.41, p = 0.034) and a marginal advantage for A2 at h = 1 (t = −2.09, p = 0.061), with remaining contrasts not statistically distinguishable given the small number of origins (Table 8).
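For readers unfamiliar with the Diebold–Mariano comparison, a simplified one-step version can be sketched as follows. It uses absolute-error loss and the plain sample variance of the loss differential, which is an assumption suitable only for h = 1; the exact variant used in the study (for example, small-sample or HAC corrections) is not specified in this section, and the error values below are hypothetical.

```python
import math


def dm_statistic(errors_a, errors_b):
    """Simplified Diebold-Mariano statistic under absolute-error loss.

    Negative values mean model A has the lower average loss.
    Assumes one-step-ahead errors, so no autocovariance correction is applied.
    """
    d = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)


# Hypothetical forecast errors for two models at four origins.
stat = dm_statistic([1.0, -2.0, 1.0, 2.0], [3.0, -3.0, 4.0, 3.0])
```

With only a handful of origins the statistic is noisy, which is consistent with most contrasts in Table 8 not being statistically distinguishable.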
Second, gains for app and ecosystem queries are horizon-dependent. For C2 (timologio), OLS + events yields a small improvement at h = 1 (~1–2%) but larger gains at h = 2–3 (~14–15%), consistent with event structure becoming informative once the horizon extends beyond month-to-month noise. For C1 (τιμολόγιο), OLS + events is slightly better at h = 1 (~4–5%), while SNAIVE (12) wins at h = 2–3, indicating stronger annual recurrence in that spelling variant. For the ecosystem topic D1, SNAIVE (12) dominates at h = 1, whereas OLS + events overtakes at h = 2–3 (~4% and ~13% improvement), consistent with D1’s larger, slower-moving event effects and high short-run volatility. Overall, event-aware structure improves short-horizon forecasts where attention shifts are relatively persistent and policy-timed (especially platform terms), while pure seasonality remains difficult to beat for series with strong annual recurrence (C1) and at the very shortest horizon for a volatile ecosystem series (D1 at h = 1). Adding short-memory dynamics (AR terms) and simple blends narrows some h = 1 gaps in auxiliary comparisons, but the central result remains: preregistered event coding carries forecasting value that is both series- and horizon-specific. Figure 6 provides an illustrative nowcast example (A3) with 95% prediction intervals; interval width increases with horizon, as expected.

5.2.1. Rolling-Origin CV

We evaluate out-of-sample accuracy using the preregistered blocked rolling-origin design (origins from 2018-01 every 6 months; max training window 48 months; horizons h = 1, 2, 3). Across series, OLS + events is the top-performing structural model for the platform queries (A2, A3) and for C2, while SNAIVE (12) remains competitive for C1 at longer horizons and for D1 at h = 1. SARIMAX + events underperforms across series and horizons (Appendix A Table A5, Table A6 and Table A7).
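The rolling-origin design can be sketched in a few lines. The example below generates forecast origins every six months with a capped 48-month training window and scores a seasonal-naïve forecast at h = 1, 2, 3. It is a minimal illustration on a synthetic series; the function names, the index conventions, and the toy data are assumptions and do not reproduce the study's pipeline.

```python
def rolling_origins(n_obs, first_origin, step=6, max_train=48, horizons=(1, 2, 3)):
    """Return (train_start, origin, target_indices) for each forecast origin.

    Training data are y[train_start:origin]; the h-step target is y[origin + h - 1].
    """
    splits = []
    origin = first_origin
    while origin + max(horizons) <= n_obs:
        train_start = max(0, origin - max_train)  # cap the training window
        targets = [origin + h - 1 for h in horizons]
        splits.append((train_start, origin, targets))
        origin += step  # advance the origin by six months
    return splits


def snaive_forecast(y, target_idx, season=12):
    """Seasonal-naive forecast: repeat the value observed one season earlier."""
    return y[target_idx - season]


def snaive_mae_by_horizon(y, first_origin, horizons=(1, 2, 3)):
    """Average absolute SNAIVE error per horizon (first_origin must be >= 12)."""
    abs_err = {h: [] for h in horizons}
    for _, _, targets in rolling_origins(len(y), first_origin, horizons=horizons):
        for h, idx in zip(horizons, targets):
            abs_err[h].append(abs(y[idx] - snaive_forecast(y, idx)))
    return {h: sum(v) / len(v) for h, v in abs_err.items()}


# Synthetic, perfectly seasonal 72-month series: SNAIVE errors are zero.
toy = [m % 12 for m in range(72)]
splits = rolling_origins(len(toy), first_origin=24)
mae_by_h = snaive_mae_by_horizon(toy, first_origin=24)
```

Any competing model would be fitted on `y[train_start:origin]` at each origin and scored on the same targets, so that both models are compared on identical months.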
Platform gains are large and consistent. For A2, OLS + events reduces MAE by ~40% at h = 1 (3.83 vs. 6.42), ~18% at h = 2 (6.38 vs. 7.81), and ~17% at h = 3 (6.36 vs. 7.63). For A3, improvements are larger: ~50% at h = 1 (6.04 vs. 11.96), ~29% at h = 2 (8.59 vs. 12.10), and ~32% at h = 3 (8.72 vs. 12.76).
For app and ecosystem terms, gains depend on horizon. C2 improves modestly at h = 1 (~1.5%; 21.35 vs. 21.67) but more at h = 2–3 (~14–15%; 16.21 vs. 18.83; 14.35 vs. 16.94). C1 is strongly seasonal: OLS + events is slightly better at h = 1 (~4.5%; 7.88 vs. 8.25), while SNAIVE (12) wins at h = 2–3 (6.83 and 5.44). For D1, SNAIVE (12) is best at h = 1 (6.33 vs. 9.30), whereas OLS + events becomes better at h = 2–3 (~4% and ~13%; 7.79 vs. 8.13; 9.08 vs. 10.42). RMSE mirrors MAE (Figure 4 and Figure 5).
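The percentage improvements quoted above follow from a simple ratio of MAEs. As a worked check, the sketch below recomputes two cells using the MAE values reported in Appendix A (Table A7); the function names are illustrative.

```python
def mae(actual, forecast):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mae_improvement_pct(mae_model, mae_base):
    """Percent MAE reduction relative to the baseline (positive = better)."""
    return 100.0 * (mae_base - mae_model) / mae_base


# MAE values taken from Table A7: OLS + events vs. SNAIVE (12) at h = 1.
a2_h1 = mae_improvement_pct(3.834758, 6.416667)  # aade (A2)
d1_h1 = mae_improvement_pct(9.301091, 6.333333)  # Electronic invoicing (D1)
```

The A2 cell evaluates to roughly +40%, while the D1 cell is roughly −47%, which is why SNAIVE (12) is reported as the better model for D1 at h = 1.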
Figure 7 plots one-step-ahead backtest paths and shows that errors concentrate around turning points and event windows—precisely where event indicators add most predictive value.

5.2.2. Forecast Comparison

Extending beyond the two-model comparison, simple hybrids that combine seasonal persistence with event structure (and, where helpful, a light AR term) are typically the most robust across series–horizon cells. In practice, blends reduce variance relative to a single model while preserving the main gains identified above: large improvements for platform queries, moderate improvements for C2, and limited scope for improvement where annual recurrence dominates (C1 at h = 3) or where short-run volatility is high (D1 at h = 1). SARIMAX-style specifications remain dominated in this setting and are therefore not emphasized.
Prediction intervals derived from rolling-origin residual dispersion show reasonable near-term calibration for operational nowcasting (coverage typically moderate in short samples), with wider uncertainty around volatile series and near event windows. Figure 8 and Figure 9 illustrate representative nowcasts and prediction cones for the platform series.
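One simple way to derive such intervals is to centre them on the point forecast and scale by the standard deviation of past out-of-sample errors at the same horizon. The sketch below assumes a normal approximation (z = 1.96) and clips the result to the 0–100 SVI scale; both choices, the function name, and the error values are assumptions for illustration rather than the study's stated procedure.

```python
import math


def residual_interval(point_forecast, past_errors, z=1.96, bounds=(0.0, 100.0)):
    """Approximate prediction interval from past forecast-error dispersion.

    The interval is centred on the point forecast (no bias correction) and
    clipped to the index bounds, since SVI cannot leave the 0-100 range.
    """
    n = len(past_errors)
    mean_e = sum(past_errors) / n
    sd = math.sqrt(sum((e - mean_e) ** 2 for e in past_errors) / (n - 1))
    lower = max(bounds[0], point_forecast - z * sd)
    upper = min(bounds[1], point_forecast + z * sd)
    return lower, upper


# Hypothetical backtest errors at one horizon.
errors = [-2.0, -1.0, 0.0, 1.0, 2.0]
low, high = residual_interval(50.0, errors)
```

Because the error dispersion is estimated separately for each horizon, intervals built this way widen naturally as errors grow with h.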

5.3. RQ3—Which Families Move?

BH–FDR-screened Δ6 classifications show that systematic movement concentrates in the application (“app”) queries (family C). Across 12 app-family event × series cells, 8 are significant (67%). τιμολόγιο (C1) moves at five out of six milestones: up at myDATA go-live and B2G Phase 1, up again at B2G rest of public and EU B2B authorization, and down at VAT–myDATA alignment. timologio (C2) moves at three out of six milestones: up at go-live and Phase 1 and down at VAT alignment. The direction is event-coherent: launch-like milestones are followed by positive app attention, whereas harmonization (VAT alignment) is followed by negative ramps, consistent with declining “how-to app” search once workflows stabilize.
Ecosystem attention (D1) is selective but large. Only two out of six cells (33%) are BH-significant, but both correspond to the largest directional ecosystem shifts: up at VAT–myDATA alignment and down at B2G rest-of-public. Platform terms (family A) are weakest and least systematic: 2/12 cells (17%) are significant (A3 rises at go-live; A2 falls at B2G rest-of-public). No series moves at “central administration full,” consistent with a back-office milestone with low public salience. Mechanistically, rollouts appear more “level/step-like” for app searches, while harmonization/coverage changes manifest as slopes/ramps (e.g., D1 up at alignment; C1/C2 down thereafter). These family patterns mirror RQ1’s Δ6 ordering and help explain RQ2: event-augmented models add most value where movement is systematic (platform and C2), while season-only baselines remain competitive where dynamics are dominated by recurrence (notably C1 at longer horizons). Table 9 summarizes the BH–FDR movement grid.
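The BH–FDR screening behind this grid can be illustrated with a short implementation of the Benjamini–Hochberg step-up adjustment. The p-values below are hypothetical and the function name is illustrative; the sketch shows the standard procedure, not the study's own code.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four event-by-series cells.
q = bh_adjust([0.01, 0.04, 0.03, 0.20])
significant = [v < 0.05 for v in q]
```

In this toy case only the first cell survives at q < 0.05, even though three raw p-values are below 0.05, which illustrates why the grid is sparser than unadjusted tests would suggest.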

5.3.1. Rank by |Δ6| for Each Series

Ranking events by |Δ6| within each series reinforces the family profile. App queries dominate the upper tail: C1’s largest movements occur at VAT alignment (−43.1) and B2G Phase 1 (+33.6), with a further large rise at B2G rest-of-public (+29.7); C2 peaks at go-live (+38.4) and falls sharply at VAT alignment (−34.3). Ecosystem D1 shows the single largest swings overall (−52.7 at B2G rest-of-public and +46.4 at VAT alignment), highlighting a strong but event-selective response. Platform effects are smaller (A3 notable at go-live; A2 notable at B2G rest-of-public), and central administration full remains near-neutral across series. Table 10 reports these ranks.

5.3.2. Composite Index Confirmation (Z-Mean Across A2/A3/C1/C2/D1)

A family-agnostic composite attention index (z-mean of A2/A3/C1/C2/D1) corroborates broad increases at myDATA go-live (Δ6 ≈ 0.52, p < 0.001) and B2G Phase 1 (Δ6 ≈ 1.49, p ≈ 0.002), no effect at central administration full, and a moderate decline at VAT–myDATA alignment (p ≈ 0.035). This aggregate pattern is consistent with the grid and ranks: rollouts lift app-centric attention, while subsequent harmonization diffuses or reverses attention in the composite. Table 11 reports the composite estimates.
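A z-mean composite of this kind can be sketched as follows: each series is standardized over the sample and the standardized values are averaged month by month. The use of the sample standard deviation, the function names, and the toy series are assumptions for illustration.

```python
import math


def zscore(series):
    """Standardize a series to mean 0 and (sample) standard deviation 1."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / (n - 1))
    return [(v - mean) / sd for v in series]


def composite_index(series_by_name):
    """Month-by-month mean of z-scored series (equal weight per series)."""
    standardized = [zscore(s) for s in series_by_name.values()]
    return [sum(col) / len(col) for col in zip(*standardized)]


# Toy series on different scales that move together.
composite = composite_index({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})
```

Because each series is standardized before averaging, composite effects such as Δ6 ≈ 0.52 are expressed in standard-deviation units rather than SVI points and are not directly comparable to the per-series estimates.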
This “top mover” profile helps explain two earlier findings: (a) in RQ1, app terms show the highest Δ6 magnitudes at rollouts and the clearest reversals at harmonization; and (b) in RQ2, event-based models realize their most clear-cut gains on the A/C families at near horizons, which is where those families record their most frequent and largest Δ6 movements.
Based on the RQ3 grid, platform interest (A3) shifts at myDATA go-live (level + slope), and the OLS + events backtest captures that step change and early ramp (e.g., 2022-02: actual = 65, OLS ≈ 63, SNAIVE (12) = 40). Subsequently, when attention returns to normal or surges outside the prespecified policy milestones, OLS + events occasionally over- or undershoots (e.g., 2022-08 and 2024-08), while SNAIVE (12) can pick up purely seasonal rebounds. Collectively, the path perspective explains why event-enhanced models produce meaningful short-horizon gains on platform queries (RQ2) but diminishing edges once the shock of the initial rollout has faded, echoing RQ3’s finding that platform movement is most pronounced at go-live, with progressively less systematic movement at subsequent checkpoints.
Figure 10 shows rolling-origin cross-validation (h = 1 month) comparing an event-augmented regression (“OLS + events”) with a seasonal-naïve baseline (“SNAIVE (12)”) that repeats the t−12 value. Origins roll every half-year with a 48-month maximum training window; forecasts are plotted at the issuance date. The panel indicates that OLS + events follows the post-go-live upward trend more closely than SNAIVE (12) and forecasts the 2021–2022 ramp-up in platform interest more accurately. Evaluation uses Newey–West HAC (6); for ααδε (A3) at h = 1, the average MAE for OLS + events is 6.04, compared to 11.96 for SNAIVE (12), and the Diebold–Mariano test favors OLS + events (t ≈ −2.41, p ≈ 0.034).
Across these 12 representative checkpoints, the event-augmented regression is closer to the observed SVI in 7/12 months (e.g., 2020-02: 22 vs. 20; 2022-02: 63.3 vs. 65; 2024-02: 68.7 vs. 78), whereas SNAIVE (12) is closer in 5/12 (particularly during GFC peak periods like 2021-08, 2022-08, 2024-08, and early 2025). This is in line with the RQ3 and overall RQ2 findings: near policy rollouts and ramps, OLS + events more closely approximates the level shifts and medium-run trend in platform interest, whereas away from milestone shocks, seasonal reversion favors SNAIVE (12). In other words, platform searches react strongly at go-live and early harmonization (which the event model picks up), but the remaining periods are increasingly seasonal, and there a seasonal-naïve baseline remains competitive (Table 12).

6. Discussion

This study examined whether prespecified milestones in Greece’s myDATA/e-invoicing rollout were associated with changes in public search attention (RQ1), whether encoding those milestones improves short-horizon nowcasts of attention (RQ2), and which query families respond most (RQ3). The results are best interpreted as evidence about salience and information-seeking—an intermediate signal of user attention and potential onboarding friction—rather than as evidence of compliance, reporting accuracy, or policy success. Consistent with issue-attention perspectives, “front-stage” milestones tend to coincide with abrupt, short-lived shifts in task-oriented searches, whereas harmonization and coverage changes are more often reflected in gradual ramps or drawdowns.

6.1. Do Events Move Attention?

Across the preregistered milestones, the strongest and most consistent responses appear in the application (“app”) family. In particular, timologio (C2) rises at go-live (Δ6 ≈ +38.4 SVI) and C1 rises at B2G Phase 1 (Δ6 ≈ +33.6), while both fall at VAT–myDATA alignment (C1 Δ6 ≈ −43.1; C2 Δ6 ≈ −34.3). The ecosystem topic Electronic invoicing (D1) shows large, opposite-signed medium-run shifts—an increase around VAT–myDATA alignment (Δ6 ≈ +46.4) and a decline after the B2G rest-of-public expansion (Δ6 ≈ −52.7). Platform terms are more heterogeneous and often weaker: ααδε (A3) increases at go-live (Δ6 ≈ +15.1) whereas aade (A2) declines after B2G rest-of-public (Δ6 ≈ −22.2), with other effects small or indistinguishable from zero after FDR adjustment. Central administration full (Jan 2024) shows no measurable shift, consistent with a back-office milestone lacking a user-facing call to action.
Two interpretation boundaries are important. First, we prioritize Δ6 in SVI points on the native 0–100 scale; this avoids the mechanical inflation that can arise when percentage changes are computed from low baselines. Percentage effects from log-scale robustness checks can appear extreme (e.g., ±300–500%) when pre-event SVI levels are near zero, even if the absolute change remains modest. Accordingly, we treat percentage changes as descriptive robustness and interpret them only alongside absolute SVI point shifts and baseline levels. Second, increased searching plausibly reflects information acquisition and task initiation (e.g., registration, software selection, configuration, issuance rules), but it does not establish uptake, compliance quality, or reporting accuracy.
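The low-baseline issue is easy to see with a small arithmetic sketch. The numbers below are hypothetical: the same 8-point rise is a 400% change from a baseline of 2 but only a 20% change from a baseline of 40. The helper that converts a log-scale coefficient into a percentage assumes a plain log outcome, which may differ from the transformation used in the robustness checks.

```python
import math


def pct_change(before, after):
    """Percentage change relative to the pre-event level."""
    return 100.0 * (after - before) / before


def log_coef_to_pct(beta):
    """Exact percentage change implied by a coefficient on a log outcome."""
    return 100.0 * (math.exp(beta) - 1.0)


# Hypothetical SVI levels: identical absolute shifts, very different percentages.
low_base = pct_change(2.0, 10.0)    # +8 points from a near-zero baseline
high_base = pct_change(40.0, 48.0)  # +8 points from a moderate baseline
implied = log_coef_to_pct(math.log(5.0))
```

A log-scale coefficient of about 1.61 (a fivefold increase) therefore corresponds to +400%, which can describe a shift of only a few SVI points when the baseline is near zero.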
Within these limits, the pattern of step-like jumps for app queries around launch-type milestones and slower ramps around harmonization is consistent with standard intervention logic (level versus slope responses) and with attention allocation arguments: salient announcements trigger short-lived, action-proximal “how-to” demand, while harmonization reshapes workflows and shifts attention more gradually toward ecosystem-level concerns [9,16,23]. For operations, the practical implication is not that the policy “succeeded” but that certain milestones predictably coincide with concentrated information demand in specific parts of the ecosystem.

6.2. Do Event-Aware Models Forecast Better?

Encoding the prespecified policy calendar improves short-horizon forecasts relative to a seasonality-only benchmark in several series, with the clearest gains for platform terms at h = 1 and for app/ecosystem series at longer horizons. In rolling-origin validation, OLS + events reduces MAE versus SNAIVE (12) by roughly 40–50% at h = 1 for platform series (A2/A3), with more variable gains at h = 2–3 and horizon-dependent benefits for app and ecosystem outcomes. This pattern is consistent with a simple point: when variance is partly driven by interpretable, dated shocks (launches, phased mandates, harmonization), deterministic step/ramp indicators capture structure that seasonal repetition alone misses [12,20,23,28].
Horizon heterogeneity is informative rather than speculative. For example, timologio (C2) shows limited incremental value at h = 1 but clearer gains at h = 2–3, consistent with the idea that medium-run ramps become predictable once month-to-month idiosyncrasy averages out. For D1, event terms add little at h = 1 but become more useful at h = 2–3 as slope components accumulate [9,16,30]. Importantly, these are forecasting improvements in attention, not proof of behavioral change; their value is operational: planning the timing of guidance, helpdesk capacity, and vendor coordination around known milestones. The results also align with the cautionary lesson from search-based forecasting critiques: structured, theory-consistent features can help, but idiosyncratic spikes and evolving search behavior limit one-size-fits-all gains.

6.3. Which Families Move?

The family ordering—app (C) strongest, ecosystem (D) second, platform (A) weakest—provides a compact summary of where attention concentrates during staged rollouts. App terms are closest to immediate tasks (issuing invoices, onboarding), so they exhibit sharper step-type responses around salient milestones; ecosystem terms tend to reflect rule changes, standards, and workflow reconfiguration, so they are more often expressed as ramps; platform/brand terms aggregate heterogeneous intents and are less diagnostic except at headline moments. The composite index corroborates that these dynamics are not driven by a single series [9,16,23].
Operationally, the implication is a sequencing logic rather than a success claim. Agencies can anticipate app-focused information demand around launch-type milestones (staffing and onboarding materials), while harmonization windows call for sustained ecosystem-oriented guidance (standards, procedures, vendor alignment). Conversely, purely administrative completions may remain low-salience and should not be expected to shift public attention without complementary communications.
Overall, the contribution is not that search attention equals compliance or performance but that a preregistered, transparent event-study design can (i) characterize heterogeneous attention responses across milestones and query families and (ii) yield modest but actionable improvements in short-horizon nowcasts of attention that support communication and support planning in digital tax rollouts [12,20,23,28]. Table 13 summarizes which conclusions are robust to the prespecified checks and where interpretation remains sensitive.

7. Practical Implications

This study treats Google Trends as an intermediate signal of salience and information-seeking around myDATA/e-invoicing milestones. The implications are therefore operational for communication timing, support capacity, and coordination rather than evidence of compliance or policy success.

7.1. For Policymakers and Tax Administrators

Launch-type milestones (e.g., go-live, early mandate phases) coincide with short-lived spikes in app-focused searches. Communications should therefore be concentrated immediately before and during these dates, using highly practical content such as checklists, step-by-step instructions, and common error fixes. Helpdesk capacity and overflow procedures should also be aligned with these windows. By contrast, harmonization milestones are associated with slower shifts, especially in ecosystem queries, and are better supported through staggered guidance over several months, including rule clarification, edge cases, workflow updates, and vendor alignment [12,20,23,28]. Event-aware nowcasts and prediction intervals can support staffing calendars and service-level planning around visible milestones, whereas null effects for back-office milestones suggest limited need for front-line surge capacity unless additional communications are introduced. More broadly, a simple step-versus-slope lens can guide sequencing: pair rollout phases with app-centered calls to action and treat harmonization as a longer support window. Finally, Δ6 (in SVI points) and the composite attention index can serve as lightweight monitoring signals alongside slower operational indicators, such as tickets or onboarding volumes, to detect when attention is lower than expected and outreach may need adjustment [12,20,23,28].

7.2. For Business Managers and Software Vendors

Vendors should prepare short, time-bounded increases in support and in-product guidance around launch-type milestones, especially for first-use tasks such as invoice issuance, corrections, and submission. This may include temporary increases in chat/support staffing, date-specific prompts, and short workflow-based micro-guides. Because ecosystem attention shifts more gradually during harmonization, vendors can schedule deeper materials such as integration guides, compliance tutorials, and API examples over the following months and align releases accordingly. Shared event-aware forecasts can also help vendors and public authorities coordinate webinars, sandbox windows, and change freezes during high-uncertainty periods, reducing conflicting signals to users. Since platform/brand searches are broader and less event-sensitive, clear signposting and plain-language navigation remain important, particularly for less digitally literate users who may not search using app-specific terms.

8. Conclusions, Limitations, and Future Directions

This study examined whether prespecified milestones in Greece’s myDATA/e-invoicing rollout were associated with shifts in public search attention (RQ1), whether encoding those dates improves short-horizon forecasts of attention (RQ2), and which query families respond most strongly (RQ3). Using preregistered step/ramp indicators on monthly Google Trends data (2016–present) with HAC-robust inference and BH–FDR adjustment, we find a consistent ordering: app queries respond most clearly around launch-type milestones, ecosystem attention shifts more gradually, and platform terms are smaller and less regular; the back-office “central administration full” milestone is near-neutral. Event-aware models also improve out-of-sample nowcasts relative to a seasonal-naïve benchmark for some series and horizons, with the clearest gains in selected short-horizon cases [12,20,23,28]. Overall, policy timing appears to structure information-seeking in measurable ways, while search attention remains an intermediate signal rather than a compliance outcome.
These findings should be interpreted with caution. Google Trends captures attention, not adoption, compliance, or audit outcomes. Large percentage changes can partly reflect low baselines, so we prioritize SVI point effects in interpretation. The design is observational, and time-varying confounds may still coincide with milestones despite controls for trend, seasonality, and COVID pulses. Placebo results are therefore treated as stress tests, and non-trivial placebo significance reinforces a cautious, non-causal reading. In addition, GT scaling, stitching, and topic-versus-term differences may affect comparability, although we address these issues through validation and sensitivity checks.
Future research should link attention signals to administrative and behavioral outcomes such as helpdesk tickets, onboarding completion, active user counts, or e-invoice submissions to test whether search attention has measurable operational lead value. Richer designs could incorporate communication intensity and ecosystem activity, including media coverage, vendor releases, and professional association notices, in order to separate policy timing from concurrent narrative shocks. Higher-frequency data could also be used to examine anticipatory spikes and post-event decay, while subgroup analyses by region, industry, or user type could clarify who responds to which milestones [12,22,23]. Finally, applying the same preregistered framework to other digital tax reforms such as Making Tax Digital, SAF-T, or national e-invoicing mandates would test portability and help build comparative evidence across common rollout archetypes.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/accountaudit2020006/s1.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the author on request.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

Table A1. Event timing sensitivity: +2-month lag (Δ6, HAC = 6). Δ6 re-estimated after shifting each event forward by +2 months.
Event | aade (A2) | ααδε (A3) | τιμολόγιο (C1) | timologio (C2) | Electronic invoicing (D1)
b2g_phase1 | 39.0 *** | 58.1 *** | 9.2 | 33.5 ** | 28.9 **
b2g_rest_public | −6.3 | −15.4 | −11.6 *** | −3.7 | −99.3 ***
central_admin_full | −10.2 *** | −20.6 *** | −11.3 | −13.0 ** | 7.7 **
eu_b2b_authorisation | −46.0 *** | −60.9 *** | 12.9 | −39.2 * | −8.7
mydata_go_live | 2.7 | 15.3 *** | 1.8 | 39.0 *** | −9.9
vat_mydata_alignment | −37.8 *** | −38.9 *** | 10.6 | −29.3 * | 56.8 ***
Note. Estimates in SVI points with NW-HAC L = 6 SEs; BH-adjusted stars as above. Significance markers are reported as follows: * p < 0.05, ** p < 0.01, *** p < 0.001.
Table A2. Log-scale sensitivity: Δ6 as percentage change (HAC = 6). Six-month effects expressed on the log scale (approximate percentage change) for comparability across series.
Event | aade (A2) (log % Δ6) | ααδε (A3) (log % Δ6) | τιμολόγιο (C1) (log % Δ6) | timologio (C2) (log % Δ6) | Electronic invoicing (D1) (log % Δ6)
b2g_phase1 | −10.9% | −4.2% | 346.4% ** | 83.2% | 306.5% *
b2g_rest_public | −38.8% | −10.6% | 243.8% ** | −41.4% | −90.1% ***
central_admin_full | 52.2% | 34.5% | −30.0% | −28.0% | −29.7%
eu_b2b_authorisation | −19.5% | 2.2% | 87.5% | 26.9% | 137.9%
mydata_go_live | −51.0% | −53.4% * | 22.7% | 524.4% *** | −34.8%
vat_mydata_alignment | 12.8% | −18.6% | −84.1% ** | −53.7% | 111.9%
Note. Entries are percentage changes relative to pre-policy baseline. NW-HAC L = 6 SEs; BH-adjusted stars as above. Interpret as semi-elasticities at six months. Significance markers are reported as follows: * p < 0.05, ** p < 0.01, *** p < 0.001.
Table A3. STL-deseasoned outcomes: Δ6 in SVI points (HAC = 6). Δ6 re-estimated after removing seasonal components using STL.
Event | aade (A2) (STL Δ6) | ααδε (A3) (STL Δ6) | τιμολόγιο (C1) (STL Δ6) | timologio (C2) (STL Δ6) | Electronic invoicing (D1) (STL Δ6)
b2g_phase1 | 8.1 *** | 5.8 | 50.0 *** | 4.5 | 38.5 *
b2g_rest_public | −6.2 | 42.2 *** | 25.1 *** | −0.3 | −51.6 ***
central_admin_full | 7.2 *** | 16.3 *** | −17.6 *** | 7.9 | 1.8
eu_b2b_authorisation | −4.6 | −10.2 * | 5.7 | 19.2 *** | −24.0 *
mydata_go_live | 3.7 | 15.4 *** | 3.3 *** | 38.1 *** | 4.6
vat_mydata_alignment | −4.7 | −25.8 *** | −71.3 *** | −16.0 | 33.6 *
Note. Outcome is seasonally adjusted SVI; estimates in points. NW-HAC L = 6 SEs; BH-adjusted stars as above. Results mirror baseline signs and significance. Significance markers are reported as follows: * p < 0.05, ** p < 0.01, *** p < 0.001.
Table A4. Placebo events (−24 months): Δ6 in SVI points (HAC = 6). Falsification test shifting each event 24 months earlier.
Event | aade (A2) | ααδε (A3) | τιμολόγιο (C1) | timologio (C2) | Electronic invoicing (D1)
b2g_phase1 | −35.0 * | −4.0 | −1.7 | 58.6 *** | 22.9
b2g_rest_public | −42.3 *** | −98.8 *** | −8.8 | 21.4 ** | 49.9 ***
central_admin_full | 15.5 | 7.5 | 3.0 | 2.1 | −13.3
eu_b2b_authorisation | 23.4 *** | 10.2 | 14.8 * | 2.3 | 6.9
mydata_go_live | −2.1 | 4.9 | −6.1 * | −13.0 ** | 10.7 ***
vat_mydata_alignment | 90.0 *** | 71.3 *** | 12.9 | −54.4 *** | −55.5 **
Note. Estimates in SVI points with NW-HAC L = 6 SEs; BH-adjusted stars as above. Lack of coherent event family patterns and frequent sign reversals support the validity of the main timing results. Significance markers are reported as follows: * p < 0.05, ** p < 0.01, *** p < 0.001.
Table A5. Rolling-origin cross-validation estimate (MAE, RMSE) by horizon, model, and series.
Series | Model | h | MAE | RMSE | n_splits
Electronic invoicing (D1) | SNAIVE (12) | 1 | 6.333333 | 6.333333 | 12
Electronic invoicing (D1) | OLS + events | 1 | 9.301091 | 9.301091 | 12
Electronic invoicing (D1) | SARIMAX + events | 1 | 17.507054 | 17.507054 | 12
Electronic invoicing (D1) | OLS + events | 2 | 7.788950 | 8.543542 | 12
Electronic invoicing (D1) | SNAIVE (12) | 2 | 8.125000 | 9.506404 | 12
Electronic invoicing (D1) | SARIMAX + events | 2 | 18.832143 | 19.840365 | 12
Electronic invoicing (D1) | OLS + events | 3 | 9.084294 | 10.258194 | 12
Electronic invoicing (D1) | SNAIVE (12) | 3 | 10.416667 | 12.540671 | 12
Electronic invoicing (D1) | SARIMAX + events | 3 | 20.041784 | 21.423044 | 12
aade (A2) | OLS + events | 1 | 3.834758 | 3.834758 | 12
aade (A2) | SNAIVE (12) | 1 | 6.416667 | 6.416667 | 12
aade (A2) | SARIMAX + events | 1 | 19.416505 | 19.416505 | 12
aade (A2) | OLS + events | 2 | 6.379389 | 7.441202 | 12
aade (A2) | SNAIVE (12) | 2 | 7.812500 | 8.403695 | 12
aade (A2) | SARIMAX + events | 2 | 19.977884 | 20.918017 | 12
aade (A2) | OLS + events | 3 | 6.361970 | 7.528689 | 12
aade (A2) | SNAIVE (12) | 3 | 7.625000 | 8.806179 | 12
aade (A2) | SARIMAX + events | 3 | 20.225524 | 21.439155 | 12
timologio (C2) | OLS + events | 1 | 21.352019 | 21.352019 | 12
timologio (C2) | SNAIVE (12) | 1 | 21.666667 | 21.666667 | 12
timologio (C2) | SARIMAX + events | 1 | 28.928036 | 28.928036 | 12
timologio (C2) | OLS + events | 2 | 16.211093 | 17.569670 | 12
timologio (C2) | SNAIVE (12) | 2 | 18.833333 | 20.247061 | 12
timologio (C2) | SARIMAX + events | 2 | 30.351804 | 32.357839 | 12
timologio (C2) | OLS + events | 3 | 14.347414 | 16.478155 | 12
timologio (C2) | SNAIVE (12) | 3 | 16.944444 | 18.993470 | 12
timologio (C2) | SARIMAX + events | 3 | 29.067110 | 31.414501 | 12
ααδε (A3) | OLS + events | 1 | 6.035068 | 6.035068 | 12
ααδε (A3) | SNAIVE (12) | 1 | 11.958333 | 11.958333 | 12
ααδε (A3) | SARIMAX + events | 1 | 32.576393 | 32.576393 | 12
ααδε (A3) | OLS + events | 2 | 8.585137 | 9.413745 | 12
ααδε (A3) | SNAIVE (12) | 2 | 12.104167 | 12.662202 | 12
ααδε (A3) | SARIMAX + events | 2 | 33.114341 | 33.752481 | 12
ααδε (A3) | OLS + events | 3 | 8.719237 | 9.830565 | 12
ααδε (A3) | SNAIVE (12) | 3 | 12.763889 | 13.671554 | 12
ααδε (A3) | SARIMAX + events | 3 | 34.515184 | 35.282245 | 12
τιμολόγιο (C1) | OLS + events | 1 | 7.880146 | 7.880146 | 12
τιμολόγιο (C1) | SNAIVE (12) | 1 | 8.250000 | 8.250000 | 12
τιμολόγιο (C1) | SARIMAX + events | 1 | 14.249626 | 14.249626 | 12
τιμολόγιο (C1) | SNAIVE (12) | 2 | 6.833333 | 7.656650 | 12
τιμολόγιο (C1) | OLS + events | 2 | 7.134803 | 7.560648 | 12
τιμολόγιο (C1) | SARIMAX + events | 2 | 14.626164 | 15.539176 | 12
τιμολόγιο (C1) | SNAIVE (12) | 3 | 5.444444 | 6.618365 | 12
τιμολόγιο (C1) | OLS + events | 3 | 6.072613 | 6.723279 | 12
τιμολόγιο (C1) | SARIMAX + events | 3 | 13.129236 | 14.150966 | 12
Note. Errors are averages across splits; lower is better. SVI is on Google’s 0–100 scale.
Table A6. Series × horizon accuracy winners with percent MAE improvement over SNAIVE (12).
Series | Model | h | MAE | RMSE | n_splits
Electronic invoicing (D1) | SNAIVE (12) | 1 | 6.333333 | 6.333333 | 12
Electronic invoicing (D1) | OLS + events | 2 | 7.788950 | 8.543542 | 12
Electronic invoicing (D1) | OLS + events | 3 | 9.084294 | 10.258194 | 12
aade (A2) | OLS + events | 1 | 3.834758 | 3.834758 | 12
aade (A2) | OLS + events | 2 | 6.379389 | 7.441202 | 12
aade (A2) | OLS + events | 3 | 6.361970 | 7.528689 | 12
timologio (C2) | OLS + events | 1 | 21.352019 | 21.352019 | 12
timologio (C2) | OLS + events | 2 | 16.211093 | 17.569670 | 12
timologio (C2) | OLS + events | 3 | 14.347414 | 16.478155 | 12
ααδε (A3) | OLS + events | 1 | 6.035068 | 6.035068 | 12
ααδε (A3) | OLS + events | 2 | 8.585137 | 9.413745 | 12
ααδε (A3) | OLS + events | 3 | 8.719237 | 9.830565 | 12
τιμολόγιο (C1) | OLS + events | 1 | 7.880146 | 7.880146 | 12
τιμολόγιο (C1) | SNAIVE (12) | 2 | 6.833333 | 7.656650 | 12
τιμολόγιο (C1) | SNAIVE (12) | 3 | 5.444444 | 6.618365 | 12
Note. Positive percentages indicate lower MAE than SNAIVE (12).
Table A7. Cross-validation summary relative to the SNAIVE (12) baseline (MAE_base and % improvement).

| | Series | Model | h | MAE | RMSE | n_splits | MAE_base | MAE_improv_% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Electronic invoicing (D1) | SNAIVE (12) | 1 | 6.333333 | 6.333333 | 12 | 6.333333 | 0.000000 |
| 1 | Electronic invoicing (D1) | OLS + events | 1 | 9.301091 | 9.301091 | 12 | 6.333333 | −46.859334 |
| 2 | Electronic invoicing (D1) | SARIMAX + events | 1 | 17.507054 | 17.507054 | 12 | 6.333333 | −176.427175 |
| 3 | Electronic invoicing (D1) | OLS + events | 2 | 7.788950 | 8.543542 | 12 | 8.125000 | 4.135998 |
| 4 | Electronic invoicing (D1) | SNAIVE (12) | 2 | 8.125000 | 9.506404 | 12 | 8.125000 | 0.000000 |
| 5 | Electronic invoicing (D1) | SARIMAX + events | 2 | 18.832143 | 19.840365 | 12 | 8.125000 | −131.780225 |
| 6 | Electronic invoicing (D1) | OLS + events | 3 | 9.084294 | 10.258194 | 12 | 10.416667 | 12.790781 |
| 7 | Electronic invoicing (D1) | SNAIVE (12) | 3 | 10.416667 | 12.540671 | 12 | 10.416667 | 0.000000 |
| 8 | Electronic invoicing (D1) | SARIMAX + events | 3 | 20.041784 | 21.423044 | 12 | 10.416667 | −92.401126 |
| 9 | aade (A2) | OLS + events | 1 | 3.834758 | 3.834758 | 12 | 6.416667 | 40.237541 |
| 10 | aade (A2) | SNAIVE (12) | 1 | 6.416667 | 6.416667 | 12 | 6.416667 | 0.000000 |
| 11 | aade (A2) | SARIMAX + events | 1 | 19.416505 | 19.416505 | 12 | 6.416667 | −202.594890 |
| 12 | aade (A2) | OLS + events | 2 | 6.379389 | 7.441202 | 12 | 7.812500 | 18.343818 |
| 13 | aade (A2) | SNAIVE (12) | 2 | 7.812500 | 8.403695 | 12 | 7.812500 | 0.000000 |
| 14 | aade (A2) | SARIMAX + events | 2 | 19.977884 | 20.918017 | 12 | 7.812500 | −155.716914 |
| 15 | aade (A2) | OLS + events | 3 | 6.361970 | 7.528689 | 12 | 7.625000 | 16.564331 |
| 16 | aade (A2) | SNAIVE (12) | 3 | 7.625000 | 8.806179 | 12 | 7.625000 | 0.000000 |
| 17 | aade (A2) | SARIMAX + events | 3 | 20.225524 | 21.439155 | 12 | 7.625000 | −165.252777 |
| 18 | timologio (C2) | OLS + events | 1 | 21.352019 | 21.352019 | 12 | 21.666667 | 1.452219 |
| 19 | timologio (C2) | SNAIVE (12) | 1 | 21.666667 | 21.666667 | 12 | 21.666667 | 0.000000 |
| 20 | timologio (C2) | SARIMAX + events | 1 | 28.928036 | 28.928036 | 12 | 21.666667 | −33.514011 |
| 21 | timologio (C2) | OLS + events | 2 | 16.211093 | 17.569670 | 12 | 18.833333 | 13.923400 |
| 22 | timologio (C2) | SNAIVE (12) | 2 | 18.833333 | 20.247061 | 12 | 18.833333 | 0.000000 |
| 23 | timologio (C2) | SARIMAX + events | 2 | 30.351804 | 32.357839 | 12 | 18.833333 | −61.160021 |
| 24 | timologio (C2) | OLS + events | 3 | 14.347414 | 16.478155 | 12 | 16.944444 | 15.326736 |
| 25 | timologio (C2) | SNAIVE (12) | 3 | 16.944444 | 18.993470 | 12 | 16.944444 | 0.000000 |
| 26 | timologio (C2) | SARIMAX + events | 3 | 29.067110 | 31.414501 | 12 | 16.944444 | −71.543601 |
| 27 | ααδε (A3) | OLS + events | 1 | 6.035068 | 6.035068 | 12 | 11.958333 | 49.532535 |
| 28 | ααδε (A3) | SNAIVE (12) | 1 | 11.958333 | 11.958333 | 12 | 11.958333 | 0.000000 |
| 29 | ααδε (A3) | SARIMAX + events | 1 | 32.576393 | 32.576393 | 12 | 11.958333 | −172.415826 |
| 30 | ααδε (A3) | OLS + events | 2 | 8.585137 | 9.413745 | 12 | 12.104167 | 29.072875 |
| 31 | ααδε (A3) | SNAIVE (12) | 2 | 12.104167 | 12.662202 | 12 | 12.104167 | 0.000000 |
| 32 | ααδε (A3) | SARIMAX + events | 2 | 33.114341 | 33.752481 | 12 | 12.104167 | −173.578033 |
| 33 | ααδε (A3) | OLS + events | 3 | 8.719237 | 9.830565 | 12 | 12.763889 | 31.688238 |
| 34 | ααδε (A3) | SNAIVE (12) | 3 | 12.763889 | 13.671554 | 12 | 12.763889 | 0.000000 |
| 35 | ααδε (A3) | SARIMAX + events | 3 | 34.515184 | 35.282245 | 12 | 12.763889 | −170.412762 |
| 36 | τιμολόγιο (C1) | OLS + events | 1 | 7.880146 | 7.880146 | 12 | 8.250000 | 4.483073 |
| 37 | τιμολόγιο (C1) | SNAIVE (12) | 1 | 8.250000 | 8.250000 | 12 | 8.250000 | 0.000000 |
| 38 | τιμολόγιο (C1) | SARIMAX + events | 1 | 14.249626 | 14.249626 | 12 | 8.250000 | −72.722739 |
| 39 | τιμολόγιο (C1) | SNAIVE (12) | 2 | 6.833333 | 7.656650 | 12 | 6.833333 | 0.000000 |
| 40 | τιμολόγιο (C1) | OLS + events | 2 | 7.134803 | 7.560648 | 12 | 6.833333 | −4.411757 |
| 41 | τιμολόγιο (C1) | SARIMAX + events | 2 | 14.626164 | 15.539176 | 12 | 6.833333 | −114.041425 |
| 42 | τιμολόγιο (C1) | SNAIVE (12) | 3 | 5.444444 | 6.618365 | 12 | 5.444444 | 0.000000 |
| 43 | τιμολόγιο (C1) | OLS + events | 3 | 6.072613 | 6.723279 | 12 | 5.444444 | −11.537782 |
| 44 | τιμολόγιο (C1) | SARIMAX + events | 3 | 13.129236 | 14.150966 | 12 | 5.444444 | −141.149230 |

Note. n_splits = 12 for all cells; metrics computed on held-out folds from the rolling-origin procedure.
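
A minimal sketch of how rolling-origin folds of this kind can be generated, together with the seasonal-naive benchmark. The step of 6 months and the 48-month maximum training window follow the design stated in the captions of Figures 7 and 8; the function names, the 0-based month indexing, and the series length used in the example are illustrative assumptions, not the author's code.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n_obs: int,
    first_origin: int,
    step: int = 6,
    max_train: int = 48,
    horizon: int = 3,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for a blocked rolling-origin design.

    Indices are 0-based month positions. `first_origin` is the position of
    the first forecast origin (the last month available for training).
    """
    origin = first_origin
    while origin + horizon < n_obs:
        # Cap the training window at `max_train` months ending at the origin.
        train_start = max(0, origin - max_train + 1)
        train_idx = list(range(train_start, origin + 1))
        test_idx = list(range(origin + 1, origin + 1 + horizon))
        yield train_idx, test_idx
        origin += step


def snaive_forecast(history: List[float], horizon: int, season: int = 12) -> List[float]:
    """Seasonal-naive forecast: reuse the value observed `season` months earlier."""
    return [history[len(history) - season + (k % season)] for k in range(horizon)]


# Assumed example: a monthly series starting 2016-01 with 116 observations,
# so that 2018-01 sits at position 24.
splits = list(rolling_origin_splits(n_obs=116, first_origin=24))
```

Each fold's errors at h = 1, 2, 3 would then be averaged across folds to obtain the MAE and RMSE columns.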
Table A8. Rolling-origin cross-validation results.

| | Series | Model | h | MAE | RMSE | sMAPE | MASE | PI95_cov | n_splits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Electronic invoicing (D1) | BLEND (OLS, SNAIVE) | 1 | 7.789796 | 7.789796 | 88.699912 | 1.092261 | 0.733333 | 15 |
| 1 | Electronic invoicing (D1) | BLEND (OLS, SNAIVE) | 2 | 7.951671 | 8.619382 | 64.402244 | 1.118063 | 0.733333 | 15 |
| 2 | Electronic invoicing (D1) | BLEND (OLS, SNAIVE) | 3 | 9.984889 | 11.625200 | 70.919535 | 1.378135 | 0.688889 | 15 |
| 3 | aade (A2) | BLEND (AR1, SNAIVE) | 1 | 4.463388 | 4.463388 | 21.301319 | 0.462340 | 0.933333 | 15 |
| 4 | aade (A2) | BLEND (AR1, SNAIVE) | 2 | 6.371356 | 7.334841 | 24.445857 | 0.661681 | 0.833333 | 15 |
| 5 | aade (A2) | BLEND (AR1, SNAIVE) | 3 | 6.228173 | 7.393890 | 24.254338 | 0.647433 | 0.822222 | 15 |
| 6 | timologio (C2) | BLEND (OLS, SNAIVE) | 1 | 15.662447 | 15.662447 | 67.273160 | 4.132944 | 0.600000 | 15 |
| 7 | timologio (C2) | BLEND (OLS, SNAIVE) | 2 | 12.511016 | 14.178881 | 58.088125 | 3.035360 | 0.666667 | 15 |
| 8 | timologio (C2) | BLEND (OLS, SNAIVE) | 3 | 10.883420 | 12.965520 | 55.545402 | 2.502564 | 0.666667 | 15 |
| 9 | ααδε (A3) | OLS + events + AR1 | 1 | 7.157816 | 7.157816 | 19.057649 | 0.502178 | 0.933333 | 15 |
| 10 | ααδε (A3) | BLEND (AR1, SNAIVE) | 2 | 8.715877 | 9.359305 | 22.931140 | 0.625436 | 0.866667 | 15 |
| 11 | ααδε (A3) | BLEND (AR1, SNAIVE) | 3 | 8.835019 | 9.886419 | 22.774981 | 0.648328 | 0.844444 | 15 |
| 12 | τιμολόγιο (C1) | BLEND (OLS, SNAIVE) | 1 | 7.502383 | 7.502383 | 52.296183 | 2.200247 | 0.600000 | 15 |
| 13 | τιμολόγιο (C1) | BLEND (OLS, SNAIVE) | 2 | 6.702038 | 7.409867 | 45.758427 | 1.855730 | 0.566667 | 15 |
| 14 | τιμολόγιο (C1) | SNAIVE (12) | 3 | 5.488889 | 6.632907 | 41.888791 | 1.542399 | 0.666667 | 15 |

Note. For each series and horizon, the table reports the winning specification and its accuracy/uncertainty metrics: MAE, RMSE, sMAPE, MASE, and empirical 95% predictive-interval coverage (PI95_cov). Blends dominate most cells (e.g., BLEND (AR1, SNAIVE) for A2 and A3; BLEND (OLS, SNAIVE) for C2 and C1 at short horizons), while SNAIVE (12) remains the winner only for C1 at h = 3. These outcomes mirror the improvement heatmap and the MAE bar patterns.
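
The note lists sMAPE, MASE, and PI95_cov without giving formulas. The sketch below uses common textbook definitions (sMAPE with the mean of absolute actual and forecast in the denominator; MASE scaled by the in-sample seasonal-naive MAE at lag 12; coverage as the share of actuals inside the interval). Whether the paper used exactly these variants is an assumption, and all function names are illustrative.

```python
from typing import Sequence


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE in percent: mean of |a - f| / ((|a| + |f|) / 2)."""
    terms = [
        abs(a - f) / ((abs(a) + abs(f)) / 2.0)
        for a, f in zip(actual, forecast)
        if (abs(a) + abs(f)) > 0  # skip undefined 0/0 terms
    ]
    return 100.0 * sum(terms) / len(terms)


def mase(
    actual: Sequence[float],
    forecast: Sequence[float],
    train: Sequence[float],
    season: int = 12,
) -> float:
    """MAE of the forecast divided by the in-sample seasonal-naive MAE."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    scale_terms = [abs(train[t] - train[t - season]) for t in range(season, len(train))]
    scale = sum(scale_terms) / len(scale_terms)
    return mae / scale


def pi_coverage(
    actual: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Share of actual values that fall inside [lower, upper]."""
    hits = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return hits / len(actual)
```

Under these definitions, a MASE below 1 means the model beats the in-sample seasonal-naive error, and a PI95_cov well below 0.95 means the nominal 95% intervals are too narrow.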

References

  1. Alexopoulos, T.A.; Thompson, H. A macroeconomic simulation for Greece in the wake of its government debt crisis. Econ. Change Restruct. 2021, 54, 699–716. [Google Scholar] [CrossRef]
  2. Boikos, S.; Makantasi, E.; Panagiotidis, T. Macroeconomic Uncertainty Indices for European Countries. Notas Econ. 2023, 2023, 7–56. [Google Scholar] [CrossRef] [PubMed]
  3. Box, G.E.P.; Tiao, G.C. Intervention analysis with applications to economic and environmental problems. J. Am. Stat. Assoc. 1975, 70, 70–79. [Google Scholar] [CrossRef]
  4. Choi, H.; Varian, H. Predicting the Present with Google Trends. Econ. Rec. 2012, 88, 2–9. [Google Scholar] [CrossRef]
  5. Cohen, B.C. Press and Foreign Policy; Princeton University Press: Princeton, NJ, USA, 2015. [Google Scholar] [CrossRef]
  6. Da, Z.; Engelberg, J.; Gao, P. In Search of Attention. J. Financ. 2011, 66, 1461–1499. [Google Scholar] [CrossRef]
  7. Dokas, I.; Oikonomou, G.; Panagiotidis, M.; Spyromitros, E. Macroeconomic and Uncertainty Shocks’ Effects on Energy Prices: A Comprehensive Literature Review. Energies 2023, 16, 1491. [Google Scholar] [CrossRef]
  8. E-Invoicing Compliance in Greece|Pagero. Available online: https://www.pagero.com/compliance/regulatory-updates/greece (accessed on 27 August 2025).
  9. Erokhin, D.; Komendantova, N. Analyzing Public Interest in Geohazards Using Google Trends Data. Geosciences 2024, 14, 266. [Google Scholar] [CrossRef]
  10. Ferguson, D.; Meyer, F.G. Probability density estimation for sets of large graphs with respect to spectral information using stochastic block models. arXiv 2022, arXiv:2207.02168v1. [Google Scholar] [CrossRef]
  11. Gelman, A. Scaling regression inputs by dividing by two standard deviations. Stat. Med. 2008, 27, 2865–2873. [Google Scholar] [CrossRef]
  12. Ghosh, A.; E-Roub, F.; Krishnan, N.C.; Choudhury, S.; Basu, A. Can google trends search inform us about the population response and public health impact of abrupt change in alcohol policy?—A case study from India during the COVID-19 pandemic. Int. J. Drug Policy 2021, 87, 102984. [Google Scholar] [CrossRef] [PubMed]
  13. Ginsberg, J.; Mohebbi, M.H.; Patel, R.S.; Brammer, L.; Smolinski, M.S.; Brilliant, L. Detecting influenza epidemics using search engine query data. Nature 2009, 457, 1012–1014. [Google Scholar] [CrossRef] [PubMed]
  14. Goel, S.; Hofman, J.M.; Lahaie, S.; Pennock, D.M.; Watts, D.J. Predicting consumer behavior with web search. Proc. Natl. Acad. Sci. USA 2010, 107, 17486–17490. [Google Scholar] [CrossRef] [PubMed]
  15. Greece: B2B and B2G Electronic Invoicing via MyData|EDICOM Global. Available online: https://edicomgroup.com/blog/greece-mandatory-electronic-invoice?utm_source=chatgpt.com (accessed on 27 August 2025).
  16. Greece: Formal EU Approval for B2B E-Invoicing Mandate Published|Sovos. Available online: https://sovos.com/regulatory-updates/vat/greece-formal-eu-approval-for-b2b-e-invoicing-mandate-published/?utm_source=chatgpt.com (accessed on 27 August 2025).
  17. Harvey, A.C. Forecasting, Structural Time Series Models and the Kalman Filter; Cambridge University Press: Cambridge, UK, 1990. [Google Scholar] [CrossRef]
  18. Heinemann, M.; Stiller, W. Digitalization and cross-border tax fraud: Evidence from e-invoicing in Italy. Int. Tax Public Financ. 2025, 32, 195–237. [Google Scholar] [CrossRef]
  19. Hölzl, J.; Keusch, F.; Sajons, C. The (mis)use of Google Trends data in the social sciences—A systematic review, critique, and recommendations. Soc. Sci. Res. 2025, 126, 103099. [Google Scholar] [CrossRef]
  20. Hyytinen, A.; Tuimala, J.; Hammar, M. Enhancing the adoption of digital public services: Evidence from a large-scale field experiment. Gov. Inf. Q. 2022, 39, 101687. [Google Scholar] [CrossRef]
  21. Implementing Decision—EU—2025/502—EN—EUR-Lex. Available online: https://eur-lex.europa.eu/eli/dec_impl/2025/502/oj/eng?utm_source=chatgpt.com (accessed on 27 August 2025).
  22. Linnell, K.; Fudolig, M.; Schwartz, A.; Ricketts, T.H.; O’Neil-Dunne, J.P.M.; Dodds, P.S.; Danforth, C.M. Spatial changes in park visitation at the onset of the pandemic. PLoS Glob. Public Health 2022, 2, e0000766. [Google Scholar] [CrossRef]
  23. Makridakis, S.; Spiliotis, E.; Assimakopoulos, V. The M4 Competition: Results, findings, conclusion and way forward. Int. J. Forecast. 2018, 34, 802–808. [Google Scholar] [CrossRef]
  24. Mccombs, M.E.; Shaw, D.L. The Agenda-Setting function of mass media. Public Opin. Q. 1972, 36, 176–187. [Google Scholar] [CrossRef]
  25. New Reporting System for Greek VAT—Taxand. Available online: https://www.taxand.com/our-thinking/insights/new-reporting-system-for-greek-vat/?utm_source=chatgpt.com (accessed on 27 August 2025).
  26. Fiscal Solutions. New Rules for Data Transmission to the myDATA Platform in Greece Have Recently Been Published. Available online: https://www.fiscal-requirements.com/news/2702 (accessed on 27 August 2025).
  27. Ozier, D.; Rafiq, T.; de Souza, R.J.; Singh, S.M. Use of Sacubitril/Valsartan Prior to Primary Prevention Implantable Cardioverter Defibrillator Implantation. CJC Open 2023, 5, 93–98. [Google Scholar] [CrossRef]
  28. Papagianni, E.; Evgenidis, A.; Tsagkanos, A.; Megalooikonomou, V. Tourism Demand in the Face of Geopolitical Risk: Insights From a Cross-Country Analysis. J. Travel Res. 2024, 63, 2094–2119. [Google Scholar] [CrossRef]
  29. Reigl, N. Noise shocks and business cycle fluctuations in three major European Economies. Empir. Econ. 2023, 64, 603–657. [Google Scholar] [CrossRef]
  30. Safitri, K. Tax Policy Innovations for Enhancing MSMEs Compliance and Economic Resilience. Int. J. Bus. Appl. Econ. 2025, 4, 769–784. [Google Scholar] [CrossRef]
  31. Shen, L.; Sun, M.; Song, S.; Hu, Q.; Wang, N.; Ou, G.; Guo, Z.; Du, J.; Shao, Z.; Bai, Y.; et al. The impact of anti-COVID-19 nonpharmaceutical interventions on hand, foot, and mouth disease—A spatiotemporal perspective in Xi’an, northwestern China. J. Med. Virol. 2022, 94, 3121–3132. [Google Scholar] [CrossRef]
  32. Simionescu, M.; Schneider, N. Monetary shocks and production network in the G7 countries. J. Econ. Struct. 2023, 12, 20. [Google Scholar] [CrossRef]
  33. Tsamis, G.; Evangelos, G.; Papakostas, A.; Vassiliou, G.; Grafanakis, M.; Garefalakis, A.; Vassalos, M.; Mylona, A.; Papadakis, N. Cost-Effective Design, Content Management System Implementation and Artificial Intelligence Support of Greek Government AADE, myDATA Web Service for Generic Government Infrastructure, a Complete Analysis. Algorithms 2025, 18, 339. [Google Scholar] [CrossRef]
  34. Tsitouras, A.; Papapanagos, H. Factors Influencing Income Inequality and Inclusive Growth in Greece: A Long-Run and Short-Run Analysis. J. Knowl. Econ. 2025, 17, 2889–2919. [Google Scholar] [CrossRef]
  35. Tu, T.; Chhatralia, K.; Maguire, K.; Tipping, S. HM Revenue and Customs Research Report 480 Making Tax Digital for Business: Survey of Small Businesses and Landlords; Research Report for HMRC; HM Revenue & Customs: London, UK, 2017.
  36. Downs, A. Up and Down with Ecology: The “Issue-Attention Cycle”. In Agenda Setting; Routledge: Oxfordshire, UK, 2016; pp. 27–33. Available online: https://www.taylorfrancis.com/chapters/edit/10.4324/9781315538389-4/ecology-issue-attention-cycle-anthony-downs (accessed on 27 August 2025).
  37. Yu, C.; Li, Y. Digitalization of tax collection and enterprises’ social security compliance. Int. Tax Public Financ. 2025, 32, 1213–1252. [Google Scholar] [CrossRef]
Figure 1. myData policy timeline.
Figure 2. Event impacts at six months (Δ6) with HAC-robust 95% confidence intervals. Dumbbell plot of the estimated relative change in Google Trends search volume index (SVI; 0–100 scale) six months following each policy event (vertical line at 0 = no change). Points represent each estimate per query series (color-coded), and horizontal bars represent HAC-robust 95% CIs. Positive values represent greater search interest than the counterfactual; negative values represent decreased interest.
Figure 3. Event impacts at six months (Δ6) by family (95% HAC CIs). Panel plots repeating the Δ6 estimates from Figure 2 but grouped by construct: A (platform queries), C (application queries), and D (ecosystem query). Within each panel, points and 95% HAC CIs are shown for the relevant series only. The vertical line marks zero impact.
Figure 4. Rolling-origin CV MAE by model and horizon (faceted by series). Mean absolute error (MAE; Google Trends 0–100 scale) for SNAIVE (12) and OLS + events at horizons h = 1, 2, 3 months, computed from blocked rolling-origin cross-validation. Lower bars indicate better accuracy.
Figure 5. Heatmap of relative MAE change by series and horizon, computed as $100 \times \left(1 - \mathrm{MAE}_{\text{OLS + events}} / \mathrm{MAE}_{\text{SNAIVE(12)}}\right)$. Green (positive) values show OLS + events outperforming the seasonal-naïve baseline; red (negative) values show the reverse.
Figure 6. ααδε (A3): Example of three-step nowcast with 95% prediction intervals (OLS + events). SVI is on Google’s 0–100 scale. PIs are based on out-of-sample residual variance from the rolling-origin setup used in RQ2.
Figure 7. Rolling-origin backtest paths (h = 1) by series. Out-of-sample one-step-ahead predictions for the blocked rolling-origin design (origins from 2018-01 in 6-month increments; 48-month maximum training window) are plotted for each target series.
Figure 8. Best model vs. SNAIVE (12): percentage MAE improvement (rolling CV). Heatmap shows the percentage reduction in MAE of the best model at each series–horizon cell relative to SNAIVE (12) under rolling-origin CV (origins from 2018-01, step = 6 months, max train = 48 months; h = 1–3). Green shades indicate improvements; values are labeled as $100 \times \left(1 - \mathrm{MAE}_{\text{best}} / \mathrm{MAE}_{\text{SNAIVE(12)}}\right)$. “Best” is chosen among OLS + events, AR (1) variants, SARIMAX + events, and equal-weight blends (e.g., BLEND (OLS, SNAIVE), BLEND (AR1, SNAIVE)).
Figure 9. Example for aade (A2): h = 1–3-month forecast—BLEND (OLS + AR1, SNAIVE) with prediction cones. The orange line shows the 1–3-month point forecast from the equal-weight blend of OLS + events + AR (1) and SNAIVE (12); shaded bands indicate 50%, 80%, and 90% prediction intervals. The dashed vertical line marks the forecast start, and the dotted green line shows the SNAIVE (12) benchmark.
Figure 10. Example of backtest path (h = 1): actual vs. OLS + events vs. SNAIVE (12) for ααδε (A3).
Table 1. Variables and definitions.

| Variable | Series/Definition |
| --- | --- |
| mydata (A1) | GT SVI for “mydata” (platform term) |
| aade (A2) | GT SVI for “aade” (platform term; priority outcome) |
| ααδε (A3) | GT SVI for “ααδε” (Greek script; platform; priority outcome) |
| ηλεκτρονικα βιβλια (B1) | GT SVI for “ηλεκτρονικα βιβλια” (books/e-books; ancillary) |
| ηλεκτρονικά βιβλία (B2) | GT SVI for “ηλεκτρονικά βιβλία” (diacritics variant; ancillary) |
| ηλεκτρονικα βιβλια ααδε (B3) | GT SVI for “ηλεκτρονικα βιβλια ααδε” (ancillary) |
| τιμολόγιο (C1) | GT SVI for “τιμολόγιο” (invoicing; app family; priority) |
| timologio (C2) | GT SVI for “timologio” (Latin script variant; app; priority) |
| ηλεκτρονικό τιμολόγιο (C3) | GT SVI for “ηλεκτρονικό τιμολόγιο” (app/feature; ancillary) |
| Electronic invoicing (topic) (D1) | GT SVI topic for “Electronic invoicing” (ecosystem; priority) |
| e-invoicing (D2) | GT SVI for “e-invoicing” (ecosystem; ancillary) |
| peppol (D3) | GT SVI for “peppol” (standard; ecosystem; ancillary) |
| Composite attention index | z-score mean of the five priority series (A2, A3, C1, C2, D1); PCA (1) used in robustness |
| Policy events: step dummies | $S_t^{(e)} = \mathbf{1}\{t \ge T_e\}$ for each preregistered milestone |
| Policy events: slope ramps | $\mathrm{POST}_t^{(e)} = \max(0,\, t - T_e)$ in months for each milestone (as preregistered) |
| Calendar dummies (m2, …, m12) | Month fixed effects with January omitted |
| quarter_end | Indicator for calendar quarter end; excluded in main (collinearity), retained in robustness |

Notes: All GT series are monthly SVI values in 0–100 units. Values of “<1” are recoded to 0.5. The main analysis window is 2016–latest; 2004+ is used for specified robustness checks.
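
The event regressors and the “<1” recoding described in Table 1 can be sketched as below. The step dummy switches on from the event month onward and the ramp counts months elapsed since the event; the `(year, month)` representation, the helper names, and the example event date are hypothetical choices for illustration.

```python
from typing import Tuple, Union

Month = Tuple[int, int]  # (year, month), e.g. (2021, 11)


def month_index(m: Month) -> int:
    """Map (year, month) to a running integer month count."""
    year, month = m
    return year * 12 + (month - 1)


def step_dummy(t: Month, event: Month) -> int:
    """Step dummy: 1 if month t is at or after the event month, else 0."""
    return 1 if month_index(t) >= month_index(event) else 0


def post_ramp(t: Month, event: Month) -> int:
    """Slope ramp: months elapsed since the event, floored at 0."""
    return max(0, month_index(t) - month_index(event))


def recode_svi(raw: Union[str, int, float]) -> float:
    """Recode Google Trends '<1' cells to 0.5; pass numeric values through."""
    if isinstance(raw, str) and raw.strip() == "<1":
        return 0.5
    return float(raw)


# Hypothetical event month, used only to show the shapes of the regressors.
event = (2021, 11)
row = (step_dummy((2022, 2), event), post_ramp((2022, 2), event))
```

Six months after an event the step equals 1 and the ramp equals 6, which is why a six-month effect combines one level coefficient with six times the slope coefficient.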
Table 2. Six-month post-event change (Δ6) in search volume index (SVI) points by series and event (BH-adjusted significance).
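
| | Event | Electronic Invoicing (D1) | aade (A2) | timologio (C2) | ααδε (A3) | τιμολόγιο (C1) |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | b2g_phase1 | 28.8 | 3.2 | 23.8 ** | 18.1 | 33.6 ** |
| 1 | b2g_rest_public | −52.7 *** | −22.2 | 8.3 | 5.2 | 29.7 *** |
| 2 | central_admin_full | 1.9 | 10.4 | −3.4 | 6.4 | −10.4 |
| 3 | eu_b2b_authorisation | 14.2 | −6.7 | −1.0 | −4.4 | 8.2 * |
| 4 | mydata_go_live | 4.6 | 3.2 | 38.4 *** | 15.1 * | 3.4 * |
| 5 | vat_mydata_alignment | 46.4 ** | 0.9 | −34.3 ** | −34.2 | −43.1 * |
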
Note: Δ6 is the average change in SVI over months t = +1, …, +6 relative to the pre-event path. Stars denote BH-adjusted significance: p < 0.05 (*), p < 0.01 (**), p < 0.001 (***). Positive values indicate increases in search interest.
Table 3. Six-month post-event change expressed as percentage of each series’ 2019–2020 mean (context for magnitude).
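
| | Event | Electronic Invoicing (D1) | aade (A2) | timologio (C2) | ααδε (A3) | τιμολόγιο (C1) |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | b2g_phase1 | 198.2% | 19.5% | 312.5% | 66.6% | 363.0% |
| 1 | b2g_rest_public | −362.3% | −134.3% | 108.3% | 19.1% | 321.3% |
| 2 | central_admin_full | 13.0% | 63.0% | −45.0% | 23.5% | −112.7% |
| 3 | eu_b2b_authorisation | 97.4% | −40.6% | −13.2% | −16.0% | 88.6% |
| 4 | mydata_go_live | 31.6% | 19.3% | 504.2% | 55.7% | 36.7% |
| 5 | vat_mydata_alignment | 318.9% | 5.2% | −449.4% | −126.0% | −466.2% |
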
Note. Percentage values scale Δ6 by the series’ mean SVI during 2019–2020 to enable cross-series comparisons. Signs retain the direction of the Δ6 effect (e.g., −466% reflects a large decline relative to baseline).
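
The scaling in this note is plain arithmetic, and inverting it shows how much the baseline level drives the percentages. The sketch below takes the b2g_phase1 entries for two series from Tables 2 and 3 and back-computes the baseline mean each pair implies; the function names are illustrative, and the implied baselines are approximate because the published inputs are rounded.

```python
def pct_of_baseline(delta6: float, baseline_mean: float) -> float:
    """Express a Δ6 value (SVI points) as a percentage of the baseline mean SVI."""
    return 100.0 * delta6 / baseline_mean


def implied_baseline(delta6: float, pct: float) -> float:
    """Invert the scaling: baseline mean implied by a (Δ6, percent) pair."""
    return 100.0 * delta6 / pct


# b2g_phase1 entries: (Δ6 in SVI points from Table 2, percent from Table 3).
pairs = {
    "τιμολόγιο (C1)": (33.6, 363.0),
    "ααδε (A3)": (18.1, 66.6),
}
baselines = {name: implied_baseline(d, p) for name, (d, p) in pairs.items()}
```

The baseline implied for C1 is under 10 SVI points while the one for A3 is close to 27, so a point change of similar size translates into a much larger percentage for C1.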
Table 4. Baseline Δ6 (SVI points) by event and construct (HAC = 6). Six-month post-event changes in search interest across platform queries (A2/A3), application terms (C1/C2), and the ecosystem term (D1).
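
| Event | aade (A2) | ααδε (A3) | τιμολόγιο (C1) | timologio (C2) | Electronic Invoicing (D1) |
| --- | --- | --- | --- | --- | --- |
| mydata_go_live | 3.2 | 15.1 | 3.4 | 38.4 *** | 4.6 |
| b2g_phase1 | 3.2 | 18.1 | 33.6 * | 23.8 * | 28.8 |
| central_admin_full | 10.4 | 6.4 | −10.4 | −3.4 | 1.9 |
| vat_mydata_alignment | 0.9 | −34.2 | −43.1 | −34.3 * | 46.4 * |
| b2g_rest_public | −22.2 | 5.2 | 29.7 ** | 8.3 | −52.7 *** |
| eu_b2b_authorisation | −6.7 | −4.4 | 8.2 | −1.0 | 14.2 |
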
Note. Entries are Δ6 in SVI points (0–100). Positive = increase; negative = decrease. Newey–West HAC SEs with L = 6 months. Benjamini–Hochberg (BH)-adjusted results within construct: * p < 0.05, ** p < 0.01, *** p < 0.001.
Table 5. HAC bandwidth comparison for Δ6: L = 12 vs. L = 6. Sensitivity of Δ6 estimates to the HAC bandwidth choice.

| Event | aade (A2)_HAC12 | ααδε (A3)_HAC12 | τιμολόγιο (C1)_HAC12 | timologio (C2)_HAC12 | Electronic Invoicing (D1)_HAC12 | aade (A2)_HAC6 | ααδε (A3)_HAC6 | τιμολόγιο (C1)_HAC6 | timologio (C2)_HAC6 | Electronic Invoicing (D1)_HAC6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mydata_go_live | 3.2 | 15.1 ** | 3.4 | 38.4 *** | 4.6 | 3.2 | 15.1 | 3.4 | 38.4 *** | 4.6 |
| b2g_phase1 | 3.2 | 18.1 | 33.6 *** | 23.8 * | 28.8 | 3.2 | 18.1 | 33.6 * | 23.8 * | 28.8 |
| central_admin_full | 10.4 | 6.4 | −10.4 | −3.4 | 1.9 | 10.4 | 6.4 | −10.4 | −3.4 | 1.9 |
| vat_mydata_alignment | 0.9 | −34.2 | −43.1 ** | −34.3 ** | 46.4 * | 0.9 | −34.2 | −43.1 | −34.3 * | 46.4 * |
| b2g_rest_public | −22.2 | 5.2 | 29.7 *** | 8.3 | −52.7 *** | −22.2 | 5.2 | 29.7 ** | 8.3 | −52.7 *** |
| eu_b2b_authorisation | −6.7 | −4.4 | 8.2 | −1.0 | 14.2 | −6.7 | −4.4 | 8.2 | −1.0 | 14.2 |

Note. Left block uses NW-HAC L = 12; right block shows L = 6 (baseline). BH-adjusted significance markers are reported as follows: * p < 0.05, ** p < 0.01, *** p < 0.001. Substantive conclusions are unchanged across bandwidth choices.
Table 6. Event timing sensitivity: +1-month lag (Δ6, HAC = 6). Δ6 re-estimated after shifting each event forward by +1 month to allow for implementation/awareness delays.

| Event | aade (A2) | ααδε (A3) | τιμολόγιο (C1) | timologio (C2) | Electronic Invoicing (D1) |
| --- | --- | --- | --- | --- | --- |
| b2g_phase1 | 11.9 | 24.2 *** | 44.8 *** | 32.2 *** | 36.9 ** |
| b2g_rest_public | 8.2 | 26.8 ** | −8.1 | 16.1 | −65.5 *** |
| central_admin_full | 11.3 | 9.8 * | −29.4 *** | −5.9 | −0.9 |
| eu_b2b_authorisation | −28.6 ** | −32.0 *** | 10.4 * | −10.1 | 26.7 |
| mydata_go_live | 3.1 | 15.8 *** | 2.5 | 40.1 *** | −3.5 |
| vat_mydata_alignment | −19.5 | −37.9 *** | −39.2 *** | −47.9 *** | 33.2 * |

Note. Estimates in SVI points with NW-HAC L = 6 SEs; BH-adjusted stars as above. Significance markers are reported as follows: * p < 0.05, ** p < 0.01, *** p < 0.001.
Table 7. Cross-validated winners by series and horizon with percentage MAE improvement vs. SNAIVE (12).

| Series | h = 1 | h = 2 | h = 3 |
| --- | --- | --- | --- |
| Electronic invoicing (D1) | SNAIVE (12), MAE 6.33 (+0.0% vs. SNAIVE (12)) | OLS + events, MAE 7.79 (+4.1% vs. SNAIVE (12)) | OLS + events, MAE 9.08 (+12.8% vs. SNAIVE (12)) |
| aade (A2) | OLS + events, MAE 3.83 (+40.2% vs. SNAIVE (12)) | OLS + events, MAE 6.38 (+18.3% vs. SNAIVE (12)) | OLS + events, MAE 6.36 (+16.6% vs. SNAIVE (12)) |
| timologio (C2) | OLS + events, MAE 21.35 (+1.5% vs. SNAIVE (12)) | OLS + events, MAE 16.21 (+13.9% vs. SNAIVE (12)) | OLS + events, MAE 14.35 (+15.3% vs. SNAIVE (12)) |
| ααδε (A3) | OLS + events, MAE 6.04 (+49.5% vs. SNAIVE (12)) | OLS + events, MAE 8.59 (+29.1% vs. SNAIVE (12)) | OLS + events, MAE 8.72 (+31.7% vs. SNAIVE (12)) |
| τιμολόγιο (C1) | OLS + events, MAE 7.88 (+4.5% vs. SNAIVE (12)) | SNAIVE (12), MAE 6.83 (+0.0% vs. SNAIVE (12)) | SNAIVE (12), MAE 5.44 (+0.0% vs. SNAIVE (12)) |

Note. For each series and forecast horizon (h = 1, 2, 3 months), the table reports the winner (lowest mean absolute error, MAE) under blocked rolling-origin cross-validation, the winner’s MAE, and the winner’s percentage improvement relative to SNAIVE (12). Improvements are computed as $100 \times \left(1 - \mathrm{MAE}_{\text{winner}} / \mathrm{MAE}_{\text{SNAIVE(12)}}\right)$.
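
As a worked instance of the winner rule and the improvement formula in this note, the sketch below uses the three MAE values reported for aade (A2) at h = 1 in Table A7. The function name and the dictionary layout are illustrative assumptions.

```python
from typing import Dict, Tuple

# MAE by model for aade (A2) at h = 1, as reported in Table A7.
mae_by_model: Dict[str, float] = {
    "SNAIVE (12)": 6.416667,
    "OLS + events": 3.834758,
    "SARIMAX + events": 19.416505,
}


def pick_winner(
    maes: Dict[str, float], baseline: str = "SNAIVE (12)"
) -> Tuple[str, float, float]:
    """Return (winner, winner MAE, percent MAE improvement over the baseline)."""
    winner = min(maes, key=maes.get)  # lowest MAE wins
    improvement = 100.0 * (1.0 - maes[winner] / maes[baseline])
    return winner, maes[winner], improvement


winner, winner_mae, gain = pick_winner(mae_by_model)
```

When the baseline itself has the lowest MAE, the same formula returns 0, which matches the “+0.0%” cells in the table.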
Table 8. Diebold–Mariano tests: OLS + events vs. SNAIVE (12) by series and horizon (MAE loss).

| | Series | h | Loss | n | DM t (p) | Winner |
| --- | --- | --- | --- | --- | --- | --- |
| 12 | Electronic invoicing (D1) | 1 | MAE | 12 | 1.13 (p = 0.282) | SNAIVE (12) |
| 13 | Electronic invoicing (D1) | 2 | MAE | 12 | −0.16 (p = 0.878) | OLS + events |
| 14 | Electronic invoicing (D1) | 3 | MAE | 12 | −0.57 (p = 0.582) | OLS + events |
| 0 | aade (A2) | 1 | MAE | 12 | −2.09 (p = 0.061) * | OLS + events |
| 1 | aade (A2) | 2 | MAE | 12 | −1.04 (p = 0.322) | OLS + events |
| 2 | aade (A2) | 3 | MAE | 12 | −1.12 (p = 0.285) | OLS + events |
| 9 | timologio (C2) | 1 | MAE | 12 | −0.07 (p = 0.949) | OLS + events |
| 10 | timologio (C2) | 2 | MAE | 12 | −0.52 (p = 0.611) | OLS + events |
| 11 | timologio (C2) | 3 | MAE | 12 | −0.58 (p = 0.571) | OLS + events |
| 3 | ααδε (A3) | 1 | MAE | 12 | −2.41 (p = 0.034) ** | OLS + events |
| 4 | ααδε (A3) | 2 | MAE | 12 | −1.33 (p = 0.212) | OLS + events |
| 5 | ααδε (A3) | 3 | MAE | 12 | −1.38 (p = 0.196) | OLS + events |
| 6 | τιμολόγιο (C1) | 1 | MAE | 12 | −0.21 (p = 0.841) | OLS + events |
| 7 | τιμολόγιο (C1) | 2 | MAE | 12 | 0.25 (p = 0.805) | SNAIVE (12) |
| 8 | τιμολόγιο (C1) | 3 | MAE | 12 | 0.68 (p = 0.511) | SNAIVE (12) |

Note. Tests use n = 12 forecast origins per horizon. Significance markers, as applied in the table: * p < 0.10, ** p < 0.05, *** p < 0.01 (unadjusted). With this sample size, only A3 at h = 1 shows a clear advantage for OLS + events (p = 0.034), and A2 at h = 1 is marginal (p = 0.061). All other contrasts are not statistically significant.
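
A minimal sketch of a Diebold–Mariano-type statistic under absolute-error loss, using the sign convention visible in Table 8 (negative values favour OLS + events). It uses the plain sample variance of the loss differential, which is only appropriate at h = 1; any autocorrelation correction for h = 2 or 3, and the reference distribution behind the reported p-values, are not specified here and are left out. The function name and the toy errors are hypothetical.

```python
import math
from typing import Sequence


def dm_statistic(errors_a: Sequence[float], errors_b: Sequence[float]) -> float:
    """DM-type t statistic for absolute-error loss.

    d_t = |e_a,t| - |e_b,t|, so negative values favour model A.
    No autocorrelation correction is applied (one-step-ahead case only).
    """
    d = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)


# Hypothetical forecast errors for two models over four origins.
stat = dm_statistic([1.0, -2.0, 1.0, -2.0], [3.0, 3.0, -4.0, 4.0])
```

With only 12 origins per horizon, the statistic would typically be compared against a Student-t reference with n − 1 degrees of freedom rather than the normal distribution.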
Table 9. Event-by-series movement grid (Δ6 classification at 6 months).
SeriesEventaade (A2)ααδε (A3)τιμολόγιο (C1)timologio (C2)Electronic Invoicing (D1)
0mydata_go_live▲ (L, S)▲ (L, S)▲ (L, S)○ (S)
1b2g_phase1○ (L, S)▲ (L)▲ (S)
2central_admin_full
3vat_mydata_alignment▼ (S)▼ (S)▲ (S)
4b2g_rest_public▼ (L)▲ (L, S)▼ (S)
5eu_b2b_authorisation○ (L, S)○ (L, S)▲ (S)
Note. Categorization of medium-term effects (Δ6) per event and search series. ▲ = positive Δ6 (BH-FDR p ≤ 0.05); ▼ = negative Δ6 (BH-FDR p ≤ 0.05); ○ = not significant at BH-FDR 5%. Parentheses show component(s) that are individually significant even where Δ6 itself is marginal: (L) level shift β_S; (S) post-event slope β_P. Estimates are from the preregistered event study: monthly OLS with month fixed effects, centered trend, COVID pulses, and Newey–West HAC (6) standard errors; Δ6 = β_S + 6β_P. Event dates are deterministic (myDATA go-live, B2G phase 1, central administration full, VAT/myDATA alignment, B2G rest-of-public, EU B2B authorization). Family-wise inferences pool across A2–A3 (platform), C1–C2 (apps), and D1 (ecosystem).
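
Because Δ6 = β_S + 6β_P is a linear combination of two coefficients, its standard error follows from their variances and covariance. The sketch below shows that calculation; the coefficient values and covariance entries are hypothetical placeholders, and in practice they would come from the HAC covariance matrix of the fitted regression.

```python
import math


def delta6(beta_s: float, beta_p: float, months: int = 6) -> float:
    """Level shift plus `months` times the post-event slope."""
    return beta_s + months * beta_p


def delta6_se(var_s: float, var_p: float, cov_sp: float, months: int = 6) -> float:
    """Standard error of beta_s + months * beta_p from covariance entries."""
    variance = var_s + months ** 2 * var_p + 2 * months * cov_sp
    return math.sqrt(variance)


# Hypothetical coefficients and covariance entries, for illustration only.
d6 = delta6(beta_s=12.0, beta_p=3.0)
se = delta6_se(var_s=16.0, var_p=0.25, cov_sp=-1.0)
t_stat = d6 / se
```

A negative covariance between the level and slope estimates shrinks the standard error of the sum, which is one way Δ6 can be significant when neither component is, or the reverse.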
Table 10. Top events by |Δ6| within each outcome series (BH-FDR on Δ6).
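
| Series | Rank | Event | Δ6 (SVI Points) |
| --- | --- | --- | --- |
| aade (A2) | 1 | b2g_rest_public | −22.2 ** |
| aade (A2) | 2 | central_admin_full | 10.4 |
| aade (A2) | 3 | eu_b2b_authorisation | −6.7 |
| aade (A2) | 4 | b2g_phase1 | 3.2 |
| aade (A2) | 5 | mydata_go_live | 3.2 |
| aade (A2) | 6 | vat_mydata_alignment | 0.9 |
| ααδε (A3) | 1 | vat_mydata_alignment | −34.2 |
| ααδε (A3) | 2 | b2g_phase1 | 18.1 |
| ααδε (A3) | 3 | mydata_go_live | 15.1 *** |
| ααδε (A3) | 4 | central_admin_full | 6.4 |
| ααδε (A3) | 5 | b2g_rest_public | 5.2 |
| ααδε (A3) | 6 | eu_b2b_authorisation | −4.4 |
| τιμολόγιο (C1) | 1 | vat_mydata_alignment | −43.1 ** |
| τιμολόγιο (C1) | 2 | b2g_phase1 | 33.6 *** |
| τιμολόγιο (C1) | 3 | b2g_rest_public | 29.7 *** |
| τιμολόγιο (C1) | 4 | central_admin_full | −10.4 |
| τιμολόγιο (C1) | 5 | eu_b2b_authorisation | 8.2 ** |
| τιμολόγιο (C1) | 6 | mydata_go_live | 3.4 ** |
| timologio (C2) | 1 | mydata_go_live | 38.4 *** |
| timologio (C2) | 2 | vat_mydata_alignment | −34.3 *** |
| timologio (C2) | 3 | b2g_phase1 | 23.8 *** |
| timologio (C2) | 4 | b2g_rest_public | 8.3 |
| timologio (C2) | 5 | central_admin_full | −3.4 |
| timologio (C2) | 6 | eu_b2b_authorisation | −1.0 |
| Electronic invoicing (D1) | 1 | b2g_rest_public | −52.7 *** |
| Electronic invoicing (D1) | 2 | vat_mydata_alignment | 46.4 *** |
| Electronic invoicing (D1) | 3 | b2g_phase1 | 28.8 |
| Electronic invoicing (D1) | 4 | eu_b2b_authorisation | 14.2 |
| Electronic invoicing (D1) | 5 | mydata_go_live | 4.6 |
| Electronic invoicing (D1) | 6 | central_admin_full | 1.9 |
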
Notes. Δ6 is the six-month post-event effect (SVI points, 0–100 scale). Ranks are by absolute |Δ6| within series (1 = largest magnitude). Stars reflect BH-FDR-adjusted significance within series across events: * p < 0.05, ** p < 0.01, *** p < 0.001. Negative values indicate decreases in search interest relative to the counterfactual path.
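
The Benjamini–Hochberg adjustment behind the stars can be sketched as a step-up procedure over the six event p-values of one series. This is the standard BH algorithm written in plain Python; the raw p-values are hypothetical, and the function name is an illustrative choice rather than the author's implementation.

```python
from typing import List, Sequence


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for six events within one series.
raw = [0.001, 0.008, 0.039, 0.041, 0.27, 0.60]
adj = bh_adjust(raw)
significant = [p < 0.05 for p in adj]
```

In this toy input the third and fourth raw p-values are below 0.05 but their adjusted values are not, which illustrates why the stars can differ between tables that adjust over different families.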
Table 11. Composite attention index (z-mean across A2/A3/C1/C2/D1): Δ6 tests.

| | Event | delta_6 | p_delta_6 |
| --- | --- | --- | --- |
| 0 | mydata_go_live | 0.520800 | 3.924432 × 10^−10 |
| 1 | b2g_phase1 | 1.488648 | 1.967194 × 10^−3 |
| 2 | central_admin_full | −0.131013 | 5.716091 × 10^−1 |
| 3 | vat_mydata_alignment | −1.318392 | 3.499174 × 10^−2 |
| 4 | b2g_rest_public | 0.263230 | 3.607021 × 10^−1 |
| 5 | eu_b2b_authorisation | 0.222121 | 4.848606 × 10^−1 |

Note. Composite constructed as the mean of z-scored series (each standardized over the analysis window). Entries report Δ6 and HAC (6) p-values; BH-FDR applied across events. The composite corroborates broad movement at myDATA go-live and B2G phase 1, no effect at central administration full, and a modest negative effect at VAT–myDATA alignment.
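
The composite construction described in this note (standardize each series over the window, then average across series month by month) can be sketched as follows. The use of the population standard deviation, the function names, and the toy data are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def zscore(values: List[float]) -> List[float]:
    """Standardize one series over the analysis window (population SD)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def composite_index(series: Dict[str, List[float]]) -> List[float]:
    """Month-by-month mean of the z-scored series."""
    z = [zscore(vals) for vals in series.values()]
    return [mean(month_vals) for month_vals in zip(*z)]


# Hypothetical three-month toy data for two series.
toy = {"A2": [10.0, 20.0, 30.0], "C1": [5.0, 5.0, 20.0]}
index = composite_index(toy)
```

Because each input is standardized first, the delta_6 values in Table 11 are in standard-deviation units rather than SVI points.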
Table 12. Selected backtest checkpoints for ααδε (A3), h = 1 month—actual vs. OLS + events vs. SNAIVE (12).
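
| | Month | Actual | OLS + Events | SNAIVE (12) |
| --- | --- | --- | --- | --- |
| 2 | 2020-02-01 | 20.0 | 22.000000 | 11.0 |
| 3 | 2020-08-01 | 29.0 | 27.564815 | 15.0 |
| 4 | 2021-02-01 | 40.0 | 31.863426 | 20.0 |
| 5 | 2021-08-01 | 27.0 | 33.012640 | 29.0 |
| 6 | 2022-02-01 | 65.0 | 63.339789 | 40.0 |
| 7 | 2022-08-01 | 51.0 | 77.588432 | 27.0 |
| 8 | 2023-02-01 | 56.0 | 61.758016 | 65.0 |
| 9 | 2023-08-01 | 42.0 | 46.107868 | 51.0 |
| 10 | 2024-02-01 | 78.0 | 68.696850 | 56.0 |
| 11 | 2024-08-01 | 48.0 | 80.533516 | 42.0 |
| 12 | 2025-02-01 | 63.0 | 82.554996 | 78.0 |
| 13 | 2025-08-01 | 44.0 | 53.782484 | 48.0 |
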
Table 13. Robustness map for headline conclusions.

| Headline Conclusion (What We Claim) | Robust Across Checks? | Sensitivity/How We Phrase It |
| --- | --- | --- |
| Family ordering: app (C) most event-reactive; platform (A) muted; ecosystem (D1) selective but large | Yes | Report as qualitative ordering + consistent signs; avoid over-weighting any single cell’s p-value |
| Back-office milestone (“central administration full”) has no measurable public salience | Yes | Treat as null/near-zero and consistent with low public visibility |
| Rollout milestones coincide with app-focused surges (go-live, B2G Phase 1) | Mostly | Timing strength can vary with ±1–2 month lag; present as event-timed association, not causal |
| Harmonization (VAT–myDATA alignment) coincides with app declines and ecosystem rebalancing | Mostly | Magnitudes depend on transformation; emphasize SVI point Δ and direction, not % |
| Two “anchor” ecosystem shifts dominate D1 (down at B2G rest-of-public; up at VAT alignment) | Yes | Keep as anchors; acknowledge D1 volatility and that other D1 events are weaker/mixed |
| Nowcasting gains are strongest for platform series (A2/A3) | Yes | Frame as series- and horizon-conditional; strongest at h = 1 and still favorable at h = 2–3 |
| D1 nowcasting at h = 1 does not improve vs. seasonal naïve | Yes | Explicitly report as a failure case; gains (if any) appear mainly at h = 2–3 |
| Very large percentage changes (±300–500%) reflect meaningful behavioral shifts | No (not claimed) | Recast % as context-only; stress low baselines → mechanical inflation, and avoid “policy success” language |

