Article

The Impact of the 2023 Wikipedia Redesign on User Experience

Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, USA
*
Author to whom correspondence should be addressed.
Please contact Dr. Prajjwal Gandharv with any questions after the paper is published.
Informatics 2025, 12(3), 97; https://doi.org/10.3390/informatics12030097 (registering DOI)
Submission received: 15 July 2025 / Revised: 22 August 2025 / Accepted: 29 August 2025 / Published: 16 September 2025

Abstract

In January 2023, Wikipedia introduced its most significant user interface (UI) redesign in over a decade, aiming to improve readability, accessibility, and navigation across devices. Despite the scale of this change, little empirical work has assessed its actual impact on user behavior. This study employs a natural experiment framework, leveraging Wikipedia’s exogenous, site-wide redesign date and large-scale, publicly available data—including clickstream, pageview, and edit histories—to evaluate user experience before and after the change. Using a quasi-experimental design, we estimate an immediate jump of ~1.06 million monthly internal link clicks at launch, while average hourly pageviews in January rose 1.25% despite a one-time –1.79 million dip at rollout. These results highlight the potential of large-scale UI changes to reshape user interaction without broadly alienating users and demonstrate the value of quasi-experimental methods for Human–Computer Interaction (HCI) research. Our approach offers a replicable framework for evaluating real-world design interventions at scale.

1. Introduction

Wikipedia is the fifth-most visited website in the world [1]. Notably, it is the most visited non-profit website [2] and is broadly considered an example of utopian web infrastructure due to its open, collaborative, and freely accessible nature [3]. Content on Wikipedia is voluntarily and democratically created and monitored [4]. Wikipedia is arguably the most important source of democratized information in the world and is a cornerstone of the digital knowledge ecosystem [5].
In 2023, Wikipedia underwent a major user interface redesign. The redesign revised multiple components of the UI at once and has been considered the site’s biggest user interface (UI) upgrade since 2010, a time at which Wikipedia had only a fraction of the traffic it has today [6]. The redesign involved changes aimed at improving readability, accessibility, and navigation across both desktop and mobile platforms. Key aspects included a more streamlined layout, a collapsible sidebar for better focus on content, an improved search function with more detailed previews, and a revised table of contents that remains visible while scrolling. The changes were intended to modernize Wikipedia’s appearance while having only a limited impact on its core functionality, maintaining the usability that has made the site one of the most trusted sources of information online.
Even though this was a massive redesign of one of the most important websites and knowledge sources in the world, very little academic research has actually examined what the impacts of this redesign were. While the Wikimedia Foundation’s goal was to improve user experience [6,7,8,9], and some aspects of the redesign were trialed iteratively, the true effects of major UI redesigns are often not fully known until universal implementation. Indeed, drastic UI changes can often disorient continuing users and spark backlash [10].
In this research project, we seek to analyze the effects of the 2023 Wikipedia redesign on user experience, drawing on publicly available, archived Wikipedia article edit histories, article talk pages, user engagement data, and clickstream data. The richness and longitudinal nature of Wikipedia data make it uniquely well-equipped for exploring how UI redesigns can impact users across multiple dimensions. While this specific research project is about Wikipedia, our analysis, as a case study, can provide broadly applicable information about how large-scale UI redesigns can impact users and how quickly users can adjust to redesigns.

Gap and Problem Statement

On 18 January 2023, Wikipedia deployed its first major desktop redesign in a decade. Despite abundant commentary, there is still no population-scale, empirical assessment of a globally coordinated, one-time redesign on a reference platform of Wikipedia’s size. The unresolved problems are whether surfacing navigational scaffolding (e.g., persistent table of contents, sticky header) increases in-site traversal without depressing overall demand and how quickly any short-term disorientation resolves.
Our contributions are threefold:
  • We provide a replicable natural-experiment framework (regression discontinuity design (RDD) + paired comparisons) for evaluating real-world UI launches with platform telemetry (clickstream, pageviews, edit activity).
  • We deliver platform-scale estimates that separate immediate discontinuities at launch from post-launch normalization, enabling design-relevant interpretation (navigation vs. demand).
  • We translate findings into actionable guidance for large, content-dense sites (prioritize navigational scaffolding/readability; modernize without dismantling expertise; pair rollouts with disciplined measurement).
Ultimately, this paper examines the effect of Wikipedia’s 2023 UI redesign on user navigation and activity using a natural experiment design and statistical analysis (RDD, t-tests, chi-squared tests).

2. Related Works

2.1. Theoretical Models in UI Design

Correctly balancing UI adaptation requires attention to many competing concerns, from innovation to accessibility to user familiarity. Norman [11] argues that the main principles that should guide UI design are simplicity and visibility. In line with recommendations for usability engineering [12], the Wikipedia team implemented an iterative approach for developing the new UI design. This approach is discussed at length by the engineers who orchestrated it [13]. A critical component of the upgrades they made was the improvement of search functionality. Research indicates that visual elements can significantly impact user behavior [14]. The updated search feature incorporates improved categorization and is more readable, which streamlines content retrieval for all users regardless of experience with the site [7]. This decision aligns with the findings of past HCI research [15], which suggest that giving users too many search results tends to overwhelm them and reduce satisfaction. Notably, past research on this topic [15] is based on very small sample sizes, motivating the need for higher-powered research. While less visible to users than layout changes, performance optimization was another focal point for the redesign team. The Wikimedia Foundation reported that it implemented page element restructuring and back-end improvements that contributed to an improved user experience. These efforts are consistent with broader HCI research, which stresses the importance of speed in digital interactions for user experience [16].
Cockburn and colleagues’ [17] model for assessing menu designs provides a framework for hypothesizing how the Wikipedia UI redesign might function. Their model integrates Fitts’ Law (pointing time), Hick–Hyman Law (decision time), and the transition from novice to expert behavior to predict menu selection efficiency. Cockburn and colleagues [17] highlight that split menus with high item movement impede expertise acquisition. The collapsible sidebar and persistent table of contents streamline navigation, potentially decreasing decision time when locating specific sections or tools. The redesign’s focus on a cleaner interface with appropriately sized interactive elements may reduce pointing time, in line with Fitts’ Law. The framework does suggest one negative consequence of the UI redesign. If the redesign altered long-established menu structures, frequent users would experience an increase in search time before re-learning the new layout. This temporary setback is predicted by the model’s expertise transition component. However, the stability of the new UI design (no additional major changes) suggests that users would gradually develop familiarity with the new interface over time.

2.2. Behavioral Impacts of Redesigns

UI adjustment, while necessary, can inevitably be a point of friction for the existing user base. Research has demonstrated that major changes to a known interface can disrupt ingrained user habits, leading to temporary declines in user efficiency and satisfaction [18]. Wikipedia’s UI overhaul serves as a unique case study for multiple reasons. First, Wikipedia is one of the most used knowledge sources on the internet, and second, it has the unique quality of making a large amount of its user data available to the public. There have been attempts to quantify user engagement in the past, and it is not a simple task. Attfield and colleagues [19] discuss the challenge of quantifying such a thing and note that subjective and objective performance often diverge significantly. Because of the scope of this study and the nature of the data available, we have chosen to focus on the objective side of the analysis.
Some past research has explored the impacts of large-scale redesigns on platforms such as Twitter/X, Facebook, and Google [20]. This past work provides a framework for studying redesigns’ measured effects as well as how users receive changes. Because past research has focused on search engines and social media platforms, Wikipedia is a uniquely valuable platform on which to explore the effect of UI redesigns. Because the redesigns studied previously involved a systematically different set of changes, it is hard to derive hypotheses about the Wikipedia redesign directly from this past work.

2.3. Accessibility and Mobile Usability

When evaluating the fundamental concerns driving Wikipedia’s UI redesign, accessibility emerged as a primary factor. One of the key goals of the redesign was to create an experience that was more adaptable to a diverse user base: familiar enough for legacy users and intuitive enough for new users to feel comfortable. One of the key factors of this change was improved readability across different devices. This aligns with the principle that reduced cognitive load improves user accessibility [21]. Site visits now happen at a substantially greater rate on mobile devices than they did in 2010, when Wikipedia’s UI was last updated. Graham [8] highlights that changes like increased font size and a prominently displayed table of contents were instrumental in increasing usability on smaller screens. This reasoning aligns with broader trends in HCI, where mobile accessibility is given increasing priority in efforts to improve the user experience. Rethinking elements such as table of contents placement and persistent headers aims to reduce cognitive load and maximize efficiency, as supported by usability studies [22]. In addition, changes in visual complexity, such as font size and table of contents placement, can positively affect user engagement and comprehension [23].
In a paper published by one of the senior software engineers at Wikimedia, it is emphasized that long-term users have often already developed navigation habits that make information retrieval very efficient. While this is obviously a good thing, new users may not have such luck. Informed by user testing, to increase accessibility for users who did not have extensive knowledge of the site, the redesign opted to introduce features like collapsible panels and advanced search functionality in hopes that they would reduce the need for excessive scrolling or other navigational pitfalls [24]. These design choices lead to another complex problem for these developers, which is the growing emphasis in UI design to accommodate new users without significantly hampering the efficient patterns of long-term ones.
Beyond general accessibility, mobile-first and responsive design research recommends starting with the smallest viewport and progressively enhancing for larger screens, which prioritizes core tasks and reduces unnecessary complexity for mobile users. Classic treatments [25,26] frame this approach, while W3C clarifies that mobile accessibility is covered by WCAG (not a separate standard) and provides mobile-specific guidance. We therefore interpret the redesign through key WCAG 2.2 AA criteria that are especially salient on mobile—for example, Reflow (1.4.10), Text Spacing (1.4.12), Focus Not Obscured (Minimum) (2.4.11), and Target Size (Minimum) (2.5.8)—all of which relate directly to sticky headers, collapsible navigation, and tap targets in the new UI. At the same time, applied UX work cautions that “mobile-first” is not “mobile-only”; desktop density and information scent should be preserved to avoid content dispersion on large screens.

2.4. Research Gap and Study Rationale

Past research motivates a variety of research questions that an analysis of the Wikipedia redesign can answer. First, a variety of research suggests that simpler, more streamlined UIs lead to improved user experience. For example, one previous study [15] demonstrated that more choices in search results paradoxically lead to reduced user satisfaction. That study is highly cited but is based on a very small sample of only 24 users. An analysis of Wikipedia provides a much stronger test of the hypothesis, as the Wikipedia data offer very high statistical power and can yield effect estimates with narrow confidence intervals.
Additionally, past HCI research [18] suggests that abrupt UI changes can negatively affect the user experiences of experienced users. The implementation of the Wikipedia redesign offers a perfect test of that theory for a number of reasons. For one, the scale of activity is well-powered enough to detect a wide range of effect sizes—including even very small effects. Second, Wikipedia users—particularly editors—are an especially highly experienced user base [27]. Finally, the rich, longitudinal nature of Wikipedia data means that if there is a negative effect, we can also explore how long the effect lasts, which sheds light on an important question that small experimental designs are generally unequipped to answer.
Taken together, theories of efficient interaction (e.g., pointing and decision time), evidence on habit disruption after interface changes, and accessibility/mobile-usability research jointly predict that a redesign which lowers decision/pointing costs and surfaces persistent navigational scaffolding should increase in-site link traversal while leaving overall demand broadly stable, with any short-term disorientation attenuating as habits re-form. Yet, despite case-specific accounts from platforms like Facebook, Twitter/X, and Google, there remains—so far as we can determine—no empirical assessment of a globally coordinated, world-wide redesign rolled out in one step across a reference site at Wikipedia’s scale. We therefore position the January 2023 launch as a rare natural experiment that can adjudicate these predictions with population-level behavioral traces, which directly motivates our research questions in Section 3.

3. Research Questions

  • How did the redesign impact user discussions, page usage, page edits, and user clickstream?
  • Did the redesign negatively impact user experience in the short term? And if so, what was the duration of that impact?
  • What was the long-term impact of the redesign on how users engage with Wikipedia articles?

4. Methods

To explore how Wikipedia’s UI change affected user interaction, this research will take a data-driven approach centered around analyzing the vast quantities of clickstream, page editing, and page traffic data that Wikipedia has made available to the public. The goal is to translate our core research question, “How did the redesign impact user discussions, page usage, page edits, and user clickstream?”, into a systematic methodology that provides concrete insights and an artifact in the form of a data analysis pipeline and a set of visual and statistical reports.

4.1. Implementation of the Redesign

4.1.1. System UI Design Development

The 2023 Wikipedia redesign was a large-scale, community-driven effort to modernize what is practically the world’s biggest accessible-to-all and public-input-driven encyclopedia [8]. Although Wikipedia was already a well-functioning system, the redesign was opportunistic and driven by a need to future-proof the website by enhancing accessibility, improving reading and navigation experiences, and better accommodating the diverse needs of its global user base. It was also particularly important because, over the years, the interface had become more and more geared towards the needs of editors [8]. At the same time, over 99 percent of people using the website do not edit.
Wikipedia’s ecosystem includes projects like WikiData and Wikimedia Commons, which share a common interface but have unique functional requirements [8]. On top of that, supporting around 300 languages demanded an organic and decentralized design approach. Wikipedia had to rely heavily on individual contributors who were familiar with different details of the various parts of its ecosystem. The results were impressive, as shown by the fact that the redesign was successfully implemented, even though detailed cataloging of every change to assist in coordination was practically impossible [8]. This decentralization most prominently spilled over into decision-making on the final changes. Sometimes, contributors opposed clearly beneficial UI changes, such as the sidebar table of contents, because they preferred the familiarity of the inline table of contents [8]. This led to significant debate about whether customization options should be available to people who prefer other UI styles. Typography was another area where the design systems team at Wikipedia and individual contributors had to negotiate. Wikipedia understands how its typography is a representation of its legibility and quality as a knowledge platform, while also acknowledging that the typography eventually represents all the content and work of individual contributors, who want the knowledge they share with the world to be typed in certain ways based on personal preferences and other factors [8]. Wikipedia’s own design systems team worked on the type-sizing, base font size, and managing line-length by adding a max-width while continuously receiving feedback from the contributor community about many other changes they were proposing to implement and compromising with them over some differences.
To accommodate widespread user bases and their needs across the world, several other key changes were made [8]. For users who use small screens or have multiple tabs open, Wikipedia’s design team added support for browser widths as low as 500 pixels. Vocal members of the community pushed the design systems team to introduce an explicit “viewport” size in webpage markup because they were annoyed that the table of contents component was collapsing inconsistently in browsers [8]. Sticky headers and a search bar with real-time suggestions and image previews were added. The language button was moved to a more visually prominent position, and a collapsible navigation sidebar was added. Many of these changes were carried out through iterative prototyping, which we describe in the next subsection.

4.1.2. Prototyping

Given the scale and complexity of Wikipedia’s infrastructure, relying on static prototypes was not an option. Wikipedia used a large number of small prototypes built with HTML, CSS, and JavaScript to iteratively arrive at the final agreed-upon redesign [13]. This was important because Wikipedia pages are interactive and offered in around 300 languages. These prototypes enabled clear and concise conversations among contributors and developers. Often, for comparison purposes, a base prototype was first built and attached to APIs to load a Wikipedia page in any language. Other prototypes were then added on top of it, and a switch menu was provided to allow quick switching among them for easy comparison.
The impact of this prototyping process was evident in the decision about how the table of contents should be transformed [13]. The table of contents is a critical tool to deal with long articles. The team and community members came up with ideas—whether to keep it inline or turn it into a sidebar. The inline table of contents has been Wikipedia’s way of doing it for a long time, but it is only visible after you scroll past a lead section, and you cannot immediately come back to it to navigate if you have scrolled too far down the page. Another idea was to include collapsible subsections in the table of contents for long articles. Through prototype testing on five different options in 3 countries, the decision was made to have a sidebar table of contents with collapsible subsections.

4.1.3. Testing and Rollout of the New UI

The rollout had to be gradual to make sure that users were not upset by sudden changes to familiar systems and any potential issues could be caught and fixed [13].
Changes were first rolled out to a few Wikipedia language communities (Basque, French, Hebrew, Persian, among others), and this served as an A/B test before the full rollout. Continuous feedback from and collaboration with users and volunteer developers through message boards and Phabricator tasks ensured that Wikipedia could keep up with the opinions and demands of the public [13].
Performance was an important factor in Wikipedia’s decisions [8]. They maintain a performance dashboard built on the browser Navigation Timing API, which they leverage to collect global data from users. Wikipedia also runs automated synthetic performance tests using Sitespeed.io and can use key metrics to monitor a rollout: pageviews, edit rates, account creation, and session length. One of the biggest concerns was that replacing the internal search feature might turn away users if it became too slow or unresponsive. Wikipedia added instrumentation specifically designed to monitor and address this [8]. They also monitor bundle sizes of render-blocking CSS assets, and the CI pipeline blocks anything that goes over the performance budget. Wikipedia also runs spikes to see if there are additional ways to improve performance. For example, in a quiet period, Wikipedia ran a spike that made the mobile site 300 ms faster.
Hence, Wikipedia was able to successfully roll out the new UI design due to its goal-driven design development, iterative prototyping, and continuous monitoring of feasibility and performance test results.

4.2. System Design Overview

Our system is centered around the acquisition, processing, and analysis of Wikipedia’s publicly available user data. This entails identifying significant interaction metrics that could result from the UI change, including internal link traversal rates, article entry points, navigation patterns (search bar usage vs. link usage), and general page view data. The outcome will be a comparative analysis between pre- and post-UI-redesign behavior.

4.3. Analysis Steps

The system will be implemented in a pipeline format made up of the following stages:

4.3.1. Data Acquisition

Clickstream, pageview, and revision history data are available through the Wikimedia Foundation’s monthly releases. We will specifically target varying time frames before and after the UI change rollout.
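As an illustration of this step, the following minimal sketch builds one download URL per monthly clickstream release. The URL layout, the `enwiki` project code, and the helper names `month_range` and `clickstream_urls` are assumptions made for the example and should be checked against the current Wikimedia dumps listing before use.

```python
from datetime import date

# Assumed layout of the public clickstream dumps; verify before relying on it.
BASE = "https://dumps.wikimedia.org/other/clickstream"


def month_range(start: date, end: date):
    """Yield (year, month) pairs from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def clickstream_urls(start: date, end: date, wiki: str = "enwiki"):
    """Build one (hypothetical) dump URL per month in the requested window."""
    urls = []
    for year, month in month_range(start, end):
        stamp = f"{year:04d}-{month:02d}"
        urls.append(f"{BASE}/{stamp}/clickstream-{wiki}-{stamp}.tsv.gz")
    return urls


# Example window: two calendar years around the January 2023 launch.
urls = clickstream_urls(date(2022, 1, 1), date(2023, 12, 1))
print(len(urls), urls[0])
```

Generating the list up front makes the time window an explicit parameter, so the same code can be rerun with different pre- and post-rollout spans.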

4.3.2. Preprocessing and Cleaning

Our various datasets will be filtered to remove noise (infrequently accessed pages, non-human traffic).

4.3.3. Data Normalization

Due to factors outside the scope of our study, traffic volumes will likely vary month to month. Normalization techniques, such as z-score standardization or percentage-based metrics, will be applied to ensure comparability over time.
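The two normalization options named above can be sketched as follows. This is an illustrative example with made-up numbers, not the pipeline's actual code; the function names `z_scores` and `shares` are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a series to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series carries no variation to standardize.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def shares(counts):
    """Express each category count as a percentage of the total."""
    total = sum(counts.values())
    return {key: 100.0 * value / total for key, value in counts.items()}


# Made-up monthly totals and referrer-type counts, for illustration only.
monthly_clicks = [100.0, 110.0, 120.0]
standardized = z_scores(monthly_clicks)
monthly_shares = shares({"link": 60, "external": 30, "other": 10})
print(standardized, monthly_shares)
```

Z-scores make months comparable in units of typical month-to-month variation, while percentage shares remove the overall traffic level and keep only the composition.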

4.3.4. Metric Extraction

Key interaction metrics, such as internal-link click-through rates, search bar usage percentages and volumes, entry article frequencies, and article edit frequency, will be computed to provide as comprehensive an analysis as the data allow.

4.3.5. Visualization and Analysis

Finally, the system will output charts and statistical summaries to allow the user to detect significant changes in behavior caused by the UI redesign.

4.3.6. Descriptive Composition Check

We cross-tabulate counts and percent distributions before and after the redesign as a descriptive summary. Because per-click independence can be violated in clickstream data, we do not use chi-squared p-values for inference. Instead, our main inferences rely on the regression discontinuity design described below.
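A minimal sketch of such a descriptive cross-tabulation is shown below. It assumes each clickstream record is a tab-separated line with four fields (`prev`, `curr`, `type`, `n`) and that each record has already been labeled as `pre` or `post`; the sample rows and the helper names `crosstab` and `row_percents` are invented for illustration.

```python
from collections import defaultdict


def crosstab(records):
    """Sum click counts by (period, referrer type).

    records: iterable of (period, tsv_line), where tsv_line is assumed to
    hold the four tab-separated fields prev, curr, type, n.
    """
    table = defaultdict(lambda: defaultdict(int))
    for period, line in records:
        _prev, _curr, rtype, n = line.rstrip("\n").split("\t")
        table[period][rtype] += int(n)
    return {period: dict(row) for period, row in table.items()}


def row_percents(table):
    """Convert each period's counts into a percent distribution."""
    out = {}
    for period, row in table.items():
        total = sum(row.values())
        out[period] = {k: round(100.0 * v / total, 2) for k, v in row.items()}
    return out


# Invented sample rows, not real Wikipedia data.
sample = [
    ("pre", "Main_Page\tPython\tlink\t40"),
    ("pre", "other-search\tPython\texternal\t60"),
    ("post", "Main_Page\tPython\tlink\t55"),
    ("post", "other-search\tPython\texternal\t45"),
]
table = crosstab(sample)
pct = row_percents(table)
print(table, pct)
```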

4.4. Design and Technology Requirements

4.4.1. Technologies Used

  • Python (Pandas, NumPy, Matplotlib, Seaborn);
  • Jupyter Notebooks;
  • Wikimedia API and dump files.

4.4.2. Infrastructure Needs

  • Local or cloud-based compute resources capable of handling large datasets.
  • Sufficient RAM and storage for preprocessing and visualization tasks.

4.5. Study Design

To evaluate the impact of the new UI design, we plan to leverage the exogenous, point-in-time nature of the change. The redesign was introduced on 18 January 2023. To the best of our knowledge, the introduction did not coincide with any other changes to Wikipedia or its user base, so any observable changes in platform user patterns can be plausibly attributed to the UI redesign. This is essentially a straightforward “natural experiment”, an analytical approach that has been leveraged in previous HCI research [28,29,30].
To measure the impact of the natural experiment, we will specifically utilize a regression discontinuity design. A regression discontinuity design functions by estimating polynomial trends before and after an intervention and assessing the impact of the intervention as the size of the discontinuity between the two trends at the cutoff point (the time of the intervention). Calonico and colleagues’ [31] regression discontinuity estimator is robust to bandwidth choice and is considered state-of-the-art for regression discontinuity design.
The validity of our regression discontinuity design relies on several key assumptions. First, the continuity assumption requires that, in the absence of the UI redesign, user behavior would have followed a smooth trend over time, ensuring that any observed discontinuity at the intervention point is attributable to the redesign rather than an underlying trend or external shock. Second, the no-manipulation assumption assumes that users could not selectively determine their exposure to the new UI, meaning there was no systematic sorting around the cutoff date. Given that the redesign was implemented site-wide without user discretion, this assumption is plausibly satisfied. Third, the local randomization assumption posits that users just before and after the intervention are comparable, akin to a randomized experiment within a narrow window around the cutoff. While we cannot directly test this, we have no reason to suspect this would be violated, given the exogenous nature of the UI change. Finally, the functional form assumption requires that our chosen polynomial or local linear model appropriately captures the underlying trends in user behavior. To ensure robustness, we will implement sensitivity analyses with different bandwidth choices and alternative specifications, following the recommendations of Calonico and colleagues [31].
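To make the idea of a discontinuity at the cutoff concrete, the sketch below fits a separate straight line on each side of the cutoff within a chosen bandwidth and reports the gap between the two fitted values at the cutoff. This is a deliberately simplified illustration on synthetic data. It is not the bias-corrected robust estimator of Calonico and colleagues [31], it uses no kernel weighting or data-driven bandwidth selection, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_jump(running, outcome, bandwidth):
    """Estimate the jump at running == 0 from two local linear fits.

    Observations with running < 0 are pre-intervention, running >= 0 are
    post-intervention. Each side needs at least two distinct running values.
    """
    pairs = list(zip(running, outcome))
    left = [(x, y) for x, y in pairs if -bandwidth <= x < 0]
    right = [(x, y) for x, y in pairs if 0 <= x <= bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    # Because the cutoff sits at 0, each intercept is the fitted value there.
    return right_intercept - left_intercept


# Synthetic series: same slope on both sides, with a jump of 5 at the cutoff.
running = list(range(-10, 11))
outcome = [2 * x + (5 if x >= 0 else 0) for x in running]
jump = rd_jump(running, outcome, bandwidth=10)
print(jump)
```

Rerunning `rd_jump` with several bandwidth values is a simple way to see how sensitive the estimated jump is to the width of the window.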

4.6. Task and Procedure

The UI design was implemented as a natural experiment. User behavior was measured as it would naturally occur on the website. No procedure, except for the implementation of the new design on 18 January 2023, occurred.

4.7. Measurement

Our main analysis will involve three outcome measures: internal-link click-through rates, search bar usage volumes, and entry article frequencies. All variables will be operationalized as count variables aggregated to the day level. The number of days before/after 18 January 2023 will constitute the “running” variable, as is the traditional terminology in the regression discontinuity literature [31].
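The running variable can be computed directly from calendar dates, as in this small sketch. The names `CUTOFF`, `running_variable`, and `is_post` are illustrative, and treating the launch day itself as post-intervention is an assumption of the example.

```python
from datetime import date

# Launch date of the redesign, used as the cutoff.
CUTOFF = date(2023, 1, 18)


def running_variable(day: date) -> int:
    """Signed number of days relative to the cutoff (negative = before)."""
    return (day - CUTOFF).days


def is_post(day: date) -> bool:
    """Assumption: the launch day itself counts as post-intervention."""
    return running_variable(day) >= 0


print(running_variable(date(2023, 1, 17)), running_variable(date(2023, 2, 1)))
```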

4.8. Analysis

Our primary techniques for analyzing the data at different levels of granularity were the paired t-test, chi-squared test, and regression discontinuity jump test. The paired t-test is a good indicator for the long-term effects of the UI change, while the regression discontinuity jump test is a complementary evaluation that allows us to see shorter-term, sharp changes due to significant events. The regression discontinuity test has previously been used to determine the effectiveness of treatment plan changes in epidemiology [32]. We believe that this approach to analyzing data before and after a known change was introduced translates well to the field of Human–Computer Interaction with respect to quantifying the effectiveness of design changes.
For paired comparisons, we used a t-test on the within-unit differences. The normality assumption pertains to the distribution of the differences; with our sample size, the test is robust to modest deviations from normality. We inspected distributional diagnostics (histogram/QQ plot) for the paired differences and observed no severe departures from normality; accordingly, we report t-tests with effect sizes (Cohen’s d for paired differences) and 95% confidence intervals in addition to p-values. A Wilcoxon signed-rank test was prespecified as a robustness check in case diagnostics suggested severe non-normality.
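The quantities behind a paired comparison can be sketched with the standard library alone. The example below computes the mean within-unit difference, the paired t statistic, and Cohen’s d for paired differences (mean difference divided by the standard deviation of the differences) on made-up numbers. It stops short of the p-value and confidence interval, which require the t distribution with n − 1 degrees of freedom; in practice a statistics library such as SciPy (`scipy.stats.ttest_rel`, `scipy.stats.wilcoxon`) would typically be used for those. The function name `paired_summary` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_summary(before, after):
    """Summarize a paired comparison from matched before/after values."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    return {
        "n": n,
        "df": n - 1,
        "mean_diff": mean_diff,
        "t": mean_diff / (sd_diff / sqrt(n)),
        "cohen_d": mean_diff / sd_diff,
    }


# Made-up matched values (e.g., the same unit measured in both periods).
before = [10.0, 12.0, 11.0, 13.0]
after = [12.0, 13.0, 14.0, 14.0]
summary = paired_summary(before, after)
print(summary)
```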
The format of the Wikipedia clickstream dataset allows us to categorize the different referrer types to any given article into one of three groups:
  • Link: if the referrer and request are both articles and the referrer links to the request.
  • External: if the referrer host is not en(.m)?.wikipedia.org.
  • Other: if the referrer and request are both articles, but the referrer does not link to the request. This can happen when clients search or spoof their referrer.
To measure the relationship between these referral types over time, we implemented the chi-squared test.
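For reference, the Pearson chi-squared statistic for a period-by-referrer-type contingency table can be computed as in the sketch below. The counts are invented, the function name `chi_squared` is hypothetical, and the example stops at the statistic and degrees of freedom; obtaining a p-value requires the chi-squared distribution (for example via `scipy.stats.chi2_contingency`).

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a table of observed counts.

    table: list of rows, e.g. one row per period and one column per
    referrer type. Returns (statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of period and referrer type.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Invented counts: rows = (pre, post), columns = (link, external, other).
observed = [[50, 30, 20], [60, 25, 15]]
stat, dof = chi_squared(observed)
print(round(stat, 4), dof)
```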
We additionally explore the impact of the UI change on overall pageviews. This provides a strong test of whether or not the UI change affects the site’s popularity.

4.9. Participants

Participants in this study are all users of Wikipedia. English Wikipedia averages approximately 1 billion unique device accesses a month, and in our dataset, Wikipedia makes a best-effort attempt to only record real human traffic, excluding bots and crawlers from its metrics. The way in which they determine human traffic is not disclosed. All data preprocessing was performed by Wikipedia. We simply transformed the data provided by Wikipedia into a panel vector suitable for our analyses. We explored all primary data outcomes Wikipedia makes publicly available, with the exception of qualitative metrics (e.g., comments and edits).

5. Results

Our main analyses are performed on multiple time scales, depending on the finest level of granularity available. We analyze several key metrics to assess how the site is functioning. Primarily, we analyze clickstream data. This data depicts how the site is used and how specific webpages are reached. In addition, we look at pageview data to evaluate the impact of the UI change on overall usage.
Wikipedia’s clickstream data is only timestamped at monthly intervals, so we chose to gather two years’ worth of data (January 2022–December 2023) for analysis. From that dataset, we applied paired t-tests, regression discontinuity (RD) tests, and a chi-squared test to investigate changes in user click behavior before and after January 2023.
Table 1, Table 2 and Table 3 present the results of the statistical tests. We found statistically significant increases in monthly link referrer–page pairs after the cutoff. Notably, link clicks (measured as referrer–page pairs) showed a statistically significant discontinuity at the January 2023 threshold (p = 0.0236), suggesting an abrupt behavioral shift. The estimated size of this shift is an increase of 1,057,288.6 link clicks. External and “other” referrer–page clicks did not show significant discontinuities, indicating the UI change did not substantially shift these user patterns.
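For readers unfamiliar with how a discontinuity at a cutoff is estimated, the following is a deliberately simplified sketch: it fits a separate straight line to the observations on each side of the cutoff and takes the difference between the two fitted values at the cutoff. This global linear version is an assumption made for illustration and is not the estimator behind the reported results; the helper names `fit_line` and `rd_jump` and the toy series are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def rd_jump(times, values, cutoff):
    """Estimate the jump at `cutoff` as the gap between two fitted lines."""
    # Center time at the cutoff so each intercept is the fitted value there.
    pre = [(t - cutoff, v) for t, v in zip(times, values) if t < cutoff]
    post = [(t - cutoff, v) for t, v in zip(times, values) if t >= cutoff]
    pre_intercept, _ = fit_line(*zip(*pre))
    post_intercept, _ = fit_line(*zip(*post))
    return post_intercept - pre_intercept

# Toy monthly series with a level shift at month 4 (not study data).
months = [0, 1, 2, 3, 4, 5, 6, 7]
clicks = [10.0, 11.0, 12.0, 13.0, 19.0, 20.0, 21.0, 22.0]
jump = rd_jump(months, clicks, cutoff=4)
print(jump)
```

In the toy series, the pre-cutoff trend extrapolates to 14 at the cutoff while the post-cutoff line starts at 19, so the estimated jump separates the level shift from the shared upward trend.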
Figure 1, Figure 2 and Figure 3 visualize these three regression discontinuities. Our pre-intervention period ran from 1 January 2022 to 18 January 2023, and the post-intervention period ran from 28 January 2023 to 31 December 2023. Figure 1 presents the results for external clicks. External clicks increased gradually through the 12 months of 2022 and continued to do so in 2023. While the visualization shows a slight discontinuity at the time of the intervention (January 2023), the size of the discontinuity is not large enough to be statistically distinguishable from zero. Overall, the trends in the figure suggest that external clicks have grown over time on Wikipedia, with the timing of the UI redesign having no discernible impact on external clicks.
Figure 2 presents the results for link clicks. Similar trends are observed here, with a slight increase in clicks through both 2022 and 2023. Notably, a large discontinuity is present at the time of the intervention. This jump represents an absolute increase of ~1.06 M link-click pairs at the cutoff, and monthly values in 2023 were consistently higher than those in 2022. This difference is statistically significant (p < 0.05). The individual points for 2022 and 2023 form clearly separable clusters; virtually every month in 2023 had a greater volume of link clicks than every month in 2022. The trends suggest that, over 2022 and 2023, link clicks increased substantially on Wikipedia. The UI redesign appears primarily responsible for this increase.
Figure 3 presents the analogous results for other clicks. Other clicks declined throughout 2022 before remaining flat throughout 2023, and the figure reveals no significant discontinuity. The trends suggest that the timing of the UI redesign is, if anything, associated with a halt in the decline of other clicks. Additionally, we examined total monthly clicks by referrer type. These trends were similarly upward, with all types showing statistically significant t-test results, though none demonstrated statistically significant jumps in the RD analysis. The chi-squared test on click type proportions yielded a highly significant result (p < 0.0001), though the effect size was small (Cramér’s V = 0.0125).
In contrast, Wikipedia’s pageview data is timestamped at hourly intervals. Due to computational limitations, we were limited to one month of pageview data for analysis (January 2023). The paired t-test revealed a statistically significant increase in average hourly pageviews after the cutoff (p = 0.00063), rising from 22,084,079 to 22,359,782, an increase of 1.25%. However, the RD test found a significant negative discontinuity at the cutoff point itself (p = 0.00274), with an estimated immediate drop of 1,787,240 views. One plausible reason for such a drop is that an abrupt UI redesign can be jarring for returning users, who may initially find the new interface confusing or even difficult to use, which may reduce pageviews at first. Notably, average hourly views over the remaining post-period are higher, suggesting rapid normalization.
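The 1.25% figure follows directly from the two reported hourly means, as the short check below shows. The variable names are ours; the two input values are the means reported in Table 3.

```python
# Average hourly pageviews before and after the cutoff (from Table 3).
mean_before = 22_084_079
mean_after = 22_359_782

absolute_change = mean_after - mean_before
percent_change = 100 * absolute_change / mean_before

print(f"{absolute_change:,} more views per hour ({percent_change:.2f}%)")
```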

6. Discussion

Wikipedia is one of the most visited websites in the world and a major source of democratically accessible information, so understanding the factors that make it more or less accessible and usable is of central importance to global flourishing. In this paper, we have applied quasi-experimental research methods to analyze how the 2023 Wikipedia redesign, the site’s most substantial in a decade, impacted user experience. In practical terms, the redesign redirected roughly one million additional internal navigations per month without materially reducing overall traffic. We found minimal evidence of the redesign significantly impacting the overall quantity of visitors to the site. We did find that the redesign impacted the clickstream significantly, increasing link clicks substantially. This finding contributes to an understanding of how subtle UI changes can generate massive changes in user experience at a global scale.
This paper additionally provides a framework for applying natural experiments in HCI research. While some past HCI studies have employed quasi-experimental designs, such designs generally remain quite uncommon [28,29,30]. We urge HCI researchers to use them more regularly for three reasons. First, HCI is uniquely suited to such designs as a social-science-adjacent field where large-scale interventions (e.g., UI changes) are constantly taking place and there is ample large-N data on which to draw. Second, HCI is entirely geared towards informing better designs and better interventions, so research that approximates causal estimands is especially helpful. Lastly, as econometric techniques are rapidly evolving, engagement with such methods allows for better dialog between HCI and other disciplines.

6.1. Theoretical Implications

Our findings connect directly to core HCI theories about how surfacing navigational scaffolding changes behavior in content-dense systems. The immediate and durable rise in in-site link traversal after launch is consistent with usability heuristics that emphasize visibility and user control: the scroll-persistent table of contents (TOC) and sticky header keep key actions perceptible at the moment of need, strengthening the information scent and reducing the way-finding effort. Framed by cognitive load theory, the TOC externalizes the page structure and lowers the extraneous load by offloading section memory, enabling users to navigate deeper within articles without additional search. Mechanistically, these elements shorten the decision time (Hick–Hyman) and pointing time (Fitts), which explains why internal navigation scales up even as the overall demand remains stable after a brief adjustment period. The transient rollout dip followed by normalization aligns with habit disruption/re-attunement accounts: expert users incur a short relearning cost, then reestablish efficient routines once the new scaffolding is routinized. Theoretically, large-scale redesigns that add persistent, low-friction navigational cues—rather than removing familiar structures—can shift behavior toward richer within-site exploration without depressing traffic, suggesting a general principle for reference platforms: prioritize persistent, context-coupled navigation that minimizes cognitive and motor costs while preserving expert workflows.

6.2. Limitations

This research does have several limitations. For example, while our results speak to the impact of the Wikipedia UI redesign as a whole, they cannot speak to which specific UI changes were especially impactful. Did the clickstream shift as a result of the sticky header? As a result of the persistent table of contents? Interpreting these results against a theoretical background is necessary to better disentangle the impact of different factors from one another. Future research that engages with similar natural experiment designs, where only one change, or a small subset of changes, is made, can further disentangle the specific factors that matter. Additionally, Bayesian methodology, informed by past studies, may shed additional light on which changes drive which effects. As an additional limitation, we do not have any breakdown of the data by user segment, so we have no way of telling whether effects were heterogeneous across groups (mobile vs. desktop or new vs. returning users, for example). Future studies that have richer datasets may be able to better answer such questions. Lastly, our clickstream lacks user identifiers, so repeated actions by the same individuals may induce dependence among observations. We therefore treat chi-squared comparisons as descriptive only and rely on regression discontinuity for inference.

6.3. Implications for UI Design

We conclude with design implications that translate our evidence into guidance for UI teams. First, strengthen in-site navigation by elevating low-friction, high-salience paths, e.g., persistent/collapsible tables of contents, sticky headers, and clearer on-page link affordances, and track internal click-through and path diversity as leading indicators. Second, modernize without dismantling expertise: favor additive, progressively disclosed changes over removals; preserve learned workflows; and provide power-user shortcuts or opt-outs alongside novice-friendly defaults. Third, pair rollouts with disciplined measurement: stage deployments, enforce performance budgets, and predefine quasi-experimental evaluations (e.g., regression discontinuity around launch or stepped-wedge rollouts), segmented by device class and user tenure to detect short-term disorientation versus durable gains. For content-dense reference sites, prioritize readability (typographic scale, line length) and navigational scaffolding over purely esthetic revisions. Our openly replicable pipeline enables teams to adopt these practices and evaluate their own redesigns with comparable, policy-relevant metrics.

7. Conclusions

This study shows that a global UI redesign can significantly impact navigation behavior, evidenced by an immediate ~1.06 M jump in monthly internal link clicks at launch and a 1.25% increase in average hourly pageviews in January despite a short-lived –1.79 M dip at rollout. This research contributes to an emerging framework in HCI research that leverages natural experiments to evaluate the impact of UI redesigns. By applying this framework to Wikipedia, one of the most popular websites in the world, and specifically to its 2023 redesign, which was implemented at a specific point in time, we contribute to research on how a specific subset of UI features, like sticky headers, can impact how a site is used. We observe a major uptick in link click referrals at the time of the redesign’s implementation, providing evidence of a large behavioral change. Taken together, the findings advance a generalizable design principle (surface persistent, low-friction navigational scaffolding to increase within-site exploration without broad demand loss) and offer a replicable RDD + paired-comparison evaluation template for population-scale launches that separates immediate discontinuities from post-launch normalization.

Author Contributions

Conceptualization, K.V.; Methodology, T.W. and K.V.; Software, T.W.; Formal analysis, T.W.; Data curation, T.W.; Writing—original draft, T.W., P.G. and K.V.; Writing—review & editing, T.W., P.G. and K.V.; Visualization, T.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in https://dumps.wikimedia.org/ (accessed on 16 February 2025).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Semrush. Top Websites in the World—January 2025. Available online: https://www.semrush.com/website/top/ (accessed on 16 February 2025).
  2. Wikipedia. 2024. Available online: https://en.wikipedia.org/wiki/Wikipedia (accessed on 16 February 2025).
  3. Wright, E.O. Envisioning Real Utopias; Verso Books: London, UK, 2010; Available online: https://www.versobooks.com/ (accessed on 16 February 2025).
  4. Wikipedia Contributors. Wikipedia: Wikipedians. Available online: https://en.wikipedia.org/wiki/Wikipedia:Wikipedians (accessed on 16 February 2025).
  5. Wikipedia Contributors. Democratization of Knowledge. Available online: https://en.wikipedia.org/wiki/Democratization_of_knowledge (accessed on 16 February 2025).
  6. Perez, S. Wikipedia Gets Its First Makeover in over a Decade and It’s Fairly Subtle. 2023. Available online: https://techcrunch.com/2023/01/18/wikipedia-gets-its-first-makeover-in-over-a-decade-and-its-fairly-subtle/ (accessed on 1 May 2025).
  7. Wikimedia Foundation. Wikipedia Gets a Fresh New Look: First Desktop Update in a Decade Puts Usability at the Forefront. 2023. Available online: https://wikimediafoundation.org/news/2023/01/18/wikipedia-gets-a-fresh-new-look-first-desktop-update-in-a-decade-puts-usability-at-the-forefront/ (accessed on 1 May 2025).
  8. Graham, G. Behind the Curtains of Wikipedia Redesign. 2023. Available online: https://www.smashingmagazine.com/2023/06/behind-curtains-wikipedia-redesign/ (accessed on 1 May 2025).
  9. New Wikipedia Editor Features Make It Easy for Everyone to Contribute. 2023. Available online: https://medium.com/freely-sharing-the-sum-of-all-knowledge/new-wikipedia-editor-features-make-it-easy-for-everyone-to-contribute-e09135ce9275 (accessed on 1 May 2025).
  10. Netflix New Homepage Design Teaser User Feedback. 2023. Available online: https://www.thesun.co.uk/tech/29475408/netflix-new-homepage-design-teaser-user-feedback/ (accessed on 1 May 2025).
  11. Norman, D. The Design of Everyday Things: Revised and Expanded Edition; Basic Books: New York, NY, USA, 2013. [Google Scholar]
  12. Nielsen, J. Usability Engineering; Elsevier: Amsterdam, The Netherlands, 1994. [Google Scholar]
  13. Hollender, A. Design Notes on the 2023 Wikipedia Redesign. 2023. Available online: https://uxdesign.cc/design-notes-on-the-2023-wikipedia-redesign-d6573b9af28d (accessed on 1 May 2025).
  14. Harvard Study Finds Image Search Tools Can Change Customer Behavior. 2023. Available online: https://www.inc.com/kim-jao/harvard-study-finds-image-search-tools-can-change-customer-behavior.html (accessed on 1 May 2025).
  15. Oulasvirta, A.; Perkiö, J.; Schneider, B. When more is less: The paradox of choice in search engine use. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), Boston, MA, USA, 19–23 July 2009; pp. 1–10. [Google Scholar] [CrossRef]
  16. Sudden Changes in UI: Why It’s a Bad Move. 2023. Available online: https://blog.snappymob.com/sudden-changes-in-ui-why-its-a-bad-move (accessed on 1 May 2025).
  17. Cockburn, A.; Gutwin, C.; Greenberg, S. A predictive model of menu performance. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), San Jose, CA, USA, 28 April–3 May 2007; pp. 627–636. [Google Scholar] [CrossRef]
  18. Garaialde, D.; Bowers, C.P.; Pinder, C.; Shah, P.; Parashar, S.; Clark, L.; Cowan, B.R. Quantifying the impact of making and breaking interface habits. Int. J. Hum.-Comput. Stud. 2020, 142, 102461. [Google Scholar] [CrossRef]
  19. Attfield, S.; Kazai, G.; Lalmas, M.; Piwowarski, B. Towards a Science of User Engagement (Position Paper). 2011. Available online: https://www.researchgate.net/publication/228542640_Towards_a_science_of_user_engagement_Position_Paper (accessed on 1 May 2025).
  20. Gligorić, K.; Czestochowska, J.; Anderson, A.; West, R. Anticipated versus actual effects of platform design change: A case study of Twitter’s character limit. In Proceedings of the ACM on Human-Computer Interaction; Association for Computing Machinery: New York, NY, USA, 2022; pp. 1–29. [Google Scholar]
  21. Sweller, J. Cognitive load during problem solving: Effects on learning. Cogn. Sci. 1988, 12, 257–285. [Google Scholar] [CrossRef]
  22. Kongshaug, P. A Usability and Universal Design Investigation into the Use of Persistent Headers in Web Pages. Master’s Thesis, OsloMet-storbyuniversitetet, Oslo, Norway, 2022. Available online: https://oda.oslomet.no/oda-xmlui/bitstream/handle/11250/3017226/kongshaug-acit2022.pdf?sequence=1&isAllowed=y (accessed on 1 May 2025).
  23. Tuch, A.N.; Bargas-Avila, J.A.; Opwis, K.; Wilhelm, F.W. Visual complexity of websites: Effects on users’ experience, physiology, performance, and memory. Int. J. Hum.-Comput. Stud. 2009, 67, 703–715. [Google Scholar] [CrossRef]
  24. Leung, J.; Cockburn, A. An empirical evaluation of collapsible panel interfaces. In Proceedings of the 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Gold Coast, Australia, 16–18 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6. Available online: https://ieeexplore.ieee.org/document/9411552 (accessed on 1 May 2025).
  25. Wroblewski, L. Mobile First: Préface de Jeffrey Zeldman; Editions Eyrolles: Paris, France, 2012. [Google Scholar]
  26. Marcotte, E. Responsive Web Design; Editions Eyrolles: Paris, France, 2011. [Google Scholar]
  27. Panciera, K.; Halfaker, A.; Terveen, L. Wikipedians are born, not made: A study of power editors on wikipedia. In Proceedings of the ACM 2009 International Conference on Supporting Group Work (GROUP), Sanibel Island, FL, USA, 10–13 May 2009; pp. 51–60. [Google Scholar] [CrossRef]
  28. Budak, C.; Garrett, R.K.; Resnick, P.; Kamin, J. Threading is sticky: How threaded conversations promote comment system user retention. In Proceedings of the ACM on Human-Computer Interaction; Association for Computing Machinery: New York, NY, USA, 2017; pp. 1–20. [Google Scholar] [CrossRef]
  29. Dasgupta, S.; Hill, B.M. How “wide walls” can increase engagement: Evidence from a natural experiment in scratch. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; ACM: New York, NY, USA, 2018; pp. 1–11. [Google Scholar]
  30. Narayan, S.; TeBlunthuis, N.; Hale, W.S.; Hill, B.M.; Shaw, A. All talk: How increasing interpersonal communication on wikis may not enhance productivity. In Proceedings of the ACM on Human-Computer Interaction; Association for Computing Machinery: New York, NY, USA, 2019; pp. 1–19. [Google Scholar]
  31. Calonico, S.; Cattaneo, M.D.; Titiunik, R. Robust nonparametric confidence intervals for regression-discontinuity designs. Econometrica 2014, 82, 2295–2326. [Google Scholar] [CrossRef]
  32. Bor, J.; Moscoe, E.; Mutevedzi, T.; Newell, M.-L.; Bärnighausen, T. Regression discontinuity designs in epidemiology: Causal inference without randomized trials. Epidemiology 2014, 25, 729–737. [Google Scholar] [CrossRef] [PubMed]
Figure 1. External clicks.
Figure 2. Link clicks.
Figure 3. Other clicks.
Table 1. Statistical test results: referrer–page pairs.
Metric | Mean Before | Mean After | T-Test (p) | RD Jump (p)
Link Clicks | 21.78 M | 23.27 M | 0.00000 | 0.0236
External Clicks | 8.88 M | 9.12 M | 0.351 | 0.0356
Other Clicks | 0.99 M | 0.97 M | 0.190 | 0.087
The bold is used to indicate statistical significance.
Table 2. Statistical test results: total clicks by referrer type.
Metric | Mean Before | Mean After | T-Test (p) | RD Jump (p)
Total Clicks | 6.68 B | 7.09 B | 0.00028 | 0.390
Link | 2.15 B | 2.29 B | 0.00010 | 0.217
External | 4.47 B | 4.74 B | 0.00059 | 0.544
Other | 58.26 M | 56.32 M | 0.02168 | 0.978
The bold is used to indicate statistical significance.
Table 3. Hourly pageview statistical test results.
Metric | Mean Before | Mean After | T-Test (p) | RD Jump (p)
Hourly Pageviews | 22,084,079 | 22,359,782 | 0.00063 | 0.00274
The bold is used to indicate statistical significance.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wilson, T.; Gandharv, P.; Vachuska, K. The Impact of the 2023 Wikipedia Redesign on User Experience. Informatics 2025, 12, 97. https://doi.org/10.3390/informatics12030097
