The Virtual Online Supermarket: An Open-Source Research Platform for Experimental Consumer Research

Whether and which interventions policymakers should implement to promote healthier, more sustainable, and more ethical food choices is a matter of controversy. Often, policy measures suffer from a lack of data. This is especially true for the growing field of online grocery shopping. Yet, it is not always feasible to test the impact of each possible policy intervention in the field. Here, computer-simulated shopping experiments offer a complementary approach. Recent evidence suggests that they heighten the realism of consumer experiments and collect valid data at a relatively low cost. In this paper, we introduce an open-source toolset that offers multiple avenues to develop and run experiments in the context of online grocery shopping. Hence, it supports researchers and policymakers in evaluating in-store interventions aiming to support more sustainable food choices.


Introduction
Today, more and more governments, NGOs, and players from the food industry have started to follow the United Nations' call to transition to a healthier and more sustainable food system [1]. In Western countries particularly, empirical evidence emphasizes the negative impact of the predominant dietary patterns on individuals' health, the environment, and society [2]. Hence, policy measures limited to the supply side seem insufficient to trigger a sustainable transition of the food system [3]. Consequently, there is a controversial debate about what public and private institutions should or should not do to promote "better food choices" [4,5].
These days, most decisions about food still take place in traditional brick-and-mortar supermarkets. However, driven by the development and diffusion of new communication technologies, grocery shopping is undergoing a change in the 21st century [6]. As one major element of this change, online grocery shopping is becoming an increasingly important retail channel, especially in urban centers [7]. Thus, in-store interventions in traditional brick-and-mortar stores and online supermarkets are crucial instruments for policymakers aiming to steer consumers towards more sustainable food choices [8]. Here, they can draw on a variety of intervention types ranging from economic interventions (e.g., taxes) to changes in a store's microenvironment (e.g., choice architecture techniques; see [9] for an overview).
Due to this variety, evidence-based policymaking has a constant need for data to identify the right intervention type or a mix of interventions for the respective case. For instance, due to a lack of empirical evidence, the effectiveness of changes in a store's microenvironment to promote sustainable food choices is still questioned [10,11]. Furthermore, little is known about the extent to which findings from traditional brick-and-mortar stores can be transferred to online supermarkets [7,12]. Indeed, there is initial evidence that online supermarkets should not be regarded as perfect mirrors of their real-world equivalents. While some elements of the shopping environment like shelf placement strategies seem to be a relevant factor for both online and "offline" supermarkets [13,14], both channels differ in aspects like (i) product presentation, e.g., physical vs. virtual [15], (ii) navigation pathways [16], or (iii) interpersonal interactions [17,18]. Moreover, compared to physical contexts, online environments allow an easier, faster, and more flexible integration of different design choices and interventions and provide enhanced functionalities like decision support systems [19].
Hence, as part of a transition to a sustainable food system, it is essential to gain further insights into the determinants of consumers' in-store behavior patterns and food choices in online supermarkets and traditional brick-and-mortar stores. One of the first such studies, a survey in Poland, analyzed determinants and barriers of organic online shopping [20]. Only with such insights will it be possible to evaluate the current legal requirements (e.g., packaging information) for their effectiveness in analog and digital choice environments and modify them if necessary. Moreover, empirical insights can support policymakers in developing, testing, and adopting new policies to govern sustainable food choices [21][22][23].
As it is not always feasible to run studies in actual (online) supermarkets, researchers have recently started to conduct studies in simulated virtual supermarkets [24][25][26]. Such an approach has the potential to heighten the realism level of consumer experiments and allow researchers to collect valid purchase data at a relatively low cost.
To make computer-simulated shopping experiments as accessible as possible to interested researchers, we developed an open-source, modular, and highly customizable virtual online supermarket application, called VOS. The application allows researchers to easily implement and perform experiments in the context of (online) grocery shopping. Thus, it can help to develop and evaluate policy measures aiming to support more sustainable food choices. All it takes is a server computer (e.g., a cloud server) to host the experiment and participants with access to a device using a modern web browser.
The tool's front-end was designed to emulate the store design and functions (e.g., navigation tools) of a realistic online grocer environment. In addition, the back-end of the tool allows researchers to modify the research conditions and to configure and implement different experimental treatments. For universal access, the project's source code, Python scripts for automated treatment administration, and configuration snippets for local and server hosting are available on GitHub [27]. Moreover, user documentation, set-up instructions, and sample data are available in conjunction with the repository. Everyone is invited to use the VOS for academic purposes, to improve our application, and to submit improvements via GitHub. However, we ask that this paper be cited if the VOS is used for a publication.
We lowered the technical barriers to using our application so that the product database and several research conditions can be edited without any programming knowledge by using a visual administration interface (VAI). For instance, researchers can use predefined modification options ("Use Cases"; UC) to adjust food prices, implement different labeling strategies, or change the arrangement of food items. The aim was to make a preselection of modification options covering the broadest possible spectrum of varying research interests. Hence, use cases range from traditional economic instruments (e.g., taxes) through knowledge-based interventions (e.g., labeling strategies) to choice architecture and decision support tools. Researchers familiar with programming in Angular [28] are not limited to basic features. Instead, they can extend or change any aspect of the tool's visual or functional implementation by altering the program's code. In total, our application allows researchers to test a broad spectrum of interventions on consumers' food purchases (i) at relatively low cost, (ii) without a complex implementation process, and (iii) without having to collaborate with a specific retailer. In addition to evaluating such interventions based on outcome variables like purchase data, the tool offers extensive possibilities for recording subjects' in-store behavior dynamics. This way, a VOS generates both general information about users' in-store behavior (e.g., the average use of filters) and detailed information about a single user's "journey" (e.g., a single user's navigation pathway) during the shopping trip.
In this paper, we will first introduce the development, features, and implementation of our application. In doing so, we pay particular attention to describing opportunities for practical implementation. Second, we will present the results of an initial evaluation study on the tool's functionality and realism level.

VOS in a Nutshell: Features and Functions
The VOS application was designed to enable researchers to conduct experiments in a realistic online shopping environment. It contains two core elements: The shop view and a visual administration interface (VAI). For more information about the technology and programming of the VOS application, see Appendix A.

Shop View
The shop view mimics the appearance and functionality of a realistic online supermarket. The design is not prohibitive and provides users with the services expected of an online store. Users can browse for products without any (visible) restrictions, e.g., by clicking on categories and subcategories or using a search bar for word searches (Figure 1). On category pages, participants see a list of all food items belonging to the selected category (or subcategory). In this list, product names, product images, the price, the price per unit (e.g., EUR 1.00/kilogram), and the package size (e.g., 100 g) are displayed. Researchers are free to decide which currency and quantity units (e.g., grams vs. ounces) are used for the shop. Clicking on a food item takes consumers to a product page, where more detailed information on the product is available (e.g., nutritional information). Further, our tool's base version provides category, attribute, and open text filter options for filtering items. In addition, the items can be sorted by ascending or descending price, and a virtual shopping cart (VSC) lets users add items to and delete them from their cart. In the tool's base version, the front-end design is very neutral to prevent any existing customer relationship bias towards the store's design elements. However, as the application code is open-source, the design can be customized to suit researchers' individual needs. The same applies to most of the elements, functions, and information displayed in the shop. As explained in the next section, researchers can use the visual administration interface for some of these modifications.

Visual Administration Interface (VAI)
The VAI is a visual subdomain where each authenticated user (administrator) has their own workspace (Figure 2). It provides the means for managing the product database and various modification options that allow researchers to determine which information or functions the shop view presents. In the following, we refer to these predefined modification options as "Use Cases" because researchers can use and modify them to create their own experimental treatments. For instance, without any programming knowledge, it is possible to create or edit items, filter mechanisms, taxes, labels, and scores, to configure the VSC's functions, and to implement swap interventions. Swap options offer consumers the opportunity to replace a selected food item, for instance, with a healthier or more sustainable one [25]. Furthermore, researchers can limit participants' shopping budget for each treatment. This might be particularly relevant for spending tracking experiments [29]. In this manner, researchers can ensure that subjects invest reasonable effort into the shopping task and do not merely click through it. A single click suffices to test all created treatments immediately in a demo mode. Via the VAI, it is also possible to administer participants and to conduct experiments. Here, one can easily create (personal) links for all participants. These links refer participants to a unique URL, which presents the shop view modified according to the assigned treatment conditions. On this webpage, the VOS records participants' behavior automatically and saves it on the employed server. Hence, participating in an experiment requires only the link, a connection to the hosting server, and a modern web browser.

Preconfigured Modification Options ("Use Cases")
The application's base version includes five use cases (UC.1 to UC.5). These cover four intervention types: (i) economic interventions (UC.1 taxes and subsidies), (ii) changes to the store's microenvironment (UC.2 product arrangement), (iii) knowledge-based interventions (UC.3 product labeling and scores), and (iv) two decision-support tools (UC.4 different types of VSCs and UC.5 swap options). All use cases were selected and designed based on research findings on in-store interventions in (online) supermarkets (e.g., [8,30-33]). In addition, we conducted expert interviews with researchers from different disciplines to identify and prioritize their expectations and requirements for a research tool.

Use Case One: Taxes and Subsidies
In many cases, the price of a food product is the central criterion underlying the purchase decision. Hence, it does not seem very surprising that several studies analyze how economic interventions alter consumers' food choices, for instance, by applying taxes, subsidies, or other monetary incentives (e.g., discounts and food coupons) on selected food items (see [34] for a review).
Overall, there is empirical evidence that monetary incentives effectively alter consumers' food purchases and consumption (e.g., [8,35]). In this context, salience seems to be an important determinant for the effectiveness of a tax or a subsidy [36]. This might be particularly relevant for online supermarkets as individuals tend to shift towards more shallow forms of information processing in digital environments such as browsing, scanning, or skimming [37][38][39]. Hence, an individual's actual food choice in an online supermarket could be based more on a superficial first visual impression like the final price than on consciously considering different internal and external aspects like health, price, or convenience [40]. Accordingly, the effect of taxes in an online supermarket could also depend on their visual presentation and salience.
Consequently, the need for empirical evidence for evidence-based policymaking prevails for at least two reasons: First, food prices remain a politically controversial topic, especially against the background of debates on sustainability and animal welfare (e.g., [41]). Second, little research exists specifically on the effects of taxes in online supermarkets.
To this end, a VOS offers researchers an easy way to create their own or implement existing taxation models (e.g., meat tax) and evaluate their effect on consumers in a realistic online shopping environment. When defining an experimental treatment, researchers can configure whether and for which products taxes should be displayed and charged in the shop. Additionally, further information on the tax can be provided. Consumers will find this information on the product page of each item affected by the tax (Figure 3). Here, the VOS automatically tracks whether a consumer has retrieved this information or not.
Despite being labeled "taxes", the function is not limited to taxation. Instead, it can also be used for other pricing mechanisms and strategies like subsidies, discount models, or dynamic pricing. In particular, the latter approach is coming into focus as recent technological advances have opened up unprecedented opportunities for retailers to implement interpersonal pricing strategies in a more flexible and prevalent manner (e.g., based on consumers' individual characteristics).
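To make the pricing logic described above more concrete, the following Python sketch shows how a treatment-level tax or subsidy could translate into the price displayed in the shop. It is an illustration only and not the VOS implementation: the `PriceRule` structure, the `shelf_price` function, the tag names, and the multiplicative application of rates are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class PriceRule:
    """Hypothetical treatment rule: a relative price change for tagged items."""
    tag: str              # items carrying this tag are affected
    rate: Decimal         # +0.20 = 20 % tax, -0.10 = 10 % subsidy
    info_text: str = ""   # optional explanation shown on the product page


def shelf_price(base_price: Decimal, item_tags: set[str], rules: list[PriceRule]) -> Decimal:
    """Apply every matching rule multiplicatively and round to cents."""
    price = base_price
    for rule in rules:
        if rule.tag in item_tags:
            price *= Decimal("1") + rule.rate
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Illustrative treatment: a 20 % tax on items tagged "meat".
meat_tax = [PriceRule(tag="meat", rate=Decimal("0.20"), info_text="Illustrative meat tax")]
taxed = shelf_price(Decimal("2.50"), {"meat", "chilled"}, meat_tax)
untaxed = shelf_price(Decimal("2.50"), {"vegetables"}, meat_tax)

# The same mechanism with a negative rate models a subsidy or discount.
organic_subsidy = [PriceRule(tag="organic", rate=Decimal("-0.10"))]
subsidized = shelf_price(Decimal("1.99"), {"organic"}, organic_subsidy)
```

Using `Decimal` rather than floating-point numbers avoids rounding surprises in displayed prices; whether rates should combine multiplicatively or additively when several rules match is a design choice left open here.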

Use Case Two: Product Arrangement
Previous research has demonstrated that even marginal and seemingly irrelevant changes in a store's microenvironment can alter individuals' food choices in a predictable way, e.g., a more "prominent" positioning of a food item [42]. This is reflected in the well-known phrase "Eye-Level Is Buy-Level". Indeed, the impact of these and other shelf placement strategies on consumers' purchases is extensively documented in marketing literature (e.g., [14]). However, whether these interventions can also be successfully used to promote better food choices has not been completely evaluated [43]. Even if product presentation methods differ between traditional brick-and-mortar stores and online supermarkets, (digital) shelf placement seems to remain a relevant factor for online stores. Overall, initial findings suggest that small differences in assortment organization affect consumers' behavior similarly when tested in an online store [13,14]. However, there are still a couple of unresolved research questions that concern both physical and online supermarkets. For instance, little is known about how product bundling strategies (e.g., Organic Box Schemes) [44,45] or store-in-store concepts affect consumers' sustainable food choices. Even if some shelf management strategies show similar effects on consumers' behavior for both retail channels, it is much easier to change or even customize shelf or product arrangements within an online supermarket [44,45]. Hence, future research may benefit from a systematic evaluation of how customized digital shelf placement strategies affect consumers' food choices.
Accordingly, findings obtained in computer-simulated shopping experiments can provide generalizable insights. Thus, the VOS is suitable both for studies aiming to gather general findings on how organizing an assortment affects food choices and for studies analyzing particularities of absolute and relative display locations in online supermarkets. For product sorting and bundling, the application offers two kinds of filter types: filters based on tags (e.g., name of a product category) and filters based on attributes (e.g., organic). For each item, tags and attributes can be defined directly in the product database (Figure 4).
In this case, we created a filter tree for tags with one parent category and one child category, displayed automatically on the left-hand side of the shop view. In the same manner, a filter based on items' base attributes will be automatically implemented. This can be used to limit the selection, e.g., only including items with the trait "organic". Figure 5 shows a possible implementation. The base attribute filter allows multiple selections, whereas the tag filter requires a single choice. Alternatively, researchers can create filter trees in the VAI. This way, there are no limits to the filter tree's customizability and depth (see the VOS User Guide for further instructions on custom filtering). Further, researchers can use an additional sorting mechanism: "niceness". Niceness is an attribute on the item level, represented by a numeric value between zero and one. Based on this attribute, items from the database are sorted, beginning with the least nice items. This ordering mechanism allows individual researchers to define which items are always displayed first (see Figure 4).
Using these functions, researchers can edit (i) how products are arranged into categories and subcategories, (ii) how these categories are titled in the store, and (iii) which products are displayed in which order. Furthermore, they can use them to implement shop-in-shop concepts (e.g., a sustainable food substore).
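The interplay of the tag filter, the attribute filter, and the niceness ordering can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the VOS code: the `Item` structure and the `visible_items` function are hypothetical, the multiple-selection attribute filter is assumed to require all selected attributes, and "beginning with the least nice items" is read as sorting by ascending niceness value.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical product record with the fields used for arrangement."""
    name: str
    tags: set = field(default_factory=set)        # e.g., category names
    attributes: set = field(default_factory=set)  # e.g., "organic"
    niceness: float = 0.5                         # numeric value between 0 and 1


def visible_items(items, tag=None, attributes=frozenset()):
    """Filter by one tag (single choice) and several attributes, then sort."""
    selected = [
        it for it in items
        if (tag is None or tag in it.tags)
        and set(attributes) <= it.attributes  # assumption: all must match
    ]
    # Assumption: ascending niceness value; the name breaks ties deterministically.
    return sorted(selected, key=lambda it: (it.niceness, it.name))


catalog = [
    Item("Whole milk", {"drinks"}, set(), 0.4),
    Item("Oat drink", {"drinks"}, {"organic", "vegan"}, 0.1),
    Item("Organic milk", {"drinks"}, {"organic"}, 0.2),
    Item("Apples", {"fruit"}, {"organic"}, 0.0),
]

all_drinks = [it.name for it in visible_items(catalog, tag="drinks")]
organic_drinks = [it.name for it in visible_items(catalog, tag="drinks", attributes={"organic"})]
```

In this reading, a researcher who wants a sustainable alternative to appear first in its category would assign it the lowest niceness value within that category.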

Use Case Three: Product Labels and Scores
Besides interventions geared towards human affect, cognitively oriented interventions aiming to transfer knowledge and information remain widely used in food policies [33]. Prior research has illustrated that grocery shoppers do not pay much attention to detailed product information like nutritional information [45,46] but base their decisions on key information that is readily available and processable [47,48]. Consequently, simplified and salient front-of-package label formats may facilitate consumers' processing of nutritional information and product attributes (e.g., organic) at the point of sale [49].
Based on these empirical insights, many different food labels have been developed, empirically tested, and in some cases, launched [50][51][52][53]. However, the empirical evidence on the effectiveness of labeling strategies to promote more sustainable and healthier food choices is overall inconclusive and inconsistent on the question of which label format works best [33,54]. Consequently, there are numerous starting points for further research, such as (i) evaluating new labeling strategies (e.g., for animal welfare-friendly standards), (ii) analyzing the effectiveness of existing labels in visual online shops, or (iii) exploring the impact of the above-mentioned new technological possibilities in online supermarkets.
A VOS can support this type of research by allowing researchers to create new binary or multi-level food labeling schemes and scores. These can be assigned to specific store items directly in the interface. Further, legal definitions or other explanations can be provided for each claim on the food label or score (Figure 6). Moreover, all product information and the nutrition facts labels (nutrition information panel) for each item in the database can be edited via the VAI. Shoppers can access this information with just one click in the shop view, and it is automatically recorded whether they have done so or not. This might be interesting because empirical evidence points out that some consumers willfully ignore information if it helps them avoid an inner conflict (e.g., animal welfare concerns vs. meat consumption; [55]). Moreover, those labels that guarantee only low sustainability or animal welfare standards might bias consumers to overrate the actual product quality with regard to these two aspects [56]. This has been referred to as the label halo effect [57].

Use Case Four: Virtual Shopping Carts
In contrast to traditional brick-and-mortar stores, consumers cannot physically interact with a salesperson in an online store. To compensate for this, vendors have already started implementing a wide variety of decision-support tools within their online shops. These tools aim to assist shoppers with their purchase, for instance, by (i) providing information on demand (e.g., via a search bar or a chatbot), (ii) providing real-time feedback (e.g., about spending), or (iii) presenting personalized product recommendations. According to Häubl and Trifts [58], the way in which consumers search for product information and make purchase decisions is always a result of the sum of all the single interactions with different decision-support tools available in an online shopping environment.
One decision-support tool, and at the same time an integral part of every online store, is a VSC. Even the most simplistic VSC enables consumers to accumulate all want-to-buy products in a list and provides real-time feedback about the price of their goods. In contrast to a shopping trip without decision support (e.g., in a physical store), this can increase consumers' total spending and the spending for higher-priced, hedonic, or organic products [59,60]. Moreover, a VSC's lower transaction costs make it relatively convenient for consumers to adjust their current shopping cart by adding (or removing) single items or making changes in product quantity at any time and without much effort [61]. However, to the best of our knowledge, no research exists that analyzes the impact of different VSC designs on consumers' food choices and in-store behavior.
To capture this, VOS allows researchers to choose between three different types of VSCs: (i) "Icon Only", (ii) "Icon Plus", and (iii) "Pop-Up" (Figure 7). They vary in how many "clicks" a user has to invest in reviewing their present spending and adding, removing, or replacing items. For the first type (Icon Only), only a small dynamic cart icon is displayed in the upper right corner of the header on category pages and product pages. This icon provides three basic functionalities: it (i) notifies users when a new item is added to the shopping cart, (ii) shows the number of total items in the cart, and (iii) provides a link to a separate full-page cart. Only on the full-page cart can consumers find all the details of their transactions and edit their cart before continuing to shop or proceeding to the checkout page. The second type (Icon Plus) is identical to the previous version, except that the dynamic cart icon additionally shows the total spending. The third type (Pop-Up) includes a mini shopping cart, which will be displayed if users hover their mouse above the shopping cart icon. It is designed to provide users with a compact version of the main shopping cart page with all information and functionalities while keeping them on the product pages to continue shopping. Via the treatment configuration functions in the admin view, researchers can choose which VSC type they want to implement for which experimental treatment.

Use Case Five: Swap Options
We decided to include swaps as a second use case for decision-support tools because they represent a highly transparent and, at the same time, effective recommendation agent. On the one hand, users can easily identify when, where (type-transparency), how, and for what purpose (token-transparency) swap interventions were used [62,63]. On the other hand, various empirical studies document a statistically significant effect of swap interventions on consumers' food purchases in real and virtual supermarket settings [64][65][66].
However, there are still many open questions on the impact of swap interventions [66]. For instance, it is unknown for which product categories and attributes consumers are more likely to accept swaps. In particular, no study has examined how effective this type of intervention is in promoting the sale of products with credence attributes like animal welfare or sustainability. Moreover, more evidence is needed to determine if swap interventions are more effective when presented immediately (e.g., after clicking on the "Add to Cart" button) or when bundled at checkout [25]. The same applies to the degree of freedom of choice that consumers should be given. For instance, should swap options be automatically displayed by default, or should consumers be obliged to determine if and when swaps are displayed (forced-choice)?
To address these and similar research questions, VOS provides swap options for food items at different points of the shopping trip: (i) when adding items to the shopping cart or (ii) when finishing the shopping trip by checking out (see Figure 8).
In each case, a pop-up notification informs participants that a swap option is available for the chosen product(s) (swap dialog). In addition, it can also be left for participants to decide whether they want to receive swap options or not (forced-choice). In this case, the swap dialog does not automatically provide a swap option. Instead, participants can determine whether they want to (i) see swap options if available or (ii) not to receive future notifications about swap options. Again, researchers can configure these aspects directly in the VAI. Unfortunately, so far, the VAI cannot define which swap options should be shown for which item. Instead, swaps must be assigned in the original product database on which the application is based. You can find more information about this procedure in the VOS User Guide.
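The decision logic behind the swap dialog can be summarized in a short Python sketch. It is a simplified model under stated assumptions and not the application's code: `SwapTiming`, `SwapConfig`, `swaps_to_offer`, the item IDs, and the swap table are hypothetical, and under forced choice the sketch assumes that swap options appear only after a participant has explicitly opted in.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Optional


class SwapTiming(Enum):
    ON_ADD = "on_add"            # dialog right after adding an item to the cart
    AT_CHECKOUT = "at_checkout"  # dialog bundled at checkout


@dataclass(frozen=True)
class SwapConfig:
    """Hypothetical per-treatment swap settings."""
    timing: SwapTiming
    forced_choice: bool = False  # participant decides whether swaps are shown


def swaps_to_offer(
    event: SwapTiming,
    item_ids: Iterable[int],
    swap_table: Dict[int, int],
    config: SwapConfig,
    opted_in: Optional[bool] = None,
) -> Dict[int, int]:
    """Return {item_id: alternative_id} for the swap dialog at this event."""
    if event is not config.timing:
        return {}
    # Assumption: under forced choice, options appear only after an explicit opt-in.
    if config.forced_choice and opted_in is not True:
        return {}
    return {i: swap_table[i] for i in item_ids if i in swap_table}


# Hypothetical swap assignments, as they might be stored in the product database.
SWAPS = {101: 205, 102: 206}

on_add = SwapConfig(timing=SwapTiming.ON_ADD)
offer_now = swaps_to_offer(SwapTiming.ON_ADD, [101], SWAPS, on_add)
offer_later = swaps_to_offer(SwapTiming.AT_CHECKOUT, [101, 102], SWAPS, on_add)

forced = SwapConfig(timing=SwapTiming.AT_CHECKOUT, forced_choice=True)
undecided = swaps_to_offer(SwapTiming.AT_CHECKOUT, [101, 102, 103], SWAPS, forced)
accepted = swaps_to_offer(SwapTiming.AT_CHECKOUT, [101, 102, 103], SWAPS, forced, opted_in=True)
```

Items without an entry in the swap table (here, item 103) simply trigger no dialog, which mirrors the fact that swaps have to be assigned per item in the product database.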

Data Recording: Behavioral Outputs and In-Store Behavioral Dynamics
By default, the data automatically recorded by the application includes many outcome variables that cover participants' purchases. However, when the focus is solely on these outputs, in-store decision-making happens largely within a black box [67,68]. Hence, researchers interested in more in-depth insights about subjects' shopping also need data on in-store behavioral dynamics. However, this kind of data is often not available or has to be reconstructed laboriously, e.g., by using complex procedures like video screen capturing [69]. To counteract this, the VOS automatically records several actions taken by a subject during an experimental shopping trip. This includes the navigation path (routing), filtering or sorting options used, all actions taken on a page (pagination events), and all additions, changes, and removals to and from the shopping cart. This data can later be converted into variables and thus used for statistical analysis. In this manner, we receive (among others) information about a shopper's (i) shopping duration, (ii) first orientation time on the website, (iii) add-purchase ratio for items in the VSC, or (iv) number of detailed product views. Furthermore, some use case-specific data is recorded. For example, for "Sustainable Swap Interventions", data is recorded on (i) which item triggered a swap dialog, (ii) the time when a swap dialog starts/ends, and (iii) whether a swap option offered was accepted or rejected. As each action has a unique timestamp, in-store behavior data can be used both for aggregate-level analyses and for in-depth analyses like tracking a single user's behavior on the website. All data can be downloaded from the server in JavaScript Object Notation (JSON) format and reformatted in any number of ways. Example scripts for extracting and reformatting the data can be found in the VOS User Guide.
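As an illustration of how timestamped events could be converted into analysis variables, the following Python sketch derives the four measures named above from a small JSON log. The event schema (`t`, `type`, `path`, `item`), the URL pattern for product pages, and the operational definitions of orientation time and add-purchase ratio are assumptions made for this example; the actual export format is described in the VOS User Guide.

```python
import json
from datetime import datetime

# Hypothetical event log for one participant (schema assumed for illustration).
SAMPLE_LOG = """
[
  {"t": "2019-12-02T10:00:00", "type": "route", "path": "/shop"},
  {"t": "2019-12-02T10:00:12", "type": "route", "path": "/shop/category/dairy"},
  {"t": "2019-12-02T10:00:30", "type": "route", "path": "/shop/product/17"},
  {"t": "2019-12-02T10:00:41", "type": "cart_add", "item": 17},
  {"t": "2019-12-02T10:01:05", "type": "cart_add", "item": 42},
  {"t": "2019-12-02T10:01:20", "type": "cart_remove", "item": 42},
  {"t": "2019-12-02T10:02:00", "type": "checkout"}
]
"""


def summarize(events):
    """Derive per-participant variables from a list of timestamped events."""
    if not events:
        raise ValueError("event log is empty")
    events = sorted(events, key=lambda e: datetime.fromisoformat(e["t"]))
    start = datetime.fromisoformat(events[0]["t"])
    end = datetime.fromisoformat(events[-1]["t"])

    # Assumption: orientation time = delay between landing and the next action.
    first_action = datetime.fromisoformat(events[1]["t"]) if len(events) > 1 else start

    added, cart = set(), set()
    for e in events:
        if e["type"] == "cart_add":
            added.add(e["item"])
            cart.add(e["item"])
        elif e["type"] == "cart_remove":
            cart.discard(e["item"])

    return {
        "shopping_duration_s": (end - start).total_seconds(),
        "first_orientation_s": (first_action - start).total_seconds(),
        # Assumption: distinct items in the final cart / distinct items ever added.
        "add_purchase_ratio": len(cart) / len(added) if added else None,
        "product_views": sum(
            1 for e in events
            if e["type"] == "route" and e.get("path", "").startswith("/shop/product/")
        ),
    }


summary = summarize(json.loads(SAMPLE_LOG))
```

Applying `summarize` to each participant's log yields one row per participant, which can then be merged with treatment assignments and questionnaire responses for statistical analysis.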
In addition to behavioral data, it is possible to obtain further self-reported data by including configurable questionnaires. These can be positioned before and/or after the actual shopping task. Thus, they can function as screening questionnaires, comprehension checks, manipulation checks, or post-experimental questionnaires.

Using a VOS: Results from a Pilot Study
This section describes the results gained from a pilot study aiming to evaluate the app's technical functionality and gain feedback about users' shopping experience. For this study, we used the base version of our application, which was populated with a representative stock of food items. Subjects in our study were free to choose between 6619 food items from seven general categories. However, this product database comes with a few drawbacks, especially compared to the offline shopping experience. These include an over-representation of certain food categories. For instance, convenience foods and goods with long shelf lives are more commonly available in online supermarkets.
Since it is the application's primary objective to enable researchers to conduct meaningful experiments, it needs to be able to generate genuine user data. Hence, this study should also verify if users have the sensation of interacting with a real online shopping environment. Additionally, it aimed to generate feedback for further refinement and development of the online shopping experience. Finally, yet notably, the pilot study served as a showcase to illustrate which data can be obtained from using the VOS and how it can be analyzed. The design and results of this pilot study are described below.

Procedure
The study was conducted online over ten days in December 2019. Subjects were recruited from a pool of students and university staff members. In total, the VOS was intensively tested by 29 people. All subjects had to complete the same shopping task. In particular, they were asked to shop for groceries to cover their household's needs for one week. Subjects were prompted to select items and amounts similar to their actual grocery shopping behavior. There were few restrictions except that subjects were obliged to shop for at least 12 unique items to complete the task. This was imposed for two reasons: (i) 12 items is the average number of products bought per supermarket trip (e.g., [70]) and (ii) we wanted to ensure that participants interacted with the application seriously. Subjects in our study did not actually purchase the selected items. Nonetheless, previous studies have illustrated that even hypothetical purchasing scenarios can provide pertinent data, in particular when focusing on in-store behavior rather than on analyzing the final VSC [71,72]. After subjects completed this task, they were asked to evaluate their shopping experience with a questionnaire. In addition, basic demographic facts, individuals' grocery shopping habits, and basic economic data were queried.

Results and Discussion
None of our subjects reported any major technical problems with the VOS, and all data was accurately sent to our server. The subjects covered a wide age range from 18 up to 54 years (M = 28.76). Most subjects had a high level of education (75.9 percent; this criterion was fulfilled if subjects indicated that they held at least a high school diploma), were single (51.7 percent), and lived in their own apartment (62.1 percent). This suggests that most subjects were responsible for grocery shopping in their households. Indeed, on average, they went grocery shopping 2.55 times per week and spent €30.10 per shopping trip.

User Experience
Further, we used four constructs to measure users' experience with the VOS regarding their (i) general satisfaction (GS) with the application, (ii) its information quality (IQ), (iii) its system quality (SQ), and (iv) its realism level (RL). Table 1 provides an overview of all subconstructs belonging to these constructs and the associated descriptive statistics (for a complete overview of all questionnaire items used, see Appendix B). All items were measured using a seven-point Likert scale.
Overall, subjects were satisfied with their shopping experience in our online supermarket (GS.01, mean: 4.89), and their expectations of the application were mainly met (GS.02, mean: 4.96). In addition, on average the subjects agreed rather than disagreed with the statement "I would buy groceries in a real online supermarket, similar to the VOS" (GS.03, mean: 4.93).
Concerning the information quality, subjects predominantly rated the understandability (USS, mean: 5.68), reliability (RS, mean: 5.73), and usefulness (UFS, mean: 5.02) of the available product information positively.
To measure user-friendliness, we asked subjects to rate whether or not (SQ.04) the store's layout is simple, (SQ.05) it is easy to use, (SQ.06) it is well organized, (SQ.07) it is possible to see as many products as possible at a glance, (SQ.08) it is possible to easily compare different products, (SQ.09) multiple product images and display formats are available (e.g., zoomed images), (SQ.10) its design is straightforward, and (SQ.11) it is user-friendly in general. To evaluate the site's navigation, we asked subjects to rate whether or not it has made it possible to (SQ.12) easily go back and forth between pages, (SQ.13) locate the information they need with just a few clicks, (SQ.14) locate the products they prefer as quickly as possible, (SQ.15) edit their shopping cart as quickly and easily as possible (e.g., to add products), and (SQ.16) easily navigate.

D. Realism Level Score (RL)
To measure the realism level of our online shopping simulation, we asked subjects whether or not (RL.01) VOS has given them the feeling of using a real online store, (RL.02) their purchases correspond to their regular shopping behavior, (RL.03) their decisions reflect their regular in-store behavior (e.g., product comparisons), and (RL.04) their gathered information while shopping reflects their behavior on a regular shopping trip.

Notes. Each score was calculated as an unweighted sum index from the items underlying the respective construct.
The subjects were satisfied with information quality and system quality (SQ, mean: 5.43). In particular, subjects rated the usability of our tool positively (USA, mean: 5.80). For instance, they perceived good responsiveness (SQ.01, mean: 6.00) and fast loading times (SQ.02, mean: 6.04). Additionally, the tool's user-friendliness was evaluated positively (UFS, mean: 5.22).
For instance, subjects stated that the shop had a straightforward design (SQ.10, mean: 5.61) and a simple layout (SQ.04, mean: 5.7), that all functions could be found as expected (SQ.06, mean: 5.71), and that it was easy to use (SQ.05, mean: 5.96). In sum, the feedback indicates that the design and structure of the shopping user interface (UI) successfully guided our users. In this article, we use the term user interface (UI) to describe the graphical environment (shop view) that allows users to interact with an online shop.
However, comments in a free-text field yielded valuable hints about what users perceived as impeding their shopping experience. The main criticism referred to the free-text search bar at the top of the page, because only specific terms yielded the desired results (e.g., baked beans vs. beans). This was mainly due to the information quality of the item pool: the free-text search's current implementation was based on pattern matching of the search term against the item name and brand attributes. For a more refined product search experience, well-annotated item data would be necessary. Secondly, several participants had issues with the absence of components usually featured in online shops. For example, they mentioned missing a landing page, remarked on an unusual color scheme for a grocery store, and noted the absence of advertising, upselling, and product suggestions. These are valid points of criticism, but such features were deliberately excluded from the project's scope. They are broad and can be implemented in many different ways. We decided against including them in the base version of the application because their functionality is specific and not easily configurable.
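The search limitation described above can be illustrated with a minimal sketch. It assumes that the search is a case-insensitive substring match of the whole search term against the name and brand fields; the `Item` shape, the function name, and the sample products are hypothetical and are not taken from the VOS code base.

```typescript
// Hypothetical item shape: only `name` and `brand` are searched.
interface Item {
  name: string;
  brand: string;
}

// Case-insensitive substring match of the whole term against name or brand
// (an assumption about how the pattern matching behaves).
function searchItems(items: Item[], term: string): Item[] {
  const needle = term.trim().toLowerCase();
  if (needle === "") {
    return [];
  }
  return items.filter(function (item) {
    return (
      item.name.toLowerCase().indexOf(needle) !== -1 ||
      item.brand.toLowerCase().indexOf(needle) !== -1
    );
  });
}

// Made-up item pool for illustration only.
const pool: Item[] = [
  { name: "Beans in tomato sauce", brand: "Example Brand" },
  { name: "Green beans", brand: "Store Brand" },
  { name: "Whole milk", brand: "Example Dairy" },
];

// The general term finds both bean products.
const broad = searchItems(pool, "beans");
// The specific phrase finds nothing, because no name or brand contains it verbatim.
const narrow = searchItems(pool, "baked beans");
console.log(broad.length, narrow.length);
```

Under this assumption, a product that users think of as "baked beans" is only found if that exact phrase appears in its name or brand, which is why richer item annotations (e.g., synonyms or categories) would be needed for a more forgiving search.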
Despite these limitations, subjects reported that the VOS gave them the feeling of using a real online store (RL.01, mean: 5.32). Further, they stated that their purchases (RL.02, mean: 5.61) and in-store behavior (e.g., use of information) were mostly in line with their usual shopping habits (RL.03, mean: 4.93; RL.04, mean: 5.29). Hence, the average realism level score of 5.28 points indicates that the VOS can convey an experience similar to a real online supermarket and can generate meaningful data about customer behavior. The next section illustrates how these data might be analyzed.
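As a rough plausibility check, the realism level score can be approximated from the four item means reported above. The sketch assumes that the unweighted index gives every item the same weight and is expressed on the original seven-point scale (i.e., the item sum divided by the number of items); the function name is hypothetical.

```typescript
// Item means for RL.01 to RL.04 as reported in the text.
const realismItemMeans: number[] = [5.32, 5.61, 4.93, 5.29];

// Unweighted index: every item contributes equally (assumed to be scaled
// back to the seven-point range by dividing the sum by the item count).
function unweightedMean(values: number[]): number {
  if (values.length === 0) {
    throw new Error("cannot average an empty list");
  }
  let sum = 0;
  for (const v of values) {
    sum += v;
  }
  return sum / values.length;
}

const realismScore = unweightedMean(realismItemMeans);
console.log(realismScore.toFixed(2));
```

The result is approximately 5.29, which is in line with the reported score of 5.28; a small gap of this kind is expected when a score computed from per-subject data is compared with one recomputed from rounded item means.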

Purchase Data
Besides user experience, we were also interested in finding out whether the application provides a usable data structure for analyzing participants' purchases. For this purpose, participants' final shopping cart was recorded. Table 2 provides an overview of the variables we used as examples in our study. The variables presented are only a selection of the possible outcome variables that might be calculated from the recorded data. For instance, as evident from the table, we focused on organically produced foods and store brands. However, these variables can easily be adapted for other product attributes such as vegan, animal welfare, or local origin.
It must be kept in mind that we primarily wanted to evaluate the functionalities of our application. Due to the small number of participants, no generalizable statements about shopping behavior in an online supermarket can be derived from this pilot study. Moreover, with such a small sample, our data are sensitive to outliers. We therefore opted to report the median instead of the mean in the following.
In total, 788 items were purchased, which indicates that the VOS was tested extensively. The median subject purchased 13 different food items. Moreover, total spending (median: €33.03) was relatively close to subjects' self-reported expenses for groceries per shopping trip (median: €25.00). Together with the median basket size (18 items), this suggests that subjects showed realistic shopping behavior. This impression coincides with subjects' self-reported realism level presented in the previous section. The share of organic items in total purchases was relatively high (median: 26.67 percent) compared to the market share of organic food in Germany (11.97 percent in 2019; see [73]). Further, the majority of purchased organic food items were store brands (median: 85.00 percent). This is not surprising, as price premiums for organic food are high, and store brands can offer cheaper alternatives to other organic food brands.
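The following minimal sketch shows how outcome variables of this kind could be derived from recorded shopping carts. The `CartLine` shape, the function names, and the three example baskets are hypothetical; the actual variable definitions are those given in Table 2, and the exported VOS data may be structured differently.

```typescript
// Hypothetical shape of one line in a subject's final shopping cart.
interface CartLine {
  organic: boolean;
  storeBrand: boolean;
  quantity: number;
}

// Median of a list of numbers (mean of the two middle values for even lengths).
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Share of organic units among all purchased units, in percent.
function organicShare(cart: CartLine[]): number {
  let total = 0;
  let organic = 0;
  for (const line of cart) {
    total += line.quantity;
    if (line.organic) {
      organic += line.quantity;
    }
  }
  return total === 0 ? 0 : (100 * organic) / total;
}

// Made-up baskets of three subjects, for illustration only.
const carts: CartLine[][] = [
  [
    { organic: true, storeBrand: true, quantity: 2 },
    { organic: false, storeBrand: false, quantity: 2 },
  ],
  [
    { organic: true, storeBrand: false, quantity: 1 },
    { organic: false, storeBrand: true, quantity: 3 },
  ],
  [{ organic: false, storeBrand: false, quantity: 4 }],
];

const shares = carts.map(organicShare);
const medianOrganicShare = median(shares);
console.log(shares, medianOrganicShare);
```

The same pattern applies to other attributes: replacing the `organic` flag with, for example, a vegan or local-origin flag yields the corresponding share without changing the rest of the analysis.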

In-Store Behavior
We used data about participants' interactions with the website to derive several variables that helped us analyze participants' in-store behavior dynamics while shopping online for groceries. Table 3 provides an overview of all considered variables and how these were conceptualized. The selection of variables presented here is based on our own research interests. Of course, researchers are not limited to this set of variables but can adapt or extend it to their own research question.
As evident in the table, subjects spent a median of 6 min and 19 s shopping in our virtual online supermarket. Hence, shopping was faster than indicated by previous studies, in which estimates for average (online) supermarket shopping durations ranged between 13 and 40 min [7,67,70]. The total number of user interactions was relatively low (median: 47.00), and it took subjects a median of just 3.62 interactions on the website to purchase an item. Consequently, many selections were made without having viewed many different pages beforehand.
Moreover, most subjects adjusted their VSC only slightly or not at all during their shopping trip. In particular, items were rarely removed or edited (e.g., change in amount) once they had been added to the shopping cart. Consequently, the add-to-purchase ratio was close to one, both for the mean (0.96) and for the median. Further, the cart-to-detail ratio (median: 0.14) indicates that subjects did not view the detailed product pages for the majority of the items they selected. Hence, most subjects in this sample showed straightforward shopping behavior. They were not interested in extensively browsing and comparing different products in our online supermarket. Instead, purchasing decisions were made relatively quickly in most cases, without looking at individual items in more detail. This behavior is consistent with shopping patterns found in brick-and-mortar supermarkets (e.g., [74,75]). After logging onto our virtual online supermarket, it took subjects a median of less than one minute to add their first item to the cart. In addition to subjects' self-reported user experience, we considered this another indicator of the good usability and user-friendliness of our tool.
Notes. * We had to use the subjects' last visit to the VSC full-screen site as a proxy for completing the purchase, since we did not record when the final selected shopping cart was confirmed. However, this shortcoming can be easily fixed for future studies.
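To make the in-store behavior variables more concrete, the sketch below derives them from a simplified interaction log of one subject. The event types, field names, and in particular the ratio definitions are assumptions made for illustration (the authoritative definitions are those in Table 3): the add-to-purchase ratio is taken as purchased items divided by items ever added to the cart, and the cart-to-detail ratio as the share of purchased items whose detail page was viewed.

```typescript
// Hypothetical event log entry; `t` is seconds since login.
type EventType = "pageView" | "detailView" | "addToCart" | "removeFromCart";

interface Interaction {
  type: EventType;
  t: number;
  itemId?: string;
}

interface BehaviorMetrics {
  interactionsPerItem: number;
  addToPurchaseRatio: number;
  cartToDetailRatio: number;
  secondsToFirstAdd: number | null;
}

function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

// Assumes the log is ordered by time.
function behaviorMetrics(log: Interaction[]): BehaviorMetrics {
  const added: { [id: string]: boolean } = {};
  const inCart: { [id: string]: boolean } = {};
  const viewed: { [id: string]: boolean } = {};
  let firstAdd: number | null = null;

  for (const e of log) {
    if (e.itemId === undefined) {
      continue; // plain page views carry no item
    }
    if (e.type === "addToCart") {
      added[e.itemId] = true;
      inCart[e.itemId] = true;
      if (firstAdd === null) {
        firstAdd = e.t;
      }
    } else if (e.type === "removeFromCart") {
      inCart[e.itemId] = false;
    } else if (e.type === "detailView") {
      viewed[e.itemId] = true;
    }
  }

  const addedIds = Object.keys(added);
  const purchased = addedIds.filter(function (id) { return inCart[id] === true; });
  const purchasedViewed = purchased.filter(function (id) { return viewed[id] === true; });

  return {
    interactionsPerItem: ratio(log.length, purchased.length),
    addToPurchaseRatio: ratio(purchased.length, addedIds.length),
    cartToDetailRatio: ratio(purchasedViewed.length, purchased.length),
    secondsToFirstAdd: firstAdd,
  };
}

// Made-up log: three items added, one removed, one detail page viewed.
const sampleLog: Interaction[] = [
  { type: "pageView", t: 0 },
  { type: "addToCart", t: 40, itemId: "milk" },
  { type: "detailView", t: 70, itemId: "bread" },
  { type: "addToCart", t: 90, itemId: "bread" },
  { type: "addToCart", t: 120, itemId: "soda" },
  { type: "removeFromCart", t: 150, itemId: "soda" },
  { type: "pageView", t: 160 },
];

const metrics = behaviorMetrics(sampleLog);
console.log(metrics);
```

Per-subject values computed this way can then be summarized across subjects with the median, as done for the figures reported in this section.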
Overall, data from this pilot study suggests that subjects exhibited a reasonable purchasing behavior and the application gave them the impression of shopping at a real online supermarket. This indicates that the VOS can convey an experience similar to a real online grocery store, thereby generating meaningful data about its customers' behavior. Moreover, it recorded all actions taken by the individual subjects and provided this data as expected. Hence, VOS supplied the necessary means for conducting our study.
While users evaluated their experience predominantly positively, their feedback also suggests areas for improvement. Regarding system quality, future research projects may improve query and data retrieval between front- and back-end. The free-text search bar, which currently only searches for the queried term in the product's name or brand, should be extended to provide higher versatility. Furthermore, researchers may direct their attention towards augmenting the shopping experience, e.g., by implementing a dedicated landing page where subjects start their shopping task. Similar considerations apply to product pages. In addition, the user experience may benefit from viewing multiple product images, choosing between different display formats (e.g., zoom), or having complementary media such as product videos available.

Conclusions
This paper introduced the VOS, an open-source web-based application designed to run computer-simulated shopping experiments. While its front-end emulates a modern online supermarket's design and functions, its back-end provides a visual administration interface that enables researchers to create and modify different experimental conditions easily. This feature makes the VOS a modular, highly customizable, and usable research tool for analyzing consumer behavior in (online) supermarkets. Researchers can build on several preconfigured use cases to implement and test different in-store interventions, including economic incentives and choice architecture techniques. The base version already includes possibilities to configure (i) taxes and subsidies, (ii) product arrangement and placement, and (iii) product labeling, and to choose between different (iv) types of VSCs and (v) swap options. Further, we presented the results of a pilot study, which showed that the functionalities of our tool worked as intended and that subjects evaluated their shopping experience as realistic and user-friendly.
We consider the VOS to be a useful tool for testing a broad range of policy interventions in a realistic online shopping environment. This can be done at relatively low cost, without a complex implementation process, and without collaborating with a specific retailer. Hence, the VOS offers researchers the opportunity to heighten the realism level of their experimental designs. Moreover, extensive options for automatic data recording allow analyzing purchases and shedding light on the black box of consumers' in-store decision-making. In further analyzing the particularities of digital interactions in online grocery shopping, scholars could also shed further light on the ethical strings attached to influencing users and customers in digital environments [76].
In summary, we hope that our tool will support researchers in exploring key issues that might be conducive to understanding and promoting sustainable consumption. Policy makers and industrial stakeholders can benefit from knowledge that informs the design of future policy interventions, (in-store) marketing communication, product packaging, or the shop design of (online) supermarkets in general.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Brief Information about Technology and Programming
Appendix A.1. Application Development Frameworks
Our application is built upon two development frameworks to decrease development time, reduce maintenance cost, and support quality assurance. For the back-end, Node.js is used to provide a web-based service, offer an Application Programming Interface (API) for handling requests, and run Create Read Update Delete (CRUD) operations on the database. For the web front-end, Angular version 8 is used. Both frameworks are based on JavaScript.

Appendix A.2. Model View Controller (MVC)
The implementation style of our application follows the model view controller (MVC) concept. Working with MVC in web application development is different from conventional application development. The architecture has to be partitioned between the client and the server-side. The client-side always handles a web application's view, but the model and controller can be partitioned in various ways between the client and server. At one extreme, the architecture would rely exclusively on the server to refresh the client's screen. In this case, the model and the view-generating logic for the client's browser would reside entirely on the server. Moreover, the controller would partially reside on the client (detecting user interaction) but mostly reside on the server (code that updates the state of the model's business objects based on an HTTP request). This describes a thin-client approach with the advantages of decreasing the client machine's performance demand and providing greater security, performance, and data consistency for the application. Web application frameworks that reflect this paradigm are Django and ASP.NET.
The other extreme is maintaining the bulk of the application on the client-side (fat-client approach). This means that the model mostly resides on the client-side, but the database remains on the server-side. In particular, the view is exclusively implemented on the client-side, and the controller mostly resides there. This provides a more seamless and interactive experience with fewer page loads, minimizing the need to make server calls. Frameworks that support this style of partitioning are AngularJS, EmberJS, and JavaScriptMVC.

Appendix A.3. Front-End Programming
The views are built directly with HTML5 in conjunction with style sheets written in SCSS (Sassy CSS) to provide a visually appealing and user-friendly experience. This combination is supported by most browsers natively, which means a wide range of devices can be supported. In this way, a responsive front-end web design is provided, which is consistent between devices and browsers. This combination allows for the separation of presentation and content, the reduction of repetitive code, flexibility, control of the presentation, and sharing of formats between views. The web-view also utilizes Bootstrap and Angular Material, which are CSS libraries that offer standardized web-content styling and component options. Therefore, developers benefit from the ease of use and accessibility of these frameworks for building visually engaging views. Moreover, they are still able to determine custom styles and layouts centrally.
The front-end codebase is designed and implemented to deliver an accessible experience to both users and developers. Features and design elements are designed to encapsulate specific functionality or business logic. Hence, the code is partitioned into feature modules (which house the model's logic), view, and controller, represented by the components contained. Utilizing this implementation logic makes it easy to tease apart singular design elements, use case implementations, and feature sets for subsequent expansion and development. Singular design elements of the application are thus contained in one easily recognizable subfolder (see Figure A1). Changing the appearance or behavior of such a design feature means locating the associated component and altering its source files. These component definitions are divided into (i) the template (*.html file), which handles the display elements, (ii) the style sheet (*.scss file), which defines the styles for this component, and (iii) the component class (*.ts file), which handles the data CRUD, data binding, and event handling of the specific component. In addition to this, the implementation of our application follows a strict separation of concerns. Functionality like data handling, CRUD operations, and event recording is centralized and separated from component view logic. Injectable services offer reusable access to functions that are generally consumed by multiple components. Furthermore, the services are descriptively modeled to offer all functionality connected with a specific data model. For example, CRUD operations connected to the shopping cart component are combined into a shopping-cart.services.ts file. If any view component needs access to the specific functions and data of that topic, it has to inject the shared instance of the service into its constructor, thereby gaining access. This offers organized, reusable, and easy access to all operations needed throughout view components. Even if any implementation details change, these changes need only be applied in one file.
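The service pattern described above can be sketched in plain TypeScript without the Angular framework. All class and method names below are hypothetical and do not reproduce the VOS source; in an Angular application, the service class would typically be marked with the `@Injectable` decorator and the framework's dependency injection would supply the shared instance, whereas here the instance is passed to the constructors by hand.

```typescript
interface CartEntry {
  itemId: string;
  amount: number;
}

// Plays the role of the injectable service: one place for all cart operations.
class ShoppingCartService {
  private entries: CartEntry[] = [];

  add(itemId: string, amount: number): void {
    for (const entry of this.entries) {
      if (entry.itemId === itemId) {
        entry.amount += amount;
        return;
      }
    }
    this.entries.push({ itemId: itemId, amount: amount });
  }

  remove(itemId: string): void {
    this.entries = this.entries.filter(function (e) { return e.itemId !== itemId; });
  }

  // Returns copies so that components cannot modify the cart behind the service's back.
  list(): CartEntry[] {
    return this.entries.map(function (e) { return { itemId: e.itemId, amount: e.amount }; });
  }

  totalAmount(): number {
    let total = 0;
    for (const entry of this.entries) {
      total += entry.amount;
    }
    return total;
  }
}

// Two hypothetical view components receive the same service instance
// through their constructors and contain no cart logic of their own.
class ProductListComponent {
  constructor(private cart: ShoppingCartService) {}

  onAddClicked(itemId: string): void {
    this.cart.add(itemId, 1);
  }
}

class CartBadgeComponent {
  constructor(private cart: ShoppingCartService) {}

  label(): string {
    return this.cart.totalAmount() + " items";
  }
}

const sharedCart = new ShoppingCartService();
const productList = new ProductListComponent(sharedCart);
const badge = new CartBadgeComponent(sharedCart);

productList.onAddClicked("milk");
productList.onAddClicked("milk");
productList.onAddClicked("bread");
console.log(badge.label());
```

Because both components talk to the same instance, a change made through the product list is immediately visible to the badge, and a change to how the cart is stored would only touch the service class.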

Appendix A.4. Back-End Programming
The back-end is a representational state transfer (REST) API. It exposes the data and functions and facilitates the interaction between the database and the front-end application. Moreover, it exposes endpoints that respond to client requests in a predictable manner. The web services are stateless, as they do not maintain the state of each client application accessing the web service, instead offering predefined sets of stateless operations. This allows the back-end to remain independent of the front-end application, meaning that these web services may serve different client applications and can be interacted with with or without the front-end application. This independence offers the advantage that researchers are not limited to using the visual treatment edit interface to interact with and change treatment aspects or items. Scripts can be written that automate treatment creation, modification, and data analysis tasks (examples can be found in the VOS User Guide).
As mentioned above, the back-end is based on Node.js, an event-driven JavaScript runtime environment that works outside of the browser. This allows for continuous utilization of JavaScript in both application areas and reduces entry barriers for developers, as knowledge of only one programming language is required.
Using JavaScript for the REST API does not incur the performance decrease that might generally be expected. Node.js is built on V8 and libuv: V8, a JavaScript engine written in C++, compiles JavaScript to native machine code at run time, while libuv, a C library, provides the event loop and asynchronous I/O. This combines the ease of use attributed to JavaScript with the performance attributed to natively compiled code. Node.js is also highly scalable without threading, instead utilizing a simplified model of event-driven programming with callbacks that signal a task's completion. However, this essentially single-threaded approach means that a single application instance cannot scale vertically, that is, merely adding computing power to a given system will not directly translate into an increase in application performance. Despite this, the application is still capable of scaling by running several concurrent instances under one cluster manager (cluster mode), which distributes the workload among the available application instances. Production-ready, open-source load balancing software for Node.js applications is available free of charge; see, for example, PM2.
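As a rough illustration of such scripting, the sketch below prepares the HTTP requests that a script could send to create several treatments in one go. The endpoint path `/treatments`, the base URL, and the treatment fields are purely hypothetical placeholders and are not taken from the VOS API; the actual routes and payloads are documented in the VOS User Guide.

```typescript
// Hypothetical treatment definition; the real fields are defined by the VOS data model.
interface TreatmentDraft {
  name: string;
  taxPercent: number;
}

// Plain description of one HTTP request, independent of any HTTP client.
interface HttpRequest {
  method: string;
  url: string;
  headers: { [name: string]: string };
  body: string;
}

// Builds one self-contained POST request per treatment. Because the API is
// stateless, each request carries everything the server needs to handle it.
function buildCreateRequests(baseUrl: string, drafts: TreatmentDraft[]): HttpRequest[] {
  return drafts.map(function (draft) {
    const request: HttpRequest = {
      method: "POST",
      url: baseUrl + "/treatments", // hypothetical endpoint
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(draft),
    };
    return request;
  });
}

// Made-up experimental conditions for illustration only.
const drafts: TreatmentDraft[] = [
  { name: "control", taxPercent: 0 },
  { name: "sugar-tax", taxPercent: 20 },
];

const requests = buildCreateRequests("http://localhost:3000", drafts);
for (const r of requests) {
  console.log(r.method, r.url, r.body);
}
```

Each request description could then be handed to any HTTP client; keeping the construction separate from the sending step also makes it easy to review or log the planned changes before they reach the server.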
Moreover, the repository structure of the back-end application is modeled to promote easy access and understandability. Thereby, the subfolder structure represents the data structure utilized throughout the project. For instance, an item and all its associated CRUD operations are contained in one subfolder (see Figure A2): models (*.model.js files) enforce the document structure, and routes (*.route.js files) define the request endpoints and, with that, access to all the operations that can be performed on the data objects. Additionally, functions collect reusable logic used throughout the route definitions. Middleware functions (*.middleware.js files) provide necessary state information to the otherwise stateless endpoints. This makes the project accessible for further development, as the application structure can be deduced from the repository structure. Effects from changing aspects of the data structure are contained in this subfolder and do not affect the overall application. This makes it easy to add, change, and customize the functions, data models, and route specifications implemented in the base application.
Figure A2. Sample repository structure as used in the back-end.

Appendix A.5. Availability
The code is developed using Git, a source-code versioning system. This encourages good backup and versioning practices and allows developers to synchronize files across computers, develop collaboratively, manage separate branches, and resolve merge conflicts. The project is open for other researchers to join, collaborate, or just download the source code on GitHub. Developers will find more detailed information on the VOS programming in the VOS User Guide.
Notes: Items have been translated from German. All items have been measured on a 7-point Likert scale (1 = "I do not agree at all" and 7 = "I completely agree").