The Impact of Language and Location on the Curriculum Preferences of Online Entrepreneurship Students: An Application of Google Analytics

Abstract: The needs and interests of online learners can be difficult to assess. Large, self-paced, open courses attract learners from different locations, ethnicities, and educational backgrounds. It is critical that instructors and institutions understand the needs and interests of their learners so that curriculum and pedagogy can evolve. In this paper, we consider the curriculum preferences of online learners who are accessing asynchronous, self-selected, and self-paced training content related to entrepreneurship. The content is free and is offered in both English and Spanish. We utilize data from Google Analytics, a free application that records critical data about the demographics and behavior of online users, to test hypotheses about the influence of language and location on the preferences and engagement of learners. We find statistically significant differences between the interests and engagement of learners who access our Spanish content and those of learners who access our English content. Similarly, we find that location has a statistically significant impact on the curriculum interests and engagement of learners. Using this information, we can design curriculum that is more closely aligned with the interests of our learners and allocate resources to improve pedagogy.


Introduction
In this study, we examine the curriculum preferences and engagement of online entrepreneurship students. We consider the special case of training that is online, asynchronous, self-paced, and competency-based. Using a large, proprietary dataset, we measure the popularity of fifteen different topics among ad hoc users of a comprehensive, well-trafficked website offering free online courses on how to start a business. Each year over 420,000 learners visit the website, select from among various topics, access training on the topics, and depart. The learners represent over 100 different countries and a variety of ages and languages.
Our goal is to understand the differences in user preferences and engagement that are associated with the language of the training content and location of the learner. These variables were selected for two reasons. First, language and cultural context (often measured by location) have been identified as facilitators, and sometimes barriers, to online learning [1][2][3]. Second, language and location impact the motivation, training, and success of entrepreneurs. For example, some countries have laws that are supportive of entrepreneurs and others do not [4][5][6]. While we are fortunate to have a large sample of data about these variables, we recognize that there are other variables that are likely to impact the preferences and engagement of online learners.
With an understanding of the impact of language and location, we can develop more relevant and effective content that is aligned with the interests of our learners. For example, if we detect that a topic is very important to users from Mexico, but that the users engage with the content for a very short time, we might conclude that the content is poorly presented, out-of-date, or inapplicable to Mexican entrepreneurs. We measure user preferences by how many times a page of content is viewed per year (pageviews) and we measure engagement by the duration of time that the learners spend on a page. These data are available from Google Analytics, a service offered by Google to track and report on website traffic. The website under consideration is hosted by Santa Clara University (www.scu.edu/mobi (accessed on 1 November 2021)) and provides a variety of training options in more than 35 different topics. In 2019, the website, which was originally in English, launched a parallel website that is available in Spanish. The topics and the organization are identical and only differ in the language in which the content is presented.

Literature Review
A considerable body of research regarding the experience of e-learners has emerged over the last two decades. Valverde-Berrocoso et al. [7] and Rodrigues et al. [8] both provide a thorough review of the literature in this area. Using the categories defined by Sangrà [9], our focus in this study is on a Delivery-System-Oriented e-learning platform, i.e., a platform where the focus is on the accessibility of resources. The platform under consideration closely resembles a massive open online course (MOOC) as defined by Kaplan and Haenlein [10]. The principal difference is that in the online training program studied here students can select the curriculum content that is of most interest to them, without the mediation of an instructor or the need to follow a predefined set of topics. The factors impacting the success of online students are an area of tremendous interest to researchers, instructors, and institutions. For example, Shapiro et al. [11] examine the motivations and barriers facing online students. For MOOC students, the most prevalent motivations are knowledge, work, convenience, and personal interest, and the most common barrier is lack of time. Similarly, Henderikx et al. [12] classified barriers to achievement by MOOC students and found that technical skills, social context, course design, and support and motivation were the significant barriers to intention achievement. Two key issues related to the success of e-learning programs are the language of the curriculum content and the cultural context of the learners. Language has been found to mediate learning by facilitating the communication of ideas and concepts [13,14]. Similarly, the cultural context of students impacts online learning by defining communication style and rules of behavior [15,16]. As online educators, one of our greatest challenges is building curriculum that recognizes and addresses these mediating factors.
The impact of language and cultural context on e-learning effectiveness is not entirely clear. According to some studies, using English as a common language for massive open online courses (MOOCs) may exclude learners [17]. However, others have shown that language has little impact on the motivation of learners to complete a technical MOOC [2]. MOOC completion was more closely related to intrinsic motivation and self-determination. Alternatively, Liu et al. [18] found that cultural background and country of origin had an impact on MOOC learner behavior, most notably on course activity profiles. Gameel and Wilkins [19] provide a clear connection between cultural context and MOOC learner potential and achievement. They find that cultural context impacts digital skill capacity, self-efficacy, and locus of control, all of which contribute to the online student's potential for success. Some researchers have attempted to identify the critical success factors for multicultural learning in an online environment [20]. However, Baker et al. [1] posit that the current macro-theories of culture and context are insufficient for making generalizations about computer-based learning.
Online training of entrepreneurs from diverse backgrounds is an area of intense interest to students, instructors, and institutions. In a recent editorial on the topic, Liguori and Winkler [21] point out that prior to the pandemic, online entrepreneurship education failed to gain widespread adoption (see also [22,23]). Their paper is a "call to action" to collect data on the pedagogical innovations in online training in entrepreneurship. The activities of the My Own Business Institute (MOBI) at Santa Clara University provide access to a unique set of data regarding the interests of aspiring entrepreneurs who are learning online. This paper is the first in a series of research projects designed to understand the online training needs of entrepreneurs with different backgrounds and skills.
Online entrepreneurship training takes many forms including synchronous, asynchronous, self-paced, and blended with in-person meetings. Many studies investigate entrepreneurship curriculum in a formal educational setting like a college or high school. For example, some conclude that augmented reality and artificial intelligence are needed to simulate a real environment for college learners who are isolated because of the COVID-19 pandemic [24]. Others find that for online and blended models, the most appropriate educational technology depends on the needs of the learners and the goals of the course [25]. Few studies examine the interests and activities of individuals who are pursuing entrepreneurship training outside of a formal institutional structure. Very little work has been done on asynchronous, self-paced online learning since it does not align well with the fixed-term, face-to-face courses that historically dominate educational institutions. This model for training is more commonly used for certification and competency programs, and it is the source of the data for this analysis.
Evidence of the effectiveness of online training in entrepreneurship is in short supply, with some notable exceptions. In one study, researchers surveyed several hundred university students to compare online and in-person versions of the same entrepreneurship course and found minimal differences between the two modes of instruction [26]. In a separate study, it was determined that previous experience in entrepreneurship had no impact on the achievement of learning objectives in an asynchronous online entrepreneurship course [27]. A similar research track is focused on the effectiveness of online learning in general, not specifically on online training for entrepreneurs. Many of the factors that impact effectiveness are not related to curriculum or pedagogy. For example, researchers have found, not surprisingly, that the technology resources and support of management increased awareness and use of e-learning systems [28]. In addition, critical success factors for cloud-based e-learning solutions include ease of use, network bandwidth, data security, and technological compatibility [29]. The effectiveness of online learning is also affected by student characteristics like readiness, learning capability, motivation, and demographic variables [30,31]. While these studies are valuable for creating effective online and e-learning in technology-rich communities, many questions remain for learners in technology-poor communities.
The best entrepreneurship curriculum in terms of topics and learning objectives is another research area that is closely aligned with our study. In the research we have seen, the conclusions are similar: the best curriculum depends on the demographics, resources, and learning environment of the students. One study considers the differences between working students and non-working students and finds that the experience of these two groups impacts the level of satisfaction with the sufficiency of the training [32]. External support from the student's community is also important in creating an effective learning environment for young entrepreneurs [33]. In a particularly important study, researchers show that adaptation of the curriculum to the specific needs of the learning community is essential for effective learning [34]. This result reinforces the importance of our work because our results will inform adaptation of our online learning content.
Building on this research, we focus on two demographic variables, language and location, and their impact on the popularity and engagement of online entrepreneurship curriculum. By language, we mean the language of the content accessed by the learner, not necessarily the learner's skill in a particular language. For location, we examine eight different regions from which learners access MOBI's training content.
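Our hypotheses are:
Hypothesis 1. The interests of Spanish and English learners are the same.
Hypothesis 2. The engagement of Spanish and English learners is the same.
Hypothesis 3. The interests of learners are the same regardless of the learner's location.
Hypothesis 4. The engagement of learners is the same regardless of the learner's location.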
Our hope is that the results of this analysis will provide insight into the influence of cultural context and language on the effectiveness of online learning and, especially, online entrepreneurship training. In the next section, we provide a more detailed description of the source of the data and the demographics of the users and learners. The third section describes methods and results, and the final section covers implications and conclusions.

MOBI Curriculum and Data
The data for this study were drawn from the behavior of learners visiting the free online content available from MOBI at Santa Clara University. Visitors to MOBI's website learn in two different ways. First, learners can enroll in any of MOBI's three course offerings: Starting a Business, Business Expansion, and Quick Start Entrepreneur. These learners read pre-selected topics, complete quizzes on each topic, and complete a "final exam" that covers all the material in the course. This course content is delivered through MOBI's learning management system which is accessed by a username and password. Learner progress is tracked, and there are optional supplementary activities throughout the topics for further engagement. Learners who answer 80% of the final exam questions successfully are awarded a certificate from Santa Clara University.
The second way that learners can engage with the MOBI content is by self-selecting topics of interest. These learners are called "ad hoc" learners because they are focused on one or more topics that they access directly on MOBI's website. MOBI has 36 articles covering different aspects of starting and growing a business. To complement these articles, MOBI has videos, blogs, and success stories on its website as well. Once the pages are visited, the learners may continue to browse, sign up for a course, or may exit the website. All MOBI's content is available for free.
Our focus in this analysis is on the behavior and interest of ad hoc learners who are visiting the topics covered in MOBI's Starting a Business course. The Starting a Business course is MOBI's most popular course. It includes 15 different sessions covering topics like financing a business, business insurance, and marketing.
The management of MOBI uses Google Analytics (GA) to track the behavior of website visitors. GA is a free service that records a wide variety of information about website visitors and website pages. For visitors, GA records information like gender, age, country, device used, entry page, and session duration (the time between a user's arrival at and departure from the website). For website pages, GA records the pageviews (the number of requests by users to load a page in a browser), average time on page (the average time that a user spends on a specific page), and whether the page was an entry or exit page, among other things. This information helps managers determine which content is visited the most and which content leads to the most engagement with website visitors.
Using GA, we know that between 1 September 2020 and 31 August 2021, 374,269 learners visited the MOBI websites in 433,269 sessions. These learners engaged in 590,547 pageviews of content covering important entrepreneurship topics, blogs, videos, and new articles. About 59.3% of MOBI's ad hoc learners are female and 40.7% are male. The age distribution of the learners is shown in Figure 1.
English and Spanish are the most common primary languages of MOBI users. English is the primary language of 71% of MOBI users and Spanish is the primary language of 22% of MOBI users. Of the remaining 7%, the most common primary languages of users are French, Russian, Turkish, Hindi, and Mandarin. MOBI's website content was created and originally posted in English. In 2019, MOBI launched a parallel website entirely in Spanish.
In the year ending 31 August 2021, about 17% of the pageviews were of Spanish content and 83% were of English content. MOBI's management wants to understand the interests of the learners who prefer the Spanish content and how to ensure that this group is being served well. Interestingly, there is often a difference between the primary language of the user and the language of the content accessed by the user. For example, in the study period, there were 10,374 pageviews of Spanish content by English speakers (about 11% of total Spanish pageviews) and 13,288 pageviews of English content by Spanish speakers (about 3% of total English pageviews).

Methods
Our goals in this study are (1) to measure learner interest in content presented in English as opposed to content presented in Spanish and (2) to measure differences in the interest of learners that can be associated with location. Two metrics of website activity are directly related to these goals. First, "pageviews" indicates how many times a page was viewed during a specific period. The pageview count is not adjusted for multiple views by the same learner, so if one learner visits a page five times, five pageviews are recorded. Second, "average time on page" (ATOP) measures how long a learner spends on a specific page on the website. As with pageviews, ATOP is not adjusted for multiple views by the same learner. Taken together, the two measures provide information on the popularity of a topic (represented by a page) and the learner's engagement with the page's content.
One important shortcoming of GA is that while it reports average time on page, the application does not report variances or standard deviations of the user's time on page. Without that information, we cannot make assumptions about the distribution of the time on page and must use nonparametric methods to evaluate this metric.
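As a minimal sketch of the two metrics described in this section, the following Python snippet computes pageviews and a simplified average time on page from individual view records. The records, the page names, and the `page_metrics` helper are hypothetical and are not MOBI data. Google Analytics applies its own rules when computing average time on page (for example, in how exit pages are handled), so the snippet only illustrates that repeat views by one learner are counted separately.

```python
from collections import defaultdict

# Hypothetical view records: (learner_id, page, seconds_on_page).
# These values are invented for illustration; they are not MOBI data.
views = [
    ("u1", "communication-tools", 200),
    ("u1", "communication-tools", 100),  # repeat view by the same learner
    ("u2", "communication-tools", 300),
    ("u2", "business-plan", 90),
    ("u3", "business-plan", 150),
]


def page_metrics(records):
    """Return {page: (pageviews, average_time_on_page_in_seconds)}."""
    counts = defaultdict(int)
    seconds = defaultdict(float)
    for _learner, page, secs in records:
        counts[page] += 1      # every view counts, even repeats by one learner
        seconds[page] += secs  # time is pooled across all views of the page
    return {page: (counts[page], seconds[page] / counts[page]) for page in counts}


metrics = page_metrics(views)
for page, (pageviews, atop) in metrics.items():
    print(f"{page}: {pageviews} pageviews, ATOP {atop:.0f} s")
```

Because only the per-page mean survives this aggregation, the spread of individual viewing times is lost, which is the limitation noted above.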

H1: The Interests of Spanish and English Learners Are the Same
The first hypothesis (H1) is that the interests of learners utilizing Spanish and English content are the same. To test this hypothesis, we recorded the number of pageviews for the fifteen topics that make up MOBI's Starting a Business course (Table 1). Ad hoc learners select the topic that is most relevant to them and their business, so pageviews are a direct measure of their interest in these topics. As we can see in Table 1, the English and Spanish pageviews are very different.

H2: The Engagement of Spanish and English Learners Is the Same
Once an ad hoc learner selects a content page, ATOP provides a measure of the learner's engagement with the content. In Table 2, we present the average time on page for the 15 sessions that comprise the Starting a Business course. The overall average time on page for the entire MOBI website is 02:35, so we also indicate whether the time on the content page is above the overall average.
Not surprisingly, the amount of time engaged with the content appears to be related to the popularity of the content, as measured by pageviews. For example, learners spend more time on the Communication Tools session (8) which is also the most popular session as measured by pageviews. The less popular topics have lower average times on page and many of the lowest ranked pages in terms of pageviews are also below the overall average time on page. The average time on page is an imperfect measure of engagement because the content pages contain different quantities of text.
As mentioned previously, Google Analytics reports average time on page but does not report variances or standard deviations of the time on page. Since we do not have information about the variance of the time on page or, for that matter, the shape of the distribution of this metric, we rely on a nonparametric test to determine whether the average time on page is different for English and Spanish content. We use a Wilcoxon signed-rank test to test the null hypothesis that the average time on page for Spanish content is no different from the average time on page for English content. The alternative hypothesis is that the average time on page is higher for the Spanish content than for the English content. Using a significance level of 0.025, we can reject the null hypothesis and conclude that the average time on page is higher for Spanish content than for English content.
The rejection of H1 and H2 implies that there are underlying differences in the interests and engagement of learners who access Spanish content and learners who access English content. There are many possible reasons for this difference, one of which is the location of the learners. Different regions and countries have different entrepreneurial challenges, and these challenges may be driving the interests and engagement of the learners. To explore this possibility further, we recorded the location of the learners accessing the content during the year ending 31 August 2021. Because location is not always detected by Google Analytics, we know the location for only 91.3% of the ad hoc pageviews of Starting a Business content. Table 3 shows the total pageviews by region and by language. As one would expect, the majority of Spanish pageviews were from Central America, South America, Mexico, and the Caribbean. Given the relatively large number of Spanish speakers in the US and Canada, it is not surprising that nearly 10% of the pageviews from these countries were of Spanish content. Similarly, over 15% of the pageviews from Europe were of Spanish content.
To improve our understanding of the engagement of the learners, we recorded the average time on page for English and Spanish content by region in Table 4. We observe the same pattern regarding time on page that we saw when we examined each session individually. That is, average time on page tends to be longer for the Spanish content than for the English content (Mexico appears to be an exception). This may indicate that the translation of the Spanish content is of lower quality and difficult to comprehend, or that there are communication skill differences between the learners accessing the English versus the Spanish content.
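The one-sided Wilcoxon signed-rank comparison used for H2 in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' computation: it assumes the observations are paired by session (Spanish versus English average time on page for the same session), the function `wilcoxon_signed_rank_greater` is a hypothetical helper, and the times in seconds are invented rather than taken from Table 2.

```python
def wilcoxon_signed_rank_greater(x, y):
    """One-sided Wilcoxon signed-rank test; alternative: x tends to exceed y.

    Returns (w_plus, p_value). Zero differences are dropped, tied absolute
    differences receive average ranks, and the p-value is exact: it is found
    by counting sign assignments under the null hypothesis.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n  # twice the average rank, so all values stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # twice the mean of ranks i+1..j+1
        i = j + 1
    w2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    # Null distribution: each rank joins the positive sum with probability 1/2.
    total = sum(ranks2)
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    p_value = sum(counts[w2:]) / 2 ** n
    return w2 / 2, p_value


# Hypothetical average time on page (seconds) for ten sessions.
english_atop = [230, 205, 200, 270, 165, 190, 185, 215, 240, 150]
spanish_atop = [250, 210, 192, 300, 180, 225, 175, 260, 265, 162]

w_plus, p_value = wilcoxon_signed_rank_greater(spanish_atop, english_atop)
print(f"W+ = {w_plus}, one-sided p = {p_value:.4f}")
```

With these invented numbers the null hypothesis would be rejected at the 0.025 level. The test needs only the paired averages, which is why it remains usable when no variances are reported. Where SciPy is available, `scipy.stats.wilcoxon` offers a maintained implementation.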

H3: The Interests of Learners Are the Same Regardless of the Learner's Location
To test H3 and H4, we examined the content preferences and the engagement of students by region. To simplify the analysis, we restrict our attention to the top three sessions as measured by total pageviews. If we add English and Spanish pageviews together, the top three sessions are Communication Tools (Session 8), The Business Plan (Session 2), and Deciding on a Business (Session 1). Table 5 shows the total pageviews for these sessions identified by region. These three sessions account for 87.9% of all the pageviews of all sessions in the Starting a Business course. A chi-square test of independence allows us to reject the null hypothesis that the learner's region is independent of the learner's topical preference at a significance level of 0.01%. We reject H3, and conclude that the learner's region has an effect on the learner's interests in different topics.
The US and Canada show a pattern that is unlike the other regions. In all the regions except the US and Canada, Communication Tools is the most popular session. In the US and Canada, however, Deciding on a Business is the most popular session. In addition, interest is spread far more evenly across the top three sessions there than in other regions, where the session on Communication Tools dominates the interests of the learners.
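The chi-square test of independence applied to the region-by-session pageview counts can be sketched as follows. The contingency table below is hypothetical (three invented regions and three sessions, not the values in Table 5), and `chi_square_independence` and `chi2_sf_even` are illustrative helper names. The p-value helper uses a closed form that holds only when the degrees of freedom are even, which is the case both here and for a table of eight regions by three sessions (14 degrees of freedom).

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_sf_even(x, dof):
    """Upper-tail chi-square probability; closed form valid for even dof only."""
    if dof % 2 != 0:
        raise ValueError("this closed form requires even degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical pageviews: rows are regions, columns are the top three sessions.
pageviews = [
    [120, 60, 20],
    [80, 70, 50],
    [50, 60, 90],
]

stat, dof = chi_square_independence(pageviews)
p_value = chi2_sf_even(stat, dof)
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value:.2e}")
```

A small p-value indicates that the mix of sessions viewed differs by region, which is the sense in which H3 is rejected. Where SciPy is available, `scipy.stats.chi2_contingency` handles any table shape.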

H4: The Engagement of the Learners Is the Same Regardless of the Learner's Location
To evaluate the impact of region on the engagement of learners, we computed the average time on page for the top three sessions of the Starting a Business course by the region of the learner (Table 6). The average time on page for all the regions combined shows the most engagement with Session 8: Communication Tools and the least engagement with Session 2: The Business Plan. Learners in the US and Canada exhibit the shortest overall average time on page, while learners from Asia (excluding India) show the longest overall average time on page. Engagement with the different sessions appears to vary by region. Asia (excluding India) is most engaged with the session on Communication Tools, while India is most engaged with the session on Deciding on a Business. Since Google Analytics does not report variances for these statistics, we use a nonparametric test, Kruskal-Wallis, to test H4. The resulting chi-square test statistic has a p-value of less than 0.01, so we reject the null hypothesis that engagement is independent of the learner's region.
The differences in engagement could be the result of the background of the learners in these different regions. For example, if many of the learners have less experience in business, they might be more engaged and interested in the session on Deciding on a Business. If the learners are facing significant language barriers, as the students in Asia (excluding India) might, then they may spend more time on Communication Tools.
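A minimal sketch of the Kruskal-Wallis test used for H4 is given below. How the observations were grouped is not spelled out in this section, so the sketch assumes that each region is a group and that the observations are the average times on page for the top three sessions. The regions, the times in seconds, and the helper names (`average_ranks`, `chi2_sf`, `kruskal_wallis`) are all hypothetical. The p-value uses the usual chi-square approximation, which is rough for groups this small.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def chi2_sf(x, dof):
    """Upper-tail chi-square probability via a series for the incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = dof / 2, x / 2
    term = 1 / a
    total, denom = term, a
    for _ in range(10_000):
        denom += 1
        term *= z / denom
        total += term
        if term < total * 1e-16:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1 - lower)


def kruskal_wallis(groups):
    """Return (H, degrees_of_freedom, approximate p-value)."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * h - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n ** 3 - n)  # tie correction; equals 1 when no ties
    dof = len(groups) - 1
    return h, dof, chi2_sf(h, dof)


# Hypothetical ATOP (seconds) for three sessions in each of three regions.
region_a = [150, 175, 205]
region_b = [160, 210, 230]
region_c = [240, 250, 270]

h, dof, p_value = kruskal_wallis([region_a, region_b, region_c])
print(f"H = {h:.3f}, dof = {dof}, p = {p_value:.4f}")
```

The test works on ranks, so it needs no assumption about the shape of the time-on-page distribution. Where SciPy is available, `scipy.stats.kruskal` provides the same test.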

Discussion
In this paper, we analyzed the impact of language and location on the preferences and engagement of learners in an asynchronous, self-paced, competency-based, online course. We tested four hypotheses and rejected all four. We conclude that the language of the content (English or Spanish) and the location of the learner have an impact on the topics that are of most interest to learners and on the level of engagement with the content. While the impact of language and location is clear, the reason for this impact is not.
The impact of language on preferences and engagement may be due to environmental factors, demographic factors, or both. By environmental factors, we mean local social, economic, or political factors that impact business and entrepreneurs. For example, the topics of interest to learners accessing Spanish content might be related to the environmental conditions in the regions where Spanish is the dominant language: Central America, South America, the Caribbean, and Spain. If this is the case, then we must reconsider our curriculum and explore topics that specifically address the needs of entrepreneurs in these regions. Alternatively, language may be an important factor because of latent demographic variables. For example, education and language skills might impact engagement, particularly if the communication quality of the written Spanish content is poor. In other words, engagement might be high because it takes longer to comprehend the content, not because the content is more engaging.
The impact of location on preferences and engagement presents similar challenges for interpretation. As with language, environmental variables could be influencing the preferences and engagement with the topics. It is important and interesting to note that in all regions except North America, Communication Tools is more popular than The Business Plan or Deciding on a Business. In North America, Communication Tools is a strong third place behind the other two topics. There are two possible reasons for this difference. First, it is possible that North American learners are not as advanced as entrepreneurs from other regions. Learners from other regions may be beyond the "Deciding on a Business" stage and more interested in enhancing communication skills to manage and grow an established business. Second, it is possible that learners from other regions need to enhance their communication skills in order to access and better serve a North American market. Producers from other regions seeking to grow their business in North America would need more information about how to communicate in North America than entrepreneurs who are already there.

Conclusions
Online learning has experienced dramatic growth over the last twenty years. This growth has resulted in a variety of new modes of online training and new challenges in measuring the effectiveness of learning in a virtual environment. Understanding the factors influencing the effectiveness of different models for online training is critical for building online training programs that will achieve the learning goals of educators and students. Fortunately, the technology of online training provides new tools for measuring student interests and engagement. For open-access online programs like MOOCs, we can use tools like Google Analytics to track interests and engagement and connect these measures to demographic variables, cultural context, and technology endowment.
In this study, we utilized one of these new tools to evaluate the interests and engagement of an international cohort of hundreds of thousands of online entrepreneurship learners. This study contributes to the literature in at least three ways. First, we provide guidance on how to utilize Google Analytics to measure student interests and engagement for the purposes of improving training. Educators can use this new source of data to develop more effective online courses. Second, we provide new insight into how language impacts the interests of online entrepreneurship students. This insight adds to the work of Barak et al. [2], Altbach [17], and others. Third, we provide insight into how cultural context impacts entrepreneurship students' interests in a MOOC-like course, extending the work of Liu et al. [18], Gameel and Wilkins [19], and Gomez-Rey et al. [20].
There are several limitations to the analysis presented here. First, while it is true that language and location impact the interests and engagement of learners, other variables also influence the experience of online students. For example, demographic variables like education, age, income, and gender are potential explanatory variables for online student success. Second, this research only considers two languages for content: English and Spanish. It is likely that content presented in other languages will have an impact on student engagement in a virtual environment. Finally, due to data limitations, this research considers regions comprised of multiple countries and cultures, so one must be cautious about drawing conclusions about how cultural context impacts the experience of online entrepreneurship students.
Future research will seek to gather additional information on environmental variables, demographic variables, and the goals of the entrepreneurship learners. An interesting and important future topic is identifying the relationship between these demographic variables and online student success. We may find that the impact of language and location is related to the gender, age, and education of online students, for example. Another interesting area for future research is the relationship between the content delivery technology and student engagement. There are many different alternative delivery modes for online training, but very little research on which modes are most effective with different populations of learners. Additional research into these and other areas will make it possible for us to adjust our curriculum, improve its effectiveness, and better serve aspiring entrepreneurs in multiple languages from diverse regions.