Linguistic Landscapes on Street-Level Images

: Linguistic landscape research focuses on relationships between written languages in public spaces and the sociodemographic structure of a city. While a great deal of work has been done on the evaluation of linguistic landscapes in di ﬀ erent cities, most of the studies are based on ad-hoc interpretation of data collected from ﬁeldwork. The purpose of this paper is to develop a new methodological framework that combines computer vision and machine learning techniques for assessing the diversity of languages from street-level images. As demonstrated with an analysis of a small Chinese community in Seoul, South Korea, the proposed approach can reveal the spatiotemporal pattern of linguistic variations e ﬀ ectively and provide insights into the demographic composition as well as social changes in the neighborhood. Although the method presented in this work is at a conceptual stage, it has the potential to open new opportunities to conduct linguistic landscape research at a large scale and in a reproducible manner. It is also capable of yielding a more objective description of a linguistic landscape than arbitrary classiﬁcation and interpretation of on-site observations. The proposed approach can be a new direction for the study of linguistic landscapes that builds upon urban analytics methodology, and it will help both geographers and sociolinguists explore and understand our society.


Introduction
The increasing availability of high-resolution geospatial data and the development of advanced methods over the past decade have changed many aspects of research activities, from how we collect, process and analyze data to how we report and interpret the findings. The rapid progress in computer vision and natural language processing allows us to extract information from a variety of unstructured data, such as images, videos, and web pages. Unmanned aerial vehicles, also called drones, have become tools for acquiring precise orthophotos in a timely manner, and machine learning techniques have enabled fast and accurate object detection and classification. The role of visualization is now crucial not only in exploring large-scale data, but also in presenting results from complex analysis.
Urban analytics is a new discipline that emerged in this context. While its definition varies from scholar to scholar, it is essentially a field that studies dynamic urban processes using new forms of data and methods. The analysis of geo-tagged data from social media for identifying retail center locations and catchments [1] may be a case in point. The application of quantitative, computational, and visual methods to less conventional data sets is growing popular in the fields of geography and urban studies, helping us understand and solve various problems in cities more effectively.
This methodological paradigm shift has affected the study of linguistic landscapes to some extent. In general, linguistic landscape research focuses on relationships between written languages in public spaces and the sociodemographic structure of a city. Fieldwork is an important part of the data collection process in many previous studies of this sort: sociolinguists and ethnographers visited a place of interest, took photographs of signage and interviewed business owners, residents, and

Linguistic Landscape
Following the definition of Landry and Bourhis [5], the term linguistic landscape generally refers to all readable features that are visible to people in an area. In most of previous studies, the concept is applied to static objects, such as "public road signs, advertising billboards, street names, place names, commercial shop signs, and public signs on government buildings" (p. 25 in [5]), but nonstationary signage can also be subjects of analysis as in [6][7][8][9]. While interest in languages on signs has been there for a long time [6,10], it has received more attention than ever before in the last two decades. Numerous case studies have been conducted for places around the globe, and relationships between a linguistic landscape and various aspects of society have been investigated.
In multilingual cities, such as Brussels in Belgium and Montreal in Canada, a linguistic landscape and its geographic distribution represent the relative power and status of language groups inhabiting the region. An early survey of Tulp [11] on the languages of commercial billboards in Brussels found that the majority of them were written in French and such tendency was more prominent in the south of the city, where French-speaking people were primarily concentrated. More recent studies reported similar results [12][13][14][15], although there were noticeably more street signs in English for a variety of reasons (see [12] for a detailed discussion on this matter). Aside from the argument of whether or not this is appropriate for maintaining a bilingual environment, the dominance of French billboards in Brussels provided empirical evidence for the link between the linguistic landscape and the strengths of linguistic communities in the territory.
Linguistic landscape research on other bilingual cities also showed similarities to the situation of Brussels. Monnier [16] carried out a survey on outdoor signage in Montreal's commercial sector to monitor its compliance with Québec's language legislation and found that a large proportion of the signs were written exclusively in French. The linguistic landscape of Jerusalem also reflected the demographic composition of the city and power relations among ethnic groups: monolingual Hebrew and bilingual Hebrew-English signs were dominant in Jewish localities, where Arabic was almost invisible, while Hebrew was frequently observed as well even in areas populated by Palestinian Israelis [17,18]. The outcomes from these empirical studies and many others (e.g., the cases of European Union member states [13], Chinatown in New York [19,20], and the Māori language in New Zealand) suggest that the linguistic landscape is not merely a product of the linguistic population makeup, but it is influenced by historical, political, social, and economic motives.
Linguistic landscapes can shed light on significant societal changes, such as the progress in globalization, the increase of immigrants, and the growing importance of tourism. As society becomes more diverse in terms of ethnicity and culture through globalization and immigration, the public tolerance towards the use of foreign languages rises in general, and it leads to linguistic heterogeneity on signage even in monolingual countries. Local communities, whose economy relies on tourism, in particular, are more likely to exhibit bilingual signs, because they "prefer a language understood by potential customer even in cases where the sign writer's skill condition is only partially met" (p. 25 in [6]).
As such, a longitudinal analysis of a city's linguistic landscape can offer a useful insight into the demographic, social, and economic changes occurring in the city. However, as Backaus [21] clearly noted, this type of study requires two or more successive surveys over several years, and it usually demands an enormous amount of time and money. A great deal of the existing literature on this topic is based on field observations and quantitative analysis of pictures, ranging from 20 to several thousands, which undoubtedly need considerable efforts. Furthermore, the selection of survey sites and the categorization of signs in the current approach are mostly a subjective practice [4], so the results may be biased towards hypotheses or preconceptions [22]. As will be illustrated in the rest of the paper, the integrated use of street-level images and computer vision techniques can be a reasonable alternative for exploring linguistic landscapes. Recently, street-level images available on the Internet have become a valuable source of information in the field of urban studies, and they can be used for linguistic landscape research as well.

Street-Level Images
The widespread use of mobile devices and the advances in computing and telecommunication technologies have made it easier to collect, process and share street-level images of various locations. There is now a massive amount of crowdsourced street-level imagery available on social media, and online map service providers offer panoramic views of streets as part of their products [23]. In general, street-level images have higher spatial resolution than aerial or satellite images and show more details from the pedestrian point of view. It contains a tremendous amount of information about space and place, and the utilization of street-level images becomes increasingly popular in the field of urban studies.
The feasibility, advantages, and limitations of using street-level images have been assessed for a wide range of applications. Rundle and his colleagues [24], for example, compared the results from neighborhood audits using in-person measurements and GSV and concluded that street-level images can be a cost-effective, yet reliable method for characterizing neighborhood environments. A number of recent studies have demonstrated that GSV can also be used for the measurement of street greenery in urban areas [25][26][27]: although it may not wholly replace high-resolution satellite imagery or field surveys, GSV and other profile views of urban landscape can represent the distribution of greenery "from the same viewpoint in which citizens experience and see the urban landscape" (p. 93 in [26]). The use of street-level images has illustrated for land use classification [28,29], building classification [30], pedestrian volume estimation [31], and urban canyon mapping [32], and it has the potential to be extended to linguistic landscape research.
To utilize street-level images for the evaluation of linguistic landscapes, two computer vision techniques, namely scene text detection and recognition methods, are essential. There is extensive research being conducted on this subject in the computer science literature, and a good review is provided in [33][34][35]. In general, the identification of typeset text on scanned documents is a relatively straightforward process and achieves high recognition rates [36], especially when the language is correctly prespecified. Offline, handwritten, and multilingual texts tend to be more difficult to read, but the development of neural-network-based methods has resulted in significant progress over the last several years.
Text detection and recognition from natural scene images, such as street-level images, are even more challenging [37]. Natural scene images display many features in varying distances, and the features are not always horizontally aligned. Texts on natural scene images, also known as scene texts, are unpredictable in terms of color, font, size, and orientation, and the background is not necessarily clean. As such, the detection and recognition rates of scene texts are still considerably lower than reading words from scanned documents [38], even with state-of-the-art approaches. However, this field is currently experiencing rapid progress, and thus, the combined use of these computer vision techniques with the existing data collection and visualization methods can be a promising approach to linguistic landscape research.

Study Area
This paper focuses on the linguistic landscape of a small ethnic neighborhood called Garibong-dong in the capital city of South Korea, Seoul. Garibong-dong is one of the 424 administrative zones in Seoul and is located in the southwestern part of the city (i.e., the circled area of Figure 1). As of 2018, there are 190,060 Chinese people living in Seoul, and more than one-third of them are clustered in the two districts, namely Guro-gu and Yeongdeungpo-gu [39]. The study area of this paper, Garibong-dong, is part of Guro-gu and is home to 7332 Chinese people [40].  [30], pedestrian volume estimation [31], and urban canyon mapping [32], and it has the potential to be extended to linguistic landscape research.
To utilize street-level images for the evaluation of linguistic landscapes, two computer vision techniques, namely scene text detection and recognition methods, are essential. There is extensive research being conducted on this subject in the computer science literature, and a good review is provided in [33][34][35]. In general, the identification of typeset text on scanned documents is a relatively straightforward process and achieves high recognition rates [36], especially when the language is correctly prespecified. Offline, handwritten, and multilingual texts tend to be more difficult to read, but the development of neural-network-based methods has resulted in significant progress over the last several years.
Text detection and recognition from natural scene images, such as street-level images, are even more challenging [37]. Natural scene images display many features in varying distances, and the features are not always horizontally aligned. Texts on natural scene images, also known as scene texts, are unpredictable in terms of color, font, size, and orientation, and the background is not necessarily clean. As such, the detection and recognition rates of scene texts are still considerably lower than reading words from scanned documents [38], even with state-of-the-art approaches. However, this field is currently experiencing rapid progress, and thus, the combined use of these computer vision techniques with the existing data collection and visualization methods can be a promising approach to linguistic landscape research.

Study Area
This paper focuses on the linguistic landscape of a small ethnic neighborhood called Garibongdong in the capital city of South Korea, Seoul. Garibong-dong is one of the 424 administrative zones in Seoul and is located in the southwestern part of the city (i.e., the circled area of Figure 1). As of 2018, there are 190,060 Chinese people living in Seoul, and more than one-third of them are clustered in the two districts, namely Guro-gu and Yeongdeungpo-gu [39]. The study area of this paper, Garibong-dong, is part of Guro-gu and is home to 7332 Chinese people [40]. This paper explores the linguistic landscape of Garibong-dong, because it is one of a few places, in which the foreign population comprises an appreciable proportion in the local demographic This paper explores the linguistic landscape of Garibong-dong, because it is one of a few places, in which the foreign population comprises an appreciable proportion in the local demographic composition. Foreigners account for only 2.8% of the total population in Seoul [39], making the city virtually monolingual, so the city's linguistic landscape is mostly dominated by the Korean language [41]. It is certainly not an ideal environment for demonstrating a new methodological framework, so a subarea that exhibits at least some degree of linguistic diversity is carefully chosen instead. Garibong-dong is situated within the most noticeable ethnic community, and the linguistic landscape portrays a mixture of Korean, English, and Chinese. While the linguistic landscape of Garibong-dong may not represent that of the entire city, it can be helpful for validating the proposed approach.

Street-Level Image Collection
This paper uses street-level images available from Kakao Road View, offered by one of the most popular online map services in the country, Kakao Map. While GSV provides street-level images for South Korea, its coverage is relatively limited compared to those of local competitors, Naver Street View and Kakao Road View. The blue overlays in Figure 2a-c indicate roads, where street-level images are accessible on the three platforms, and they suggest that Kakao Road View has the most extensive coverage in the study area. Furthermore, since Kakao Map was the first to start the street-level image service in the country, it has more streetscapes captured from the earliest period, making it suitable for spatiotemporal comparisons of a linguistic landscape. composition. Foreigners account for only 2.8% of the total population in Seoul [39], making the city virtually monolingual, so the city's linguistic landscape is mostly dominated by the Korean language [41]. It is certainly not an ideal environment for demonstrating a new methodological framework, so a subarea that exhibits at least some degree of linguistic diversity is carefully chosen instead. Garibong-dong is situated within the most noticeable ethnic community, and the linguistic landscape portrays a mixture of Korean, English, and Chinese. While the linguistic landscape of Garibong-dong may not represent that of the entire city, it can be helpful for validating the proposed approach.

Street-Level Image Collection
This paper uses street-level images available from Kakao Road View, offered by one of the most popular online map services in the country, Kakao Map. While GSV provides street-level images for South Korea, its coverage is relatively limited compared to those of local competitors, Naver Street View and Kakao Road View. The blue overlays in Figure 2a-c indicate roads, where street-level images are accessible on the three platforms, and they suggest that Kakao Road View has the most extensive coverage in the study area. Furthermore, since Kakao Map was the first to start the streetlevel image service in the country, it has more streetscapes captured from the earliest period, making it suitable for spatiotemporal comparisons of a linguistic landscape. The data collection process was conducted at 709 randomly selected sampling points in the study area. Initially, a random point generation tool implemented in QGIS attempted to create 1000 sampling points along the road network, each of which should be at least 10 meters apart from the others. However, it ended up with 709 points due to the distance constraint. Of the sampling points, 43.3% (or 307 out of 709) had no images at all or duplicate views, because the points were too close to the other points. Street-level images were gathered from the remaining 402 locations, and the total number of collected scenes was 3032. At the time of this study, the application programming interface (API) for Kakao Road View allowed for the access to only the most recent view, so the images had to be collected manually.
At most of the sampling points, there were images for more than one timeframe, usually one or two per year. Each point had about 7.54 street-level images on average, resulting in more than 200 panoramic views for each year, except for 2008. The yearly distribution is given in Table 1: there seem to be notably more images taken in 2014 and 2018, so these two years, along with 2010, will be used for exploratory spatial data analysis in Section 4. It may be worth mentioning that all the street-level images were collected at a fixed resolution of 1280 × 760 pixels and that the viewing angle was determined to display as many signs and billboards as possible in a scene. The data collection process was conducted at 709 randomly selected sampling points in the study area. Initially, a random point generation tool implemented in QGIS attempted to create 1000 sampling points along the road network, each of which should be at least 10 meters apart from the others. However, it ended up with 709 points due to the distance constraint. Of the sampling points, 43.3% (or 307 out of 709) had no images at all or duplicate views, because the points were too close to the other points. Street-level images were gathered from the remaining 402 locations, and the total number of collected scenes was 3032. At the time of this study, the application programming interface (API) for Kakao Road View allowed for the access to only the most recent view, so the images had to be collected manually.
At most of the sampling points, there were images for more than one timeframe, usually one or two per year. Each point had about 7.54 street-level images on average, resulting in more than 200 panoramic views for each year, except for 2008. The yearly distribution is given in Table 1: there seem to be notably more images taken in 2014 and 2018, so these two years, along with 2010, will be used for exploratory spatial data analysis in Section 4. It may be worth mentioning that all the street-level images were collected at a fixed resolution of 1280 × 760 pixels and that the viewing angle was determined to display as many signs and billboards as possible in a scene.

Image Processing
As discussed in the previous section, the text detection and recognition from natural scenes are very challenging tasks. Although several state-of-art approaches are already implemented in open-source software, such as OpenCV [42], and major cloud computing vendors, including Amazon Web Services, Google Cloud, and Microsoft Azure, offer machine-learning-based applications, the detection and recognition rates are still unsatisfactory, especially for non-Latin characters. This difficulty is particularly problematic in the present paper, because the Korean and Chinese languages form the vast majority in the study area's linguistic landscape. To overcome this issue, a two-stage classification approach was adopted: it first applied a general feature detection method to the images for identifying all visible objects and then used a random forest algorithm for classifying them into those with and without text. By doing so, a relatively high text detection rate of 78.94% was achieved, as will be explained in this section.
The general feature detection was carried out through the Google Vision API. It identified a total of 146 unique objects from 3032 street-level images, and Table 2 shows some of the frequently observed features. Since Garibong-dong is part of a metropolitan area and has a mixture of residential and commercial land uses, relevant features, such as "neighborhood", "city", "residential area", "metropolitan area", "building", and "urban area", were commonly found ( Table 2). The frequent appearance of "street", "road", and "vehicle" might be reasonable as well, considering the fact that the images were originally taken along roads and streets. Interestingly, there were features like "advertising" and "signage" that directly indicate the presence of text in the scene. However, their numbers were relatively small (217 and 72 images, respectively, out of 3032), and many images containing signs and billboards did not include such features. To better classify the images into those with and without text (e.g., signs, billboards, and banners), a random forest algorithm was applied to the features detected. The training data set was created by manually classifying 50 images that were randomly selected. The classified groups were imbalanced in terms of size: more than two-thirds of the references (or 36 out of 50 images) had some signs or billboards, and only less than 30% displayed no text at all. Nevertheless, the algorithm with 500 random trees performed well, and the overall accuracy against the manual classification was about 78.94% (Table 3). For each image with signs or billboards, text regions were identified and marked using polygons, as shown in Figure 3. The recognition process was limited to the type of the dominant language in these polygons, not the contents written, to accelerate the task and improve its accuracy. Each of the text regions was coded into one of the following categories: 1 for Korean, 2 for English, 3 for Chinese, 4 for numbers, 5 for all other languages, and 6 for unrecognizable texts. This simplification may not be suitable for empirical studies, as it cannot distinguish among signs in different domains (e.g., public and private signs) and provides only limited insights into complex sociolinguistic phenomena. However, it was considered sufficient for demonstrating the feasibility of the proposed approach, which was the main goal of this work. To better classify the images into those with and without text (e.g., signs, billboards, and banners), a random forest algorithm was applied to the features detected. The training data set was created by manually classifying 50 images that were randomly selected. The classified groups were imbalanced in terms of size: more than two-thirds of the references (or 36 out of 50 images) had some signs or billboards, and only less than 30% displayed no text at all. Nevertheless, the algorithm with 500 random trees performed well, and the overall accuracy against the manual classification was about 78.94% (Table 3). For each image with signs or billboards, text regions were identified and marked using polygons, as shown in Figure 3. The recognition process was limited to the type of the dominant language in these polygons, not the contents written, to accelerate the task and improve its accuracy. Each of the text regions was coded into one of the following categories: 1 for Korean, 2 for English, 3 for Chinese, 4 for numbers, 5 for all other languages, and 6 for unrecognizable texts. This simplification may not be suitable for empirical studies, as it cannot distinguish among signs in different domains (e.g., public and private signs) and provides only limited insights into complex sociolinguistic phenomena. However, it was considered sufficient for demonstrating the feasibility of the proposed approach, which was the main goal of this work.  The images were first fed to the Text Detection feature in the Google Vision API, but many texts remained unrecognized or only partially recognized. The low recognition rate might be due to the presence of numerous non-Latin characters: in general, it is considered a more difficult task to read non-Latin characters, such as Koreans and Chinese, from natural scenes, because they usually have more letters and complex character shapes while there are less reference data sets for training deep neural networks [43]. As a result, the labor-intensive manual inspection of the images was necessary for the language detection, and it was the most time-consuming part of the present work. However, this is a field of very active research, and the advances in learning algorithms and the accumulation of reference data sets will eventually bring a more automatic exploration in this phase of the proposed approach.
Once the language recognition was complete, the area of polygons for each category was estimated and normalized, so that the total area in a single view remains 1. This was based on the idea that the relative area a specific language occupied in an image implied the impression one would receive when the person visited the place. In this way, a linguistic landscape can be more accurately quantified than by counting signs and billboards of arbitrary size in different languages. Table 4 presents the mean proportion of each language visible in street-level images over the years between 2008 and 2018. Korean was the most dominant language in the linguistic landscape of the study area for the entire period, and this result appears to be consistent with earlier observations. Lawrence [41], for example, conducted a survey on five commercial districts in Seoul and reported that the majority of signs was written in Korean (i.e., 66.3% in Gangnam, 71.3% in Insadong, 73.3% in Sadang, and 98.2% in Tteukseom), except for Itaewon (11.3%), where many shops are aimed at tourists and the United States military personnel. The analysis of street-level images, however, revealed an interesting temporal trend-in recent years, the proportion of Korean was decreased to a large extent, from 0.843 in 2008 to 0.778 in 2018, while the Chinese language became more visible during the same period. In 2008, Chinese signs and billboards accounted for 4.6% of the local linguistic landscape, but it was more than doubled in 2018. The highest figure was about 11.4% in 2017.  Figure 4 demonstrates one advantage of the proposed approach. The three maps showed the kernel density estimation surfaces of Chinese signs in the study area for 2010, 2014, and 2018, and they illustrated the spatiotemporal expansion of the Chinese linguistic community intuitively. This sort of visual comparison is not readily possible with field observations, because the data collection is often limited to a single year, and it requires tremendous efforts, time, and money to collect the minimum number of samples.

Results
While it is beyond the scope of this paper to investigate the underlying reasons behind the expansion of the Chinese linguistic landscape, the geographic distribution in Figure 4 provides some clues. On each map, the darker color represents the areas, in which Chinese signs are densely located, whereas the brighter color indicates the opposite. The Chinese language became more noticeable along the southern border of Garibong-dong, where ethnic restaurants and supermarkets were clustered, and this result implied that the increase was likely to be related to the recent popularity of Chinese food in the Korean society.
clues. On each map, the darker color represents the areas, in which Chinese signs are densely located, whereas the brighter color indicates the opposite. The Chinese language became more noticeable along the southern border of Garibong-dong, where ethnic restaurants and supermarkets were clustered, and this result implied that the increase was likely to be related to the recent popularity of Chinese food in the Korean society.  Figure 5 provides some supporting evidence for this hypothesis. The evidence depicted how the relative share of Chinese signs and billboards in the local linguistic landscape changed with the proportion of Chinese people. In 2018, the proportion of the Chinese population was about 7.4%, which was only slightly higher than in the preceding years, but the relative share of Chinese signage was almost twice that in the last year. The Chinese language comprised the most substantial proportion in 2017, but the population was largely unchanged, suggesting that the population structure itself cannot explain the increasing vitality of the Chinese language in this neighborhood. This is likely to be attributable to other factors, such as the growing favor towards ethnic food, although it needs further investigation to uncover the underlying sociolinguistic process.  Figure 5 provides some supporting evidence for this hypothesis. The evidence depicted how the relative share of Chinese signs and billboards in the local linguistic landscape changed with the proportion of Chinese people. In 2018, the proportion of the Chinese population was about 7.4%, which was only slightly higher than in the preceding years, but the relative share of Chinese signage was almost twice that in the last year. The Chinese language comprised the most substantial proportion in 2017, but the population was largely unchanged, suggesting that the population structure itself cannot explain the increasing vitality of the Chinese language in this neighborhood. This is likely to be attributable to other factors, such as the growing favor towards ethnic food, although it needs further investigation to uncover the underlying sociolinguistic process. expansion of the Chinese linguistic landscape, the geographic distribution in Figure 4 provides some clues. On each map, the darker color represents the areas, in which Chinese signs are densely located, whereas the brighter color indicates the opposite. The Chinese language became more noticeable along the southern border of Garibong-dong, where ethnic restaurants and supermarkets were clustered, and this result implied that the increase was likely to be related to the recent popularity of Chinese food in the Korean society.   Figure 5 provides some supporting evidence for this hypothesis. The evidence depicted how the relative share of Chinese signs and billboards in the local linguistic landscape changed with the proportion of Chinese people. In 2018, the proportion of the Chinese population was about 7.4%, which was only slightly higher than in the preceding years, but the relative share of Chinese signage was almost twice that in the last year. The Chinese language comprised the most substantial proportion in 2017, but the population was largely unchanged, suggesting that the population structure itself cannot explain the increasing vitality of the Chinese language in this neighborhood. This is likely to be attributable to other factors, such as the growing favor towards ethnic food, although it needs further investigation to uncover the underlying sociolinguistic process.

Discussion
Linguistic diversity in a neighborhood's landscape does not only reflect the demographic composition of an area, but it also mirrors the social and economic standing of minority population groups. A careful examination of a linguistic landscape can shed light on, for example, the formation and expansion of ethnic communities and offer a useful insight into the emergence of new retail centers for immigrants and tourists. Linguistic landscape studies have the potential to provide valuable information on how our urban areas are changing by the inflow of people with different backgrounds, but they require systematic comparative analysis across space and time.
The main contribution of this paper is the introduction of a new methodological framework that combines computer vision and machine learning techniques for assessing the diversity of languages from a large set of street-level images. While a great deal of work has been done on the evaluation of linguistic landscapes in different cities, most of the studies are based on ad-hoc interpretations of data collected from fieldwork. Relatively little focus has been given to the development of more data-driven and reproducible methods, and it has remained challenging to process an extensive collection of images. The proposed urban analytic approach can provide a way of exploring linguistic landscapes in a timely and consistent manner and consequently facilitate spatiotemporal comparisons of large geographic areas.
As a data-driven approach, it is capable of yielding a more objective interpretation of a linguistic landscape. Although field observations and interviews are imperative to uncover the underlying motivations and meanings behind the scene, it is essentially a subjective assessment of the information. The interpretations can be biased towards hypotheses or preconceptions, especially when they are made by inexperienced researchers. The present method gathers data from randomly selected locations and evaluates linguistic diversity based on numerical criteria, making it more robust to such bias. This can ultimately promote reproducible and transparent research in the field of the linguistic landscape.
The analysis of the Chinese community in Seoul, South Korea, has demonstrated the feasibility and advantages of the proposed approach to studying spatiotemporal variations of text in open spaces. The kernel density maps showed that the Chinese language was predominant in signage along commercial streets where ethnic restaurants were concentrated and that its spatial extent expanded over the years between 2010 and 2018. This result is plausible considering the increase of the Chinese population in the district and the growing popularity of Chinese food to local people. Although this paper limited the study area to a single administrative unit, it can be applied to several districts for comparisons or even to the entire city, as there are now massive sets of street-level images available from online map services and possibly more generated by dashboard cameras of vehicles. Further case studies on other cities using these data would be useful to validate and extend the methodological framework developed in this paper.
Even though the implementation of the proposed approach is somewhat at a conceptual level, it can be further developed into an unsupervised machine-based tool with rapid advances in computer vision techniques. Text detection and recognition from street-level images are a challenging task, because such scene texts vary in terms of size, color, font, and position. Images are often taken from nonuniform perspectives, and sometimes signs are only partially visible due to cars, pedestrians, or vegetation. In this paper, images were processed through several stages, from image classification to language recognition, but the image processing required considerable manual involvement. However, the technology has evolved fast since the advent of deep learning and artificial intelligence, and this computational progress will eventually lead to a more automatic exploration of the linguistic landscape.
Nonetheless, it should be noted that the development of this new methodological framework does not replace field observations nor in-depth interviews with local residents. It is not the intention of this paper to discount traditional ethnographic approaches, and the value of qualitative research as a means to understand complicated relationships between languages and society remains still. The present method provides an alternative way of exploring linguistic landscapes, which can be particularly useful for macroscale studies. The significance of this work comes from the fact that it bridges between linguistic landscape research and urban analytics, and this progress will benefit not only sociolinguists but also geographers and other social scientists, investigating various interrelated demographic and socioeconomic phenomena.