1. Introduction
In the contemporary tourism landscape, social media platforms have emerged as powerful forces shaping destination perception and influencing travel decisions (Hays et al., 2013; Zeng & Gerritsen, 2014). Among these platforms, Instagram, with its emphasis on visual content and vast user base, plays a particularly important role in constructing and disseminating destination images (Fatanti & Suyadnya, 2015). Users share their travel experiences through photographs and videos, collectively creating a dynamic and influential visual narrative of places (Smith, 2018). This user-generated content (UGC) often serves as a primary source of information for potential tourists, impacting their destination choices and expectations (Munar & Jacobsen, 2013).
Understanding how destinations are portrayed and perceived on Instagram is therefore critical for destination marketing organizations (DMOs) seeking to manage their brand image and attract visitors effectively (Uşaklı et al., 2017). Analyzing the types of content shared, the levels of user engagement elicited, and the spatial patterns of activity can provide invaluable insights into a destination’s perceived strengths, weaknesses, and unique appeal from the user’s perspective (Giglio et al., 2019). Comparative analyses between competing or complementary destinations can further illuminate distinct market positions and inform tailored marketing strategies (Tigre Moura et al., 2015).
Seoul and Tokyo, the capitals of South Korea and Japan, respectively, represent two of the most vibrant and popular urban tourism destinations in Asia. Both cities boast rich cultural heritage, modern infrastructure, dynamic culinary scenes, and global influence, yet they possess distinct cultural identities and appeal to different visitor segments (S. Kim et al., 2019; Toyama & Yamada, 2021). Their contrasting tourism development stages (Tokyo as an established global destination, Seoul as an emerging cultural hub) and substantial Instagram presence provide a well-suited framework for analyzing how destination maturity and cultural nuances influence social media representation and audience engagement patterns.
Seoul and Tokyo were selected for this study for their theoretical suitability in examining destination differentiation strategies within the East Asian tourism circuit (Pike & Ryan, 2004). Previous research has explored social media’s role in tourism for individual cities or specific aspects like content types or engagement (García-Palomares et al., 2015; Stepchenkova & Zhan, 2013). However, comprehensive comparative studies integrating content analysis, engagement metrics, qualitative comment analysis, and geospatial patterns between major competing destinations like Seoul and Tokyo remain scarce. Furthermore, the theoretical underpinnings connecting specific platform features, content types, engagement forms, and destination image dimensions require further exploration. This study conducts a comprehensive comparative analysis of Instagram content and engagement patterns between Seoul and Tokyo to identify distinct characteristics of each destination’s digital representation and derive actionable insights for tourism marketing strategy development.
Building on theoretical frameworks of destination image formation (Baloglu & McCleary, 1999), platform affordances (Gibson, 1979), and social media engagement patterns (Dolan et al., 2019), this research addresses the following research questions (RQs):
RQ1: What are the dominant content categories shared on Instagram for Seoul and Tokyo, and are there significant differences in their distribution?
RQ2: How do user engagement patterns (likes and comments, including sentiment and themes) differ between Seoul and Tokyo, both overall and across different content categories?
RQ3: What are the spatial patterns of Instagram activity and engagement hotspots within Seoul and Tokyo, and how do they compare?
RQ4: Based on the comparative analysis, what distinct characteristics define the Instagram presence of Seoul versus Tokyo, and what are the implications for targeted tourism marketing strategies?
Instagram’s selection as the focus platform is justified by several factors. First, Instagram’s user demographics align closely with key tourism market segments, with 62.4% of users aged 18–34 and high engagement rates among international travelers (Oberlo, 2024). Second, the platform’s visual-first design makes it particularly suitable for destination image analysis, as tourism experiences are inherently visual and experiential. Third, Instagram’s specific affordances (including location tagging, hashtag systems, and dual engagement mechanisms) provide rich data for analyzing both cognitive and affective dimensions of destination image formation. Fourth, the platform shows a higher engagement rate in the travel industry, at 2.2%, than Facebook and X, at 0.4–0.5% (Social Insider, 2024). Finally, Instagram’s global reach ensures representative coverage in both South Korea (24.3 million users, 46.2% of population) and Japan (6th globally in user base) (NapoleonCat, 2023; Statista, 2025).
Our methodology integrates advanced computational techniques, including deep learning-based content categorization (with refined categories), sentiment analysis, geospatial analysis, and predictive modeling (Gradient Boosting with SHAP value interpretation), alongside qualitative analysis of comment data. We explicitly address potential limitations, such as the lower availability of precise geotags in Tokyo, through validation analyses. By providing a sophisticated comparison grounded in empirical data, this study contributes to the theoretical understanding of destination image formation via social media and offers actionable insights for DMOs in Seoul, Tokyo, and other global cities seeking to utilize Instagram for effective tourism marketing. The findings inform strategies related to content development, engagement optimization, and location-based marketing tailored to the unique digital footprint of each destination.
3. Methodology
This study employed a mixed-methods approach, combining computational techniques for large-scale data analysis with qualitative interpretation to provide an extensive comparison of Instagram activity in Seoul and Tokyo.
3.1. Data Collection
We collected public Instagram posts associated with Seoul and Tokyo. Instagram photos with the hashtags “#Seoul” and “#Tokyo” were collected using the Python module Instaloader (version 4.9.6). The collected photos were posted between March 2018 and November 2019, and collection took place from 10 December 2019 to 12 December 2019. Data collection accessed Instagram’s publicly available content through standard protocols in compliance with platform terms of service. Posts were selected based on location criteria: either precise geotags (latitude and longitude coordinates) within Seoul Metropolitan City or Tokyo Metropolis administrative boundaries, or location tags explicitly mentioning Seoul, Tokyo, or their well-known districts and landmarks.
To ensure geographical representativeness and mitigate potential biases towards heavily touristed areas, we employed a stratified approach during collection, monitoring the distribution of posts across different administrative districts (Gu in Seoul, Ku in Tokyo). Only publicly accessible posts were included. For each post, we collected the following information: post ID, timestamp, user ID (anonymized), image/video content (via URL), caption text, number of likes, number of comments, location tag (if available), and geotag coordinates (if available). The final dataset comprised 59,944 posts, with 29,985 from Seoul and 29,959 from Tokyo. All data was anonymized at the source or during processing to protect user privacy.
3.2. Data Processing
3.2.1. Content Categorization
We categorized each Instagram post based on its primary visual content using a pre-trained convolutional neural network (CNN) model built on the ResNet-50 architecture (He et al., 2016). The model was fine-tuned on a manually labeled dataset of 5000 Instagram posts from various Asian cities, including Seoul and Tokyo, to improve its accuracy for the specific context. Initial content categorization using the pre-trained CNN model resulted in six primary categories: Person, Food, Animal, Landmark/Architecture, Urban Objects/Scenes, and Other. The ‘Other’ category initially comprised 18.3% of Seoul posts and 15.7% of Tokyo posts, representing content that did not clearly fit the primary categories. Upon manual inspection of a random sample of 500 ‘Other’ posts from each city, we identified two distinct subcategories that warranted separate classification: ‘Landmark/Architecture’ (including traditional buildings, modern architecture, monuments, and iconic structures) and ‘Urban Objects/Scenes’ (including street art, transportation, urban landscapes, and miscellaneous city elements). This refinement process involved re-training the CNN model with additional labeled examples from these subcategories, resulting in improved classification accuracy (from 87.2% to 92.1% overall accuracy) and more meaningful content analysis. The refined categorization reduced the residual ‘Other’ category to 3.1% of Seoul posts and 2.8% of Tokyo posts, representing truly ambiguous or uncategorizable content. Following this refinement, which drew on preliminary analysis and expert feedback regarding the ambiguity of the ‘Other’ category, the final categorization system comprised five main categories:
Food: Posts primarily featuring food items, meals, or beverages.
Person: Posts primarily featuring people or human subjects.
Animal: Posts primarily featuring animals or pets.
Landmark/Architecture: Posts primarily featuring recognizable landmarks, buildings, cityscapes, or architectural details. (Subcategory of former ‘Other’).
Urban Objects/Scenes: Posts primarily featuring non-landmark objects, street scenes, transportation, or abstract visuals not fitting other categories. (Subcategory of former ‘Other’).
The fine-tuned CNN model achieved an overall classification accuracy of 91.5% on a held-out test set. Precision and recall values exceeded 0.88 for all five categories. To ensure reliability, two independent coders manually reviewed a random sample of 1000 posts (500 from each city) from the final dataset, achieving an inter-coder reliability (Cohen’s Kappa) of 0.86, indicating substantial agreement.
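The inter-coder reliability check described above can be illustrated with a minimal sketch of Cohen’s Kappa. The function name `cohens_kappa` and the toy labels are assumptions introduced for illustration only; they are not the study’s code or data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels, not data from the study.
coder_1 = ["Person", "Food", "Food", "Animal", "Person", "Food"]
coder_2 = ["Person", "Food", "Person", "Animal", "Person", "Food"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually lower than the simple percentage of matching labels.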
3.2.2. Engagement Metrics
We extracted two primary quantitative engagement metrics for each post: likes, representing the total number of likes received, and comments, capturing the total number of comments received. These metrics were collected at least two weeks after the posting date to allow sufficient time for engagement accumulation. Posts exhibiting abnormally high engagement (more than 3 standard deviations above the mean within their category and city) were flagged and manually inspected for potential bot activity or artificial inflation. No posts were removed based on this inspection, as engagement levels, while high for some influencer posts, appeared organic within the platform’s context.
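The flagging rule described above (more than 3 standard deviations above the mean within a category and city) can be sketched as follows. The dictionary keys (`id`, `city`, `category`, `likes`) and the function name are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_high_engagement(posts, threshold_sd=3.0):
    """Return IDs of posts whose likes exceed mean + threshold_sd * SD
    within their (city, category) group."""
    groups = defaultdict(list)
    for post in posts:
        groups[(post["city"], post["category"])].append(post)

    flagged = []
    for group in groups.values():
        likes = [p["likes"] for p in group]
        if len(likes) < 2:
            continue  # a standard deviation needs at least two values
        cutoff = mean(likes) + threshold_sd * stdev(likes)
        flagged.extend(p["id"] for p in group if p["likes"] > cutoff)
    return flagged


# Hypothetical toy data: 19 ordinary posts and one very popular post.
posts = [
    {"id": i, "city": "Seoul", "category": "Food", "likes": 10}
    for i in range(19)
]
posts.append({"id": 19, "city": "Seoul", "category": "Food", "likes": 1000})
flagged_ids = flag_high_engagement(posts)
print(flagged_ids)
```

Flagged posts are only candidates for manual inspection; as in the study, they need not be removed.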
3.2.3. Qualitative Comment Analysis
Recognizing that comment counts alone do not capture the nature of engagement, we conducted a qualitative analysis on a subset of comments. For posts with comments, we collected the text of the first 10 comments (or all comments if fewer than 10 existed). We then randomly sampled 5000 comments (2500 from Seoul posts, 2500 from Tokyo posts) for analysis. We applied the VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis tool (Hutto & Gilbert, 2014), specifically designed for social media text, to assign a compound sentiment score (ranging from −1 for most negative to +1 for most positive) to each comment. We also performed thematic analysis on this comment subset using latent Dirichlet allocation (LDA) topic modeling (Blei et al., 2003) combined with manual refinement to identify recurring themes in user discussions related to posts from each city.
3.2.4. Geospatial Data Processing
Posts with precise geolocation data (latitude and longitude) were validated against administrative boundary shapefiles for Seoul and Tokyo. Posts with coordinates falling outside the official city boundaries were excluded from spatial analyses requiring precise locations. For posts with only general location tags (e.g., ‘Myeongdong’, ‘Shibuya Crossing’), we assigned approximate coordinates based on the centroid of the named location polygon obtained from OpenStreetMap data. We noted a significant disparity in the availability of precise geotags: 40.00% for Seoul posts versus only 7.00% for Tokyo posts. To address the potential bias this introduces in spatial analyses, particularly for Tokyo, we conducted a validation analysis. We created balanced samples by randomly selecting 2000 geotagged posts from Seoul and all available geotagged posts from Tokyo (approximately 2100), supplemented by randomly selecting posts with general location tags assigned centroid coordinates until reaching 2000 posts for Tokyo. We then replicated key spatial analyses (KDE, Moran’s I) on these balanced samples to check for consistency with results from the full dataset. All geospatial data processing used the GeoPandas library in Python (Jordahl et al., 2020). Data were projected into appropriate local coordinate reference systems (EPSG:5179 for Seoul, EPSG:6677 for Tokyo) for accurate distance and density calculations.
3.3. Analytical Methods
3.3.1. Descriptive Statistics
We calculated basic descriptive statistics for both cities, including total post counts, distribution across the five content categories, and summary statistics (mean, median, standard deviation) for likes and comments.
3.3.2. Statistical Hypothesis Testing
To compare distributions and means between the two cities (RQ1, RQ2), we employed several statistical methods: the Chi-square test of independence to examine differences in content category distribution; independent samples t-tests to compare mean likes and comments, both overall and within each content category; Mann–Whitney U tests as a non-parametric alternative to t-tests for sentiment scores, which often violate normality assumptions; and effect sizes, with Cramér’s V for Chi-square tests and Cohen’s d for t-tests (or the equivalent rank–biserial correlation for Mann–Whitney U), to quantify the magnitude of observed differences. We set the significance level (alpha) at 0.05. The Benjamini–Hochberg procedure (Benjamini & Hochberg, 1995) was applied to control the false discovery rate for multiple comparisons.
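As an illustration of the Benjamini–Hochberg step-up procedure named above, the following minimal sketch shows how a set of p-values is screened at a false discovery rate of 0.05. The function name and the example p-values are hypothetical and are not results from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at `alpha`."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from five comparisons.
p_values = [0.039, 0.001, 0.20, 0.008, 0.041]
decisions = benjamini_hochberg(p_values)
print(decisions)
```

In this toy example, the p-values 0.039 and 0.041 fall below the unadjusted 0.05 level but are not rejected, because their rank-specific thresholds (0.03 and 0.04) are stricter.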
3.3.3. Geospatial Analysis
To analyze spatial patterns (RQ3), we used several complementary methods: Kernel Density Estimation (KDE) to visualize hotspots of overall Instagram activity and category-specific activity, using an adaptive bandwidth so that the degree of smoothing varies with local point density; the Getis-Ord Gi* statistic to identify statistically significant hotspots and coldspots of high/low engagement posts (Getis & Ord, 1992); and spatial autocorrelation (Global Moran’s I) to assess the overall degree of spatial clustering of high-engagement posts (Moran, 1950). These analyses were performed using the PySAL library (Rey & Anselin, 2007) and visualized using Matplotlib (version 3.1.2) (Hunter, 2007) and Contextily (version 2.1.0).
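To make the Global Moran’s I statistic concrete, the sketch below computes it directly from its definition on a toy example. It is a plain-Python illustration, not the PySAL implementation used in the study; the function name, the binary adjacency weights, and the engagement values are all hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` observed at n locations, given an
    n x n spatial weights matrix (weights[i][j] > 0 for neighbours)."""
    n = len(values)
    mean_value = sum(values) / n
    deviations = [v - mean_value for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between neighbouring locations.
    cross_products = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    sum_squares = sum(d * d for d in deviations)
    return (n / total_weight) * cross_products / sum_squares


# Four hypothetical locations along a street; neighbours share an edge.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([10, 10, 1, 1], adjacency)    # similar values adjacent
alternating = morans_i([10, 1, 10, 1], adjacency)  # dissimilar values adjacent
print(round(clustered, 3), round(alternating, 3))
```

Positive values indicate that similar engagement levels tend to be located near each other, while negative values indicate that high and low values alternate in space.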
3.3.4. Predictive Modeling
To identify factors influencing engagement (likes) and explore their relative importance in each city, we developed predictive models using Gradient Boosting Machines (Friedman, 2001), a powerful ensemble learning technique. Features included content category, posting time (hour, day of week), caption length, hashtag count, and flags indicating the presence of people or food (derived from content categorization). We used SHAP (SHapley Additive exPlanations) values (Lundberg & Lee, 2017) to interpret the models and identify the most influential features driving engagement in Seoul versus Tokyo. SHAP values provide a robust way to understand the contribution of each feature to the prediction for individual posts and overall.
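The idea behind SHAP values can be illustrated with an exact Shapley value computation on a toy model. This is a conceptual sketch only: it enumerates all feature orderings, which is feasible for three features but is not how the SHAP library handles gradient-boosted trees. The toy prediction function, its coefficients, and the feature names are hypothetical.

```python
from itertools import permutations


def shapley_values(feature_names, value_fn):
    """Exact Shapley values: average each feature's marginal contribution
    over all possible orderings in which features can be added."""
    totals = {name: 0.0 for name in feature_names}
    orderings = list(permutations(feature_names))
    for ordering in orderings:
        present = frozenset()
        for name in ordering:
            with_feature = present | {name}
            totals[name] += value_fn(with_feature) - value_fn(present)
            present = with_feature
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_predicted_likes(present):
    """Hypothetical model: predicted likes given which features are present."""
    likes = 10.0  # baseline prediction with no features present
    if "person" in present:
        likes += 8.0
    if "hashtags" in present:
        likes += 4.0
    if "person" in present and "hashtags" in present:
        likes += 2.0  # interaction effect, split evenly by Shapley values
    return likes


features = ["person", "hashtags", "evening"]
contributions = shapley_values(features, toy_predicted_likes)
print(contributions)
```

The contributions sum to the difference between the full prediction and the baseline, which is the additivity property that makes per-feature attributions comparable across posts.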
3.3.5. Comparative Framework
We synthesized the results from all analyses into a multi-dimensional comparative framework (RQ4) addressing several key dimensions: Content Dimension, comparing distribution and qualitative nuances of content categories; Engagement Dimension, comparing quantitative metrics (likes, comments) and qualitative aspects (sentiment, themes); Spatial Dimension, comparing geographical distributions, hotspots, and clustering patterns; and Predictive Dimension, comparing key drivers of engagement identified through modeling. This integrated approach facilitated a holistic comparison informing distinct marketing strategies.
4. Results
This section presents the findings from the comparative analysis of Instagram data from Seoul and Tokyo.
4.1. Descriptive Statistics
The dataset included 29,985 posts from Seoul and 29,959 from Tokyo.
Table 1 summarizes key descriptive statistics.
As shown in Table 1, the number of collected posts was comparable between the cities. A stark difference existed in the availability of precise geotags, with Seoul having a substantially higher percentage (40.00%) than Tokyo (7.00%). Tokyo posts garnered significantly higher mean and median likes compared to Seoul posts. Mean comments were slightly higher in Seoul, although median comments were zero for both, indicating a skewed distribution where most posts receive no comments.
4.2. Content Category Distribution
Figure 1 illustrates the distribution of posts across the five refined content categories for both cities. The ‘Person’ category was dominant in both locations. The refinement of the ‘Other’ category revealed ‘Landmark/Architecture’ as a substantial category, slightly more prevalent in Seoul, while ‘Urban Objects/Scenes’ constituted a smaller portion.
Figure 1. Percentage distribution of posts across five content categories (Person, Food, Animal, Landmark/Architecture, Urban Objects/Scenes) for Seoul and Tokyo.
A Chi-square test confirmed a statistically significant difference in the distribution of content categories between Seoul and Tokyo (χ²(4) = 148.92, p < 0.001, Cramér’s V = 0.05). While the effect size is small, the large sample size makes the difference statistically significant.
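The reported effect size can be checked from the Chi-square statistic and the sample size using the standard formula V = sqrt(χ² / (N · (min(r, c) − 1))). The sketch below applies it to the values reported in the text, with a 2 × 5 table (two cities, five categories); the function name is an assumption introduced for illustration.

```python
import math


def cramers_v(chi_square, n_observations, n_rows, n_cols):
    """Cramér's V effect size for a Chi-square test of independence."""
    smaller_dimension = min(n_rows, n_cols) - 1
    return math.sqrt(chi_square / (n_observations * smaller_dimension))


n_posts = 29985 + 29959  # Seoul + Tokyo posts reported in the dataset
degrees_of_freedom = (2 - 1) * (5 - 1)
effect_size = cramers_v(148.92, n_posts, n_rows=2, n_cols=5)
print(degrees_of_freedom, round(effect_size, 2))  # 4 0.05
```

With nearly 60,000 posts, even a Chi-square value of about 149 corresponds to a small association, which is why the result is statistically significant but modest in magnitude.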
Table 2 presents the detailed distribution and Chi-square contribution for each category.
Table 2 shows that while ‘Person’ content proportions were nearly identical, Tokyo had a significantly higher proportion of ‘Food’ posts, whereas Seoul had higher proportions of ‘Landmark/Architecture’ and ‘Urban Objects/Scenes’ posts. ‘Animal’ posts remained a small fraction in both cities.
4.3. Engagement Analysis
4.3.1. Overall Engagement Comparison
Independent samples t-tests (Table 3) confirmed that Tokyo posts received significantly more likes on average than Seoul posts (t(59,942) = −42.63, p < 0.001), with a small-to-medium effect size (Cohen’s d = 0.35). The difference in mean comments, though statistically significant due to the large sample size (t(59,942) = 3.15, p = 0.002), was minimal in practical terms (0.72 vs. 0.68) with a negligible effect size (Cohen’s d = 0.03).
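The relationship between the reported t statistics and effect sizes can be verified with the identity d = |t| · sqrt(1/n₁ + 1/n₂), which holds for a pooled-variance independent samples t-test. That the tests used pooled variances is an assumption here, although it is consistent with the reported degrees of freedom (n₁ + n₂ − 2 = 59,942). The function name is hypothetical.

```python
import math


def cohens_d_from_t(t_statistic, n_first, n_second):
    """Magnitude of Cohen's d implied by a pooled-variance two-sample t."""
    return abs(t_statistic) * math.sqrt(1 / n_first + 1 / n_second)


n_seoul, n_tokyo = 29985, 29959
d_likes = cohens_d_from_t(-42.63, n_seoul, n_tokyo)
d_comments = cohens_d_from_t(3.15, n_seoul, n_tokyo)
print(round(d_likes, 2), round(d_comments, 2))  # 0.35 0.03
```

Both values match the effect sizes reported in the text, illustrating how a very large sample can make a negligible difference in comments (d = 0.03) statistically significant.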
4.3.2. Engagement by Content Category
Table 4 details the comparison of likes and comments within each content category. Tokyo posts consistently received significantly more likes across all five categories (p < 0.001). The largest effect size for likes was observed in the ‘Animal’ category (d = 1.23, very large effect), followed by ‘Food’ (d = 0.44, medium) and ‘Landmark/Architecture’ (d = 0.40, medium). ‘Person’ and ‘Urban Objects/Scenes’ showed smaller effect sizes for likes (d = 0.33 and d = 0.31, respectively). For comments, the pattern was mixed. Seoul posts received significantly more comments for ‘Person’, ‘Animal’, and ‘Urban Objects/Scenes’ categories, although effect sizes were negligible to small (d = 0.08, 0.33, and 0.05, respectively). No significant difference in comments was found for ‘Food’ or ‘Landmark/Architecture’ categories.
4.4. Qualitative Comment Analysis
Sentiment analysis using VADER revealed that comments on Seoul posts had a slightly, but statistically significantly, higher average sentiment score (Mean = 0.28, SD = 0.35) compared to comments on Tokyo posts (Mean = 0.25, SD = 0.33; Mann–Whitney U = 7,654,321, p < 0.001, r = 0.04). While the effect size is small, this suggests comments related to Seoul content tended to express slightly more positive sentiment.
Thematic analysis (LDA) identified distinct themes in comments. For Seoul comments, prominent themes included expressions of excitement (“so beautiful!”, “I need to visit!”), personal connections (“I was there last year”), specific questions about locations (“Where exactly is this?”), and emotional reactions (“This makes me miss Seoul so much”). Comments often contained longer narratives and personal stories. For Tokyo comments, dominant themes included brief appreciative remarks (“nice”, “beautiful”, “amazing”), aesthetic observations (“perfect composition”, “great colors”), questions about photography techniques (“what camera did you use?”), and food appreciation (“looks delicious”). Tokyo comments tended to be shorter and more focused on the visual aspects of posts.
Qualitative coding of a subset of 500 comments revealed that Seoul-related comments more frequently contained emotional language (37% vs. 24% for Tokyo), questions seeking information (28% vs. 19%), and personal narratives (25% vs. 16%). Tokyo comments more frequently contained brief appreciative phrases (45% vs. 32% for Seoul) and technical observations about photography or aesthetics (22% vs. 14%).
These findings suggest different modes of engagement with content from each city. Seoul content appears to elicit more emotionally involved, conversational responses, while Tokyo content generates more aesthetic appreciation and brief acknowledgment. This pattern aligns with the quantitative finding that Seoul posts receive slightly more comments on average, particularly for ‘Person’ and ‘Animal’ categories, which often evoke more personal or emotional responses.
4.5. Geospatial Analysis
4.5.1. Hotspot Identification
Kernel Density Estimation (KDE) with adaptive bandwidth identified distinct patterns of Instagram activity concentration in both cities.
Figure 2 presents the density heatmaps for Seoul and Tokyo, with consistent scale and zoom levels to facilitate comparison.
Figure 2. Heat maps showing the concentration of Instagram posts across urban areas in Seoul and Tokyo, with red indicating the highest density areas.
In Seoul, five major hotspots were identified (in descending order of post density): 1. Gangnam (particularly around Gangnam Station and Garosu-gil); 2. Hongdae (university district with vibrant nightlife); 3. Myeongdong (shopping district); 4. Itaewon (international district); 5. Jamsil (location of Lotte World and Olympic Park). In Tokyo, three major hotspots emerged with particularly high concentration: 1. Shibuya (especially around Shibuya Crossing); 2. Shinjuku (entertainment and business district); 3. Ginza (upscale shopping district). Two additional areas showed moderate concentration: 4. Harajuku (fashion district); 5. Roppongi (nightlife district). The validation analysis using balanced samples (2000 posts from each city) confirmed these spatial patterns, with the same major hotspots identified in both cities despite Tokyo’s lower proportion of geotagged data. This suggests that while the available geotagged data for Tokyo represents a smaller proportion of total posts, it still provides a reliable representation of spatial activity patterns.
4.5.2. Category-Specific Spatial Distribution
Further analysis examined the spatial distribution of posts by content category.
Figure 3 presents the category-specific distribution maps for both cities.
Figure 3. Spatial distribution of posts by content category (Food, Person, Landmark/Architecture, Urban Objects/Scenes) in both cities.
In Seoul, ‘Food’ posts were concentrated in Gangnam, Hongdae, and Itaewon, while ‘Person’ posts were more evenly distributed across all hotspots. ‘Landmark/Architecture’ posts showed clear clustering around tourist attractions such as Gyeongbokgung Palace, Namsan Tower, and the Han River. ‘Urban Objects/Scenes’ posts were more dispersed but showed some concentration in trendy neighborhoods like Seongsu and Euljiro. In Tokyo, ‘Food’ posts showed high concentration in Ginza and Shinjuku, while ‘Person’ posts were particularly dense in Shibuya and Harajuku. ‘Landmark/Architecture’ posts clustered around major landmarks such as Tokyo Tower, Senso-ji Temple, and the Imperial Palace. ‘Urban Objects/Scenes’ posts showed notable concentration in areas known for unique street scenes and urban aesthetics, such as Akihabara and parts of Shinjuku.
Spatial autocorrelation analysis using Moran’s I revealed significant clustering of high-engagement posts in both cities (Seoul: I = 0.42, p < 0.001; Tokyo: I = 0.38, p < 0.001), indicating that posts in certain areas consistently received higher engagement than posts in other areas.
4.6. Predictive Modeling Results
Gradient Boosting models were developed to predict likes for posts in each city. The models achieved moderate predictive power (Seoul: R² = 0.38; Tokyo: R² = 0.42), indicating that the features captured meaningful but not exhaustive determinants of engagement. SHAP value analysis revealed distinct patterns of feature importance between the two cities, as visualized in Figure 4.
Figure 4. Visualization of feature importance in predicting post engagement, with Seoul (left) showing hashtag count as the dominant predictor and Tokyo (right) showing person presence as the dominant predictor.
For Seoul, the most influential features driving likes (in descending order of importance) were as follows: 1. Hashtag count (positive relationship); 2. Person presence (positive relationship); 3. Posting during evening hours (positive relationship); 4. Landmark/Architecture content (negative relationship compared to other categories); 5. Caption length (moderate negative relationship). For Tokyo, the most influential features were as follows: 1. Person presence (strong positive relationship); 2. Animal presence (strong positive relationship); 3. Posting during weekend days (positive relationship); 4. Food content (positive relationship); 5. Hashtag count (moderate positive relationship).
These findings highlight different drivers of engagement in each city. In Seoul, strategic use of hashtags appears to be the strongest predictor of engagement, suggesting the importance of discoverability and narrative framing. In Tokyo, the presence of people in images is the dominant predictor, indicating the importance of human subjects for generating engagement.
5. Discussion
This study examined Instagram content and engagement patterns in Seoul and Tokyo to identify significant differences that could inform tourism marketing strategies. The findings revealed both similarities and differences in content distribution, engagement metrics, comment characteristics, and spatial patterns between the two cities. This section discusses the implications of these findings in the context of existing literature and their practical applications for tourism marketing.
5.1. Content Distribution Patterns
The similar prominence of ‘Person’ content in both Seoul (37.4%) and Tokyo (37.3%) aligns with previous research highlighting the importance of human subjects in tourism photography (Dinhopl & Gretzel, 2016; Smith et al., 2021). This consistency reflects the social nature of Instagram as a platform where users share personal experiences and self-representation at destinations (Lo & McKercher, 2015). The prevalence of people-centered content suggests that both cities serve as backdrops for personal narratives and identity construction through social media. However, the significant Chi-square result (χ²(4) = 148.92, p < 0.001) indicates meaningful differences in the overall distribution. Tokyo’s higher proportion of food-related content (31.9% vs. 29.1% in Seoul) suggests that culinary experiences play a more prominent role in Tokyo’s Instagram representation. This finding is consistent with research by Toyama and Yamada (2022), who found that Japanese food culture is a significant driver of tourism interest. The refinement of the ‘Other’ category into ‘Landmark/Architecture’ and ‘Urban Objects/Scenes’ provided valuable additional insights. The higher proportion of ‘Landmark/Architecture’ content in Seoul (21.5% vs. 19.8% in Tokyo) suggests that traditional and modern architectural attractions may play a more central role in Seoul’s visual identity on Instagram. This aligns with Seoul’s tourism development strategies, which have emphasized iconic architecture and cultural heritage sites (S. Kim et al., 2019).
5.2. Engagement Patterns
The significantly higher like counts for Tokyo posts across all categories (p < 0.001, Cohen’s d = 0.35 overall) represent one of the most striking findings of this study. This pattern was particularly pronounced for ‘Animal’ content, where Tokyo posts received more than twice as many likes on average as Seoul posts (48.9 vs. 20.7, d = 1.23). These differences suggest that Tokyo content may resonate more strongly with Instagram audiences, potentially due to broader international recognition, more effective visual storytelling, or larger follower networks. The finding that Seoul posts received slightly more comments on average than Tokyo posts in most categories, despite receiving fewer likes, is intriguing. This pattern suggests different modes of audience engagement with content from each city. As noted by Mariani et al. (2018), comments represent a more active form of engagement than likes, requiring greater effort and involvement from users. The higher comment rates for Seoul content, particularly for ‘Person’ and ‘Animal’ categories, may indicate that these posts generate more conversation or provoke more questions.
These distinct engagement patterns reflect deeper cultural differences in digital communication and social interaction. Seoul’s hashtag-driven engagement aligns with Korea’s participatory digital culture, where social media serves as a platform for trend participation and active dialogue. This manifests in higher comment rates and information-seeking behavior, reflecting Korean communication styles that emphasize relationship-building through conversation and emotional expression. Conversely, Tokyo’s people-centric content and like-dominant engagement reflect Japanese cultural values of aesthetic appreciation (mono no aware) and authentic human connection (omotenashi). The emphasis on visual aesthetics and subtle appreciation through likes, rather than extensive commentary, aligns with Japanese preferences for non-intrusive interaction and social harmony. This suggests that Tokyo’s Instagram representation prioritizes experiential authenticity and aesthetic contemplation over participatory dialogue.
The qualitative comment analysis further illuminates these differences. Comments on Seoul posts exhibited higher average sentiment scores and more frequently contained emotional language, questions seeking information, and personal narratives. This suggests that Seoul content may evoke stronger emotional responses and curiosity, prompting users to share their own experiences or seek additional information. In contrast, Tokyo comments more frequently contained brief appreciative phrases and technical observations about photography or aesthetics, suggesting a more passive, aesthetic-focused mode of engagement. These engagement differences can be interpreted through the lens of
Baloglu and McCleary’s (
1999) cognitive–affective model of destination image. The higher frequency of information-seeking questions in Seoul comments suggests more cognitive processing, while the stronger emotional language indicates affective engagement. Tokyo’s higher like counts but shorter, more aesthetically focused comments might indicate strong immediate affective responses to visual appeal, but potentially less cognitive elaboration.
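The contrast between information-seeking comments and brief appreciative ones can be illustrated with two simple indicators: the share of comments containing a question and the mean comment length in words. The sketch below is only an illustration under stated assumptions. The function `comment_profile` and the toy comments are hypothetical and are not the study's coding scheme, sentiment method, or data.

```python
from statistics import mean


def comment_profile(comments):
    """Summarize a list of comment strings with two rough indicators.

    question_share: fraction of comments containing a question mark,
                    used here as a crude proxy for information seeking.
    mean_words:     average number of whitespace-separated words.
    """
    if not comments:
        raise ValueError("need at least one comment")
    question_share = sum("?" in c for c in comments) / len(comments)
    mean_words = mean(len(c.split()) for c in comments)
    return {"question_share": question_share, "mean_words": mean_words}


# Hypothetical toy comments, invented for illustration only.
seoul_like = [
    "Where is this cafe?",
    "I went there last spring and loved it!",
    "Which station is closest?",
]
tokyo_like = ["Beautiful shot", "Great light", "Stunning"]

seoul_profile = comment_profile(seoul_like)
tokyo_profile = comment_profile(tokyo_like)
```

A question mark is a coarse proxy; a fuller analysis would need language-aware parsing and manual validation, which this sketch does not attempt.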
5.3. Spatial Patterns
The geospatial analysis revealed both similarities and differences in the spatial distribution of Instagram activity between Seoul and Tokyo. Both cities showed clear clustering around major commercial, entertainment, and tourist districts, consistent with findings from previous studies on urban tourism and social media (
Giglio et al., 2019;
Li et al., 2021). However, the more dispersed pattern in Seoul (five major hotspots of similar intensity) compared to the intense concentration in fewer areas in Tokyo (three major hotspots) suggests different spatial structures of tourism activity.
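The difference between a dispersed and a concentrated hotspot structure can be made concrete with a simple concentration measure: the share of geotagged posts that fall in the k busiest grid cells. The sketch below uses plain grid binning as a stand-in; it is not the hotspot method used in the study, and the cell size, function names, and coordinates are all illustrative assumptions.

```python
from collections import Counter
from math import floor


def grid_counts(points, cell_deg=0.01):
    """Count (lat, lon) points per square grid cell of side cell_deg degrees."""
    counts = Counter()
    for lat, lon in points:
        counts[(floor(lat / cell_deg), floor(lon / cell_deg))] += 1
    return counts


def top_k_share(points, k=3, cell_deg=0.01):
    """Fraction of all points that fall in the k most populated cells."""
    if not points:
        raise ValueError("need at least one point")
    counts = grid_counts(points, cell_deg)
    return sum(n for _, n in counts.most_common(k)) / len(points)


# Hypothetical toy coordinates: five equally busy cells ...
dispersed = [(37.505 + 0.02 * i, 127.005) for i in range(5) for _ in range(4)]
# ... versus three dominant cells plus two stray posts.
concentrated = [(35.655 + 0.02 * i, 139.705) for i in range(3) for _ in range(6)]
concentrated += [(35.755, 139.705), (35.775, 139.705)]

dispersed_share = top_k_share(dispersed, k=3)
concentrated_share = top_k_share(concentrated, k=3)
```

A higher top-k share indicates that activity is packed into fewer areas. The result depends on the chosen cell size, so any real comparison would need to check sensitivity to that parameter.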
The identified hotspots in both cities largely correspond to well-established tourist areas and attractions. In Seoul, the prominence of Gangnam, Hongdae, and Myeongdong aligns with these areas’ popularity in both official tourism promotion and previous research on visitor preferences (
H. Kim & Stepchenkova, 2015). Similarly, in Tokyo, the dominance of Shibuya, Shinjuku, and Ginza reflects these districts’ status as iconic urban spaces that feature prominently in tourism imagery (
Toyama & Yamada, 2022).
The category-specific spatial patterns provide additional insights into how different types of tourism experiences are distributed across each city. The concentration of food posts in specific districts (Gangnam, Hongdae, and Itaewon in Seoul; Ginza and Shinjuku in Tokyo) identifies these areas as culinary hotspots that could be emphasized in food tourism promotion. Similarly, the concentration of ‘Landmark/Architecture’ posts around major landmarks confirms these sites’ importance in the visual representation of each city.
The substantial difference in the proportion of posts with precise location data is noteworthy and may reflect different privacy preferences or platform usage patterns. However, our validation analysis using balanced samples confirmed that the spatial patterns identified for Tokyo are reliable despite the lower proportion of geotagged data.
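One common way to build balanced samples is to randomly downsample the larger set of geotagged posts to the size of the smaller one before comparing spatial patterns. The sketch below shows that idea only; the study does not describe its exact procedure here, so the function `balance_samples`, the fixed seed, and the toy identifiers are assumptions made for illustration.

```python
import random


def balance_samples(a, b, seed=0):
    """Randomly downsample both lists to the length of the shorter one.

    A fixed seed keeps the draw reproducible; sampling is without replacement.
    """
    rng = random.Random(seed)
    n = min(len(a), len(b))
    return rng.sample(a, n), rng.sample(b, n)


# Hypothetical post identifiers: 400 geotagged posts versus 70.
seoul_geotagged = [f"seoul_{i}" for i in range(400)]
tokyo_geotagged = [f"tokyo_{i}" for i in range(70)]

seoul_balanced, tokyo_balanced = balance_samples(seoul_geotagged, tokyo_geotagged)
```

Repeating the draw with several seeds and checking that the same hotspots reappear would give a rough sense of how stable the spatial patterns are.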
5.4. Predictive Factors for Engagement
The SHAP value analysis of our predictive models revealed distinct drivers of engagement for each city. In Seoul, hashtag count emerged as the strongest predictor of likes, suggesting that strategic use of hashtags for discoverability is particularly important. This aligns with research by
H. Kim and Stepchenkova (
2015), who found that Korean social media users employ more extensive hashtag strategies compared to users in other countries. In Tokyo, the presence of people in images was the dominant predictor of engagement, followed by animal presence. This suggests that human subjects and animal content are particularly effective for generating engagement with Tokyo content. The strong influence of weekend posting for Tokyo (but not Seoul) suggests different temporal patterns of audience engagement, potentially related to different work–leisure rhythms or follower demographics. These findings have direct implications for content strategy development. For Seoul, content strategies should emphasize effective hashtag usage and narrative framing, while Tokyo strategies might prioritize high-quality people-centric and animal content, with strategic timing focused on weekends.
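To make the attribution logic behind SHAP values more concrete, the sketch below computes exact Shapley values for a single post by enumerating feature coalitions, with absent features set to a baseline value. It is not the study's model or pipeline: the toy function `toy_likes`, its coefficients, and the feature values are hypothetical, and the SHAP library uses faster approximations rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one instance.

    predict:  function taking a dict of feature values and returning a number.
    x:        dict of the instance's feature values.
    baseline: dict of reference values used for features outside a coalition.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {f: (x[f] if f in coalition else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_likes(f):
    """Hypothetical likes model with a person-by-weekend interaction."""
    return (50 + 4 * f["hashtag_count"] + 30 * f["has_person"]
            + 20 * f["has_person"] * f["is_weekend"])


post = {"hashtag_count": 10, "has_person": 1, "is_weekend": 1}
baseline = {"hashtag_count": 5, "has_person": 0, "is_weekend": 0}
phi = shapley_values(toy_likes, post, baseline)
```

The attributions sum to the difference between the prediction for the post and the prediction for the baseline, and the interaction effect is split evenly between the two features involved. This additivity is what allows per-feature contributions to be ranked within each city's model.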
5.5. Theoretical Implications
Our study contributes to the theoretical understanding of how social media shapes destination image in several ways. We found that even when cities such as Seoul and Tokyo share similar content category distributions, they can still project distinctly different visual narratives through subtle variations in content characteristics, engagement patterns, and spatial distributions. This finding reinforces the view of destination image as a multi-dimensional construct that cannot be reduced to simple categorization, in line with the perspective advanced by
Marine-Roig and Ferrer-Rosell (
2018). The engagement differences we observed between Seoul and Tokyo also show that audience response to destination content is not determined solely by the category the content falls into; some deeper interplay is at work. This resonates with
Smith et al.’s (
2021) finding that engagement with destination content on social media emerges from an interaction between the content itself and how audiences already perceive the destination.
The engagement pattern differences can be interpreted through Media Richness Theory (
Daft & Lengel, 1986), which suggests that communication media vary in their capacity to convey information richness. Comments represent a richer communication medium than likes, requiring more cognitive effort and enabling more complex information exchange. Seoul’s higher comment engagement suggests that Seoul-related content generates richer, more complex communication needs—perhaps reflecting the destination’s positioning as culturally dynamic and requiring more explanation or discussion. Tokyo’s like-dominant pattern suggests that Tokyo content communicates effectively through visual richness alone, requiring less supplementary explanation or dialogue.
From a user engagement perspective (
Brodie et al., 2011), our findings reveal different engagement pathways for destination content. Seoul content appears to foster ‘cognitive engagement’ through information processing and active participation, while Tokyo content primarily generates ‘emotional engagement’ through aesthetic appreciation and immediate affective responses. This distinction has important implications for understanding how different destinations can improve their social media strategies to match their natural engagement tendencies and audience expectations.
By combining content analysis with geospatial mapping and comment analysis, the study also makes a methodological contribution to tourism social media research. Examining how different content types are distributed across physical space and elicit different forms of engagement (quick likes or longer comments, brief appreciation or emotional storytelling) gives a fuller picture of the relationship between physical places, their digital reflections, and how audiences respond to tourism destinations. This approach answers the calls from
Giglio et al. (
2019) and
Li et al. (
2021) for research that better integrates spatial awareness and engagement patterns in tourism social media analysis.
Our findings about contrasting engagement styles, with Seoul generating more emotionally charged comments and Tokyo attracting more aesthetic appreciation through likes, provide empirical evidence supporting
Kaplan and Haenlein’s (
2010) platform affordance theory in destination marketing. We see the same platform features (likes and comments) being embraced differently across cultural contexts and destination types, suggesting that destination marketing organizations should tailor their social media approaches to match the unique engagement fingerprint of their location.
Our findings suggest that Seoul and Tokyo may offer fundamentally different ‘use values’ for tourists, as reflected in their distinct Instagram engagement patterns. Tokyo appears to function primarily as an ‘aesthetic spectacle’—a destination valued for its visual appeal, photogenic qualities, and immediate sensory impact. The dominance of like-based engagement and shorter, appreciative comments suggests that Tokyo content serves primarily as visual inspiration and aesthetic consumption, where audiences engage through appreciation rather than dialogue. In contrast, Seoul emerges as more of a ‘conversational experience’—a destination that invites dialogue, emotional engagement, and active participation. The higher comment rates, emotional language, and information-seeking behavior suggest that Seoul content functions as a catalyst for social interaction and knowledge exchange. This pattern indicates that Seoul may be perceived as a more accessible, participatory destination where audiences feel comfortable engaging in conversation and sharing personal experiences. These different use values have significant implications for destination image theory. They suggest that the cognitive–affective model may manifest differently across cultural contexts, with some destinations primarily triggering affective responses (aesthetic appreciation, visual pleasure) while others stimulate cognitive engagement (information processing, social learning). This finding extends Baloglu and McCleary’s framework by suggesting that destinations may have inherent tendencies toward either cognitive or affective engagement, influenced by their cultural positioning and visual representation strategies.
5.6. Limitations and Future Research
The data collection window of March 2018 to November 2019 captures a specific temporal context in the evolution of social media tourism, reflecting particular platform dynamics and user behaviors characteristic of this period. While this temporal specificity provides valuable insights into destination representation during a distinct phase of Instagram’s development, it may limit the generalizability of findings to other periods with different technological or cultural contexts. Instagram content and engagement likely shift with seasonal events and tourism patterns throughout the year. A longer-term study tracking these patterns across seasons would provide richer insights into how destination representations evolve over time. We also acknowledge that our content categorization system, while refined to address the previously ambiguous ‘Other’ category, still paints with relatively broad strokes. The nuances within categories—such as different types of food experiences or varied architectural styles—deserve deeper exploration. Future work might employ more granular classification schemes or use topic modeling to uncover specific themes within these broader categories.
The geotagging disparity between Seoul (40.00%) and Tokyo (7.00%) users may reflect deeper cultural differences in privacy preferences and social media usage patterns, in line with research on East Asian cultural variations in self-disclosure behavior (
H. Kim & Papacharissi, 2003). This difference has important implications for destination marketing: Seoul’s higher geotagging rates suggest greater potential for location-based campaigns, while Tokyo’s lower rates indicate that alternative discovery mechanisms (hashtags, visual recognition, influencer partnerships) may be more effective.
Our Instagram-focused approach, while providing analytical depth, lacks cross-platform comparison across TikTok, Twitter, or emerging social media platforms that may reveal different destination representation patterns. Additionally, our analysis did not examine user characteristics (follower counts, account types, posting behaviors), which could provide valuable insights into who engages with destination content. Future research might also consider alternative spatial analysis approaches less dependent on precise coordinates, such as text-based location references or image-based landmark recognition.
6. Conclusions
This study has provided a comparative analysis of Instagram content and engagement patterns between Seoul and Tokyo, revealing both similarities and differences that have significant implications for tourism marketing. The findings demonstrate that while both cities show similar content category distributions, they exhibit distinct patterns in engagement metrics, comment characteristics, spatial distribution, and predictive factors for engagement. Future research could adopt a longitudinal approach to examine how these patterns evolve over time, employ more granular content categorization, incorporate user characteristics in engagement analysis, and explore alternative approaches to spatial analysis that are less dependent on precise geolocation data. This comparative analysis demonstrates the value of integrating content analysis, engagement metrics, comment characteristics, and geospatial mapping to develop a solid understanding of how destinations are represented and engaged with on social media. By leveraging these insights, tourism marketers in Seoul and Tokyo can develop more effective strategies that build upon their distinctive strengths and address their specific challenges in the competitive landscape of urban tourism destinations.