# Combining Design Patterns and Topic Modeling to Discover Regions That Support Particular Functionality

## Abstract

## 1. Introduction

- A novel framework for discovering functional regions that combines results based on patterns and LDA topic modeling in three different ways: mutual evaluation to identify cases of significant agreement or disagreement; using pattern-based knowledge to adjust topic probabilities; and using topic probabilities to adjust pattern-based results.
- A discussion, in the context of GIS, of the benefits of combining the interpretability offered by knowledge-based techniques with the transferability and scalability of data-driven methodologies.

## 2. Related Work and Critical Analysis

#### 2.1. Knowledge-Based Approaches

#### 2.1.1. Discovering Functional Regions using Composition Patterns

#### 2.1.2. Critical Analysis of the Pattern-Based Approach

#### 2.2. Data-Driven Approaches

^{2} regions in New York and London to find clusters containing similar distributions of place types. Zhou and Zhang [26] similarly combined Twitter and Foursquare data to extract spatial distributions of common human activities (e.g., food and restaurants, shops and services, outdoor and recreation) and to determine major hotspots. Finally, Zhi et al. [27] used a vast dataset of 15 million social media check-ins over a year to detect functional regions. Spatiotemporal structures which potentially represent associations between functional regions and human activities were extracted; these associations were then used to discover functional regions in the city of Shanghai.

#### 2.2.1. Functional Region Extraction from POI and Human Activity Data

#### 2.2.2. Critical Analysis of the Topic Modeling Approach

## 3. Methodology

#### 3.1. Mutual Evaluation

#### 3.2. Data to Knowledge Fusion

#### 3.3. Knowledge to Data Fusion

## 4. Demonstration and Results

#### 4.1. Study Area and Data

#### 4.2. Results Using Individual Approaches

^{2}). Figure 2 presents the results of each individual approach on the same map. Darker hues indicate a higher probability of the region being a “shopping plaza”, with red and gray colors denoting results of the pattern-based and topic modeling approaches, respectively. Figure 3 presents the results of a primitive integration process that does not follow any of the methodologies proposed in Section 3: it keeps only those results from both approaches that overlap and score higher than 50%. A pie chart is also provided, showing how each category of sub-functions within the pattern contributes to the confidence value.
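
The primitive integration described above amounts to a simple filter over two result sets. The following Python sketch is illustrative only and is not the authors' implementation: the `Region` class, the bounding-box overlap test, and the sample scores are hypothetical, and a real GIS workflow would test overlap on actual geometries.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    """Hypothetical result record: a name, a bounding box and a score in [0, 1]."""
    name: str
    bbox: tuple  # (min_x, min_y, max_x, max_y); an assumed simplification
    score: float

def overlaps(a, b):
    """Return True if the two bounding boxes share some interior area."""
    ax0, ay0, ax1, ay1 = a.bbox
    bx0, by0, bx1, by1 = b.bbox
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def primitive_integration(pattern_regions, topic_regions, threshold=0.5):
    """Keep pairs of regions that overlap and both score above the threshold."""
    kept = []
    for p in pattern_regions:
        if p.score <= threshold:
            continue
        for t in topic_regions:
            if t.score > threshold and overlaps(p, t):
                kept.append((p.name, t.name))
    return kept

# Made-up sample data, for illustration only.
pattern_results = [
    Region("p1", (0, 0, 2, 2), 0.8),
    Region("p2", (5, 5, 6, 6), 0.9),
    Region("p3", (0, 0, 1, 1), 0.4),
]
topic_results = [
    Region("t1", (1, 1, 3, 3), 0.7),
    Region("t2", (5, 5, 6, 6), 0.3),
]
integrated = primitive_integration(pattern_results, topic_results)
```

In this toy data only the pair `p1`/`t1` survives: `p3` scores too low on the pattern side, and `p2` overlaps a topic region that scores below 50%.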

#### 4.3. Results of Mutual Evaluation

#### 4.4. Results of Data to Knowledge and Knowledge to Data Fusion

#### 4.5. Overall Results

## 5. Discussion

- are highly functional, also explaining which particular functions mostly contribute to this, as derived from the knowledge-based aspect;
- are popular, based on the inclusion of social media information exploited by the data-driven aspect;
- are homogeneous both in terms of the POIs included and the way they are spatially organized.

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A

**Table A1.** Top-15 ranked point of interest (POI) types for the “shopping plaza” topic in [12].

Category | Probability | Category | Probability |
---|---|---|---|
shopping mall | 0.207709 | bistro | 0.000105 |
accessories store | 0.056738 | dumpling restaurant | 0.000096 |
chocolate shop | 0.013896 | korean restaurant | 0.000090 |
shoe store | 0.000288 | german restaurant | 0.000080 |
breakfast spot | 0.000282 | herbs & spices store | 0.000079 |
gaming cafe | 0.000196 | airport terminal | 0.000078 |
optical shop | 0.000180 | outlet store | 0.000076 |
post office | 0.000114 | | |

Variable | Component | Filter |
---|---|---|
$C_{S}$ | Shop | $Type\_Filter(\text{“Shop”})$ |
$C_{A}$ | Amenity | $Type\_Filter(\text{“Amenity”})$ |
$C_{F}$ | Facilities | $C_{S}\cup C_{A}$ |
$C_{WP}$ | Walkable plaza | $Type\_Filter(\text{“Surface”})\cap Prop\_Filter(\text{“walkable”},\text{“true”})$ |
$C_{H}$ | Motorway | $Type\_Filter(\text{“Road”})\cap Prop\_Filter(\text{“pedestrians”},\text{“false”})$ |
$C_{Sr}$ | Service Road | $C_{H}\cap Prop\_Filter(\text{“pedestrians”},\text{“true”})$ |
$C_{W}$ | Walkable | $C_{WP}\cup C_{Sr}$ |
$C_{P}$ | Parking place | $C_{A}\cap Prop\_Filter(\text{“service”},\text{“parking”})$ |
$C_{B}$ | Transportation node | $C_{A}\cap Prop\_Filter(\text{“service”},\text{“transportation”})$ |
$C_{An}$ | Anchor Store | $C_{S}\cap Prop\_Filter(\text{“goods”},\text{“various”})$ |
$C_{M}$ | Mall | $C_{S}\cap Prop\_Filter(\text{“goods”},\text{“various”})\cap Prop\_Filter(\text{“service”},\text{“various”})$ |
$C_{At}$ | Attractors | $C_{M}\cap C_{An}$ |
$C_{Sb}$ | Basic Shop | $C_{S}\cap Prop\_Filter(\text{“goods”},\text{“basic”})$ |
$C_{Se}$ | Special Shop | $C_{S}\cap Prop\_Filter(\text{“goods”},\text{“special”})$ |
$C_{Su}$ | Uncommon Shop | $C_{F}\cap (Prop\_Filter(\text{“goods”},\text{“uncommon”})\cup Prop\_Filter(\text{“services”},\text{“uncommon”}))$ |
$C_{As}$ | Food court | $C_{A}\cap Prop\_Filter(\text{“service”},\text{“sustenance”})$ |
$C_{Ae}$ | Entertainment | $C_{A}\cap Prop\_Filter(\text{“service”},\text{“entertainment”})$ |
$C_{Al}$ | Luxury services | $C_{A}\cap Prop\_Filter(\text{“service”},\text{“health\&beauty”})$ |
$C_{Av}$ | Aesthetics | $C_{A}\cap Prop\_Filter(\text{“service”},\text{“visuallypleasing”})$ |
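
The component definitions above are compositions of two filters joined by set union and intersection. The following Python sketch shows one way such filters could behave; it is an assumption-laden illustration, not the paper's implementation. The feature records (dictionaries with `id`, `type` and `props` keys), the function names `type_filter`, `prop_filter` and `build_components`, and the sample data are all hypothetical.

```python
def type_filter(features, type_name):
    """Return the ids of features whose 'type' equals type_name (assumed schema)."""
    return {f["id"] for f in features if f.get("type") == type_name}

def prop_filter(features, key, value):
    """Return the ids of features whose property `key` equals `value`."""
    return {f["id"] for f in features if f.get("props", {}).get(key) == value}

def build_components(features):
    """Compose a few of the tabulated components with set operations."""
    c_s = type_filter(features, "Shop")
    c_a = type_filter(features, "Amenity")
    return {
        "C_S": c_s,
        "C_A": c_a,
        "C_F": c_s | c_a,  # Facilities: union of shops and amenities
        "C_P": c_a & prop_filter(features, "service", "parking"),
        "C_Sb": c_s & prop_filter(features, "goods", "basic"),
        "C_As": c_a & prop_filter(features, "service", "sustenance"),
    }

# Made-up features, for illustration only.
features = [
    {"id": 1, "type": "Shop", "props": {"goods": "basic"}},
    {"id": 2, "type": "Shop", "props": {"goods": "special"}},
    {"id": 3, "type": "Amenity", "props": {"service": "parking"}},
    {"id": 4, "type": "Amenity", "props": {"service": "sustenance"}},
    {"id": 5, "type": "Road", "props": {"pedestrians": "false"}},
]
components = build_components(features)
```

Representing each component as a set of feature ids keeps the mapping to the table direct: $\cup$ becomes `|` and $\cap$ becomes `&`.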

Functional Implications | |
---|---|
**Functions $(\mathcal{F})$** | **Logical Formula** |
$F_{W}(C_{Sb}, C_{At}, C_{W}, C_{Sr})$ (Walkability) | $Occurrence(C_{W},\mathbb{N})\wedge ((Occurrence(C_{Sb},[5,\infty))\wedge Proximity(C_{Sb},C_{Sb},(0,500m])\wedge S\_Relation(C_{W},C_{Sb},[\mathit{intersects}]))\vee (Occurrence(C_{At},[1,\infty))\wedge S\_Relation(C_{W},C_{At},[\mathit{intersects}])))$ |
$F_{SE}(C_{At}, C_{Sb}, C_{W})$ (Shopping Experience) | $F_{W}\wedge (Occurrence(C_{Sb},[5,\infty))\wedge S\_Relation(C_{W},C_{Sb},[\mathit{intersects}]))\vee (Occurrence(C_{At},[1,\infty))\wedge S\_Relation(C_{W},C_{At},[\mathit{contains}]))$ |
$F_{SV}(C_{Sb})$ (Shopping Variety) | $F_{SE}\wedge Occurrence(C_{Sb},[5,\infty))$ |
$F_{AT}(C_{Sb})$ (Sh. Attractiveness) | $F_{SE}\wedge Occurrence(C_{At},[1,\infty))$ |
$F_{SD}(C_{Sb},C_{Se})$ (Sh. Orientation) | $F_{SE}\wedge Correlation(C_{Sb},C_{Se},[2,\infty))$ |
$F_{SG}(C_{Se})$ (Special Goods) | $F_{SE}\wedge Occurrence(C_{Se},\mathbb{N})$ |
$F_{CC}(C_{Sb}, C_{At}, C_{Su}, C_{W})$ (Compatible Components) | $F_{SE}\wedge Occurrence(C_{Su},\mathbb{N})\wedge (Correlation(C_{Sb}\cup C_{At},C_{Su},[5,\infty))\vee Proximity(C_{W},C_{Su},[500m,\infty)))$ |
$F_{SO}(C_{S},C_{A})$ (Shopping Opportunities) | $F_{SE}\wedge Occurrence(C_{A},\mathbb{N})\wedge Correlation(C_{S},C_{A},[2,\infty))$ |
$F_{L}(C_{As})$ (Leisure) | $F_{SO}\wedge Occurrence(C_{As},\mathbb{N})$ |
$F_{E}(C_{Ae})$ (Entertainment) | $F_{SO}\wedge Occurrence(C_{Ae},\mathbb{N})$ |
$F_{LS}(C_{Al})$ (Luxury Services) | $F_{SO}\wedge Occurrence(C_{Al},\mathbb{N})$ |
$F_{Resupply}(C_{W},C_{H})$ | $F_{SE}\wedge Occurrence(C_{H},\mathbb{N})\wedge Proximity(C_{W},C_{H},[0,1000m])$ |
$F_{AD}(C_{W},C_{P})$ (Access to Drivers) | $F_{W}\wedge Occurrence(C_{P},[1,\infty))\wedge (S\_Relation(C_{W},C_{P},[\mathit{intersects}]))\vee Proximity(C_{W},C_{P},[0,200m])$ |
$F_{AN}(C_{W},C_{B})$ (Access to Non-drivers) | $F_{W}\wedge Occurrence(C_{B},[1,\infty))\wedge (S\_Relation(C_{W},C_{B},[\mathit{intersects}]))\vee Proximity(C_{W},C_{B},[0,200m])$ |
$F_{WS}(C_{H},C_{W})$ (Walking Safety) | $F_{W}\wedge Occurrence(C_{H},\mathbb{N})\wedge S\_Relation(C_{W},C_{H},[\mathit{disjoint}])$ |
$F_{WO}(C_{S},C_{A})$ (Well-Organized) | $F_{SE}\wedge Occurrence(C_{A},\mathbb{N})\wedge S\_Configuration(C_{S},C_{A},[\mathit{clustered}])$ |
$F_{VP}(C_{Av},C_{W})$ (Visually Pleasing) | $F_{W}\wedge Occurrence(C_{Av},\mathbb{N})\wedge (S\_Relation(C_{W},C_{H},[\mathit{intersects}])\vee Proximity(C_{W},C_{Av},[0,200m]))$ |
**Scoring Function** | |
$F_{SE}\ast F_{W}\ast (F_{SD}+F_{SO}+F_{SA}+F_{SG}+F_{L}+F_{E}+F_{LS}+F_{AD}+F_{AN}+F_{R}+F_{WS}+F_{VP}+F_{WO})\ast error$ | |
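
The scoring function multiplies two terms, $F_{SE}$ and $F_{W}$, by a sum of the remaining functions, so a region that fails either of the two scores zero regardless of the rest. The Python sketch below illustrates that structure under stated assumptions: each function is taken to evaluate to a number in [0, 1], `error` is treated as a caller-supplied multiplicative factor (its definition is not given in this section), and the dictionary keys simply mirror the subscripts as written in the scoring row. The function name `pattern_score` and the sample values are hypothetical.

```python
# Subscripts of the additive terms, copied as written in the scoring row.
ADDITIVE_TERMS = (
    "SD", "SO", "SA", "SG", "L", "E", "LS",
    "AD", "AN", "R", "WS", "VP", "WO",
)

def pattern_score(f, error=1.0):
    """Compute F_SE * F_W * (sum of additive terms) * error.

    `f` maps subscripts (e.g. "SE", "W", "SD") to numeric values;
    additive terms missing from `f` are assumed to contribute 0.
    """
    additive = sum(f.get(term, 0.0) for term in ADDITIVE_TERMS)
    return f["SE"] * f["W"] * additive * error

# Made-up values, for illustration only.
walkable = {"SE": 1.0, "W": 1.0, "SD": 1.0, "L": 1.0}
not_walkable = {"SE": 1.0, "W": 0.0, "SD": 1.0, "L": 1.0}

score_walkable = pattern_score(walkable)
score_not_walkable = pattern_score(not_walkable)
score_damped = pattern_score(walkable, error=0.5)
```

The second example shows the gating effect: with $F_{W}=0$ the additive terms no longer matter.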

## References

1. Hartshorne, R. Perspective on the Nature of Geography; Rand McNally: Chicago, IL, USA, 1959.
2. Tuan, Y.F. Space and Place: Humanistic Perspective. In Philosophy in Geography; Springer: Dordrecht, The Netherlands, 1979; pp. 387–427.
3. Goodchild, M.F. Geographical information science. Int. J. Geogr. Inf. Syst. **1992**, 6, 31–45.
4. Brown, L.A.; Holmes, J. The delimitation of functional regions, nodal regions, and hierarchies by functional distance approaches. J. Reg. Sci. **1971**, 11, 57–72.
5. OECD. Redefining “Urban”: A New Way to Measure Metropolitan Areas; OECD Publishing: Paris, France, 2012.
6. Hill, L.L. Core elements of digital gazetteers: Placenames, categories, and footprints. In Proceedings of the International Conference on Theory and Practice of Digital Libraries, Lisbon, Portugal, 18–20 September 2000; Springer: Berlin, Germany, 2000; pp. 280–290.
7. Purves, R.S.; Clough, P.; Jones, C.B.; Arampatzis, A.; Bucher, B.; Finch, D.; Fu, G.; Joho, H.; Syed, A.K.; Vaid, S.; Yang, B. The design and implementation of SPIRIT: A spatially aware search engine for information retrieval on the Internet. Int. J. Geogr. Inf. Sci. **2007**, 21, 717–745.
8. Papadakis, E.; Blaschke, T. Place-based GIS: Functional Space. In Proceedings of the 4th AGILE PhD School, Leeds, UK, 30 October–2 November 2017; Comber, L., Malleson, N., Eds.; CEUR: Aachen, Germany, 2017; Volume 2208.
9. Boegl, K.; Adlassnig, K.P.; Hayashi, Y.; Rothenfluh, T.E.; Leitich, H. Knowledge acquisition in the fuzzy knowledge representation framework of a medical consultation system. Artif. Intell. Med. **2004**, 30, 1–26.
10. Yuan, J.; Zheng, Y.; Xie, X. Discovering regions of different functions in a city using human mobility and POIs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; ACM: New York, NY, USA, 2012; pp. 186–194.
11. Adams, B.; Janowicz, K. Thematic signatures for cleansing and enriching place-related linked data. Int. J. Geogr. Inf. Sci. **2015**, 29, 556–579.
12. Gao, S.; Janowicz, K.; Couclelis, H. Extracting urban functional regions from points of interest and human activities on location-based social networks. Trans. GIS **2017**, 21, 446–467.
13. Hobel, H.; Fogliaroni, P.; Frank, A.U. Deriving the Geographic Footprint of Cognitive Regions. In Selected papers of the 19th AGILE Conference on Geographic Information Science, Helsinki, Finland, 14–17 June 2016; Sarjakoski, T., Santos, M.Y., Sarjakoski, L.T., Eds.; Lecture Notes in Geoinformation and Cartography; Springer International Publishing: Cham, Switzerland, 2016; pp. 67–84.
14. Doshi-Velez, F.; Kim, B. Towards A Rigorous Science of Interpretable Machine Learning. arXiv **2017**, arXiv:1702.08608.
15. Papadakis, E.; Resch, B.; Blaschke, T. Composition of Place: Towards a Compositional View of Functional Space. Cartogr. Geogr. Inf. Sci. **2019**.
16. Janowicz, K.; Keßler, C. The role of ontology in improving gazetteer interaction. Int. J. Geogr. Inf. Sci. **2008**, 22, 1129–1157.
17. Scheider, S.; Purves, R. Semantic Place Localization from Narratives. In Proceedings of the First ACM SIGSPATIAL International Workshop on Computational Models of Place, Orlando, FL, USA, 5–8 November 2013; Scheider, S., Adams, B., Janowicz, K., Vasardani, M., Winter, S., Eds.; ACM: New York, NY, USA, 2013; pp. 16–16.
18. MacEachren, A.M. Leveraging Big (Geo) Data with (Geo) Visual Analytics: Place as the Next Frontier. In Spatial Data Handling in Big Data Era: Select Papers from the 17th IGU Spatial Data Handling Symposium 2016; Zhou, C., Su, F., Harvey, F., Xu, J., Eds.; Springer Singapore: Singapore, 2017; pp. 139–155.
19. Scheider, S.; Janowicz, K. Place reference systems. Appl. Ontol. **2014**, 9, 97–127.
20. Papadakis, E.; Resch, B.; Blaschke, T. A Function-based model of Place. In International Conference on GIScience Short Paper Proceedings; California Digital Library: Oakland, CA, USA, 2016.
21. Hobel, H.; Abdalla, A.; Fogliaroni, P.; Frank, A.U. A Semantic Region Growing Algorithm: Extraction of Urban Settings. In Proceedings of the 18th AGILE Conference on Geographic Information Science, Lisbon, Portugal, 9–12 June 2015; Bação, F., Santos, M.Y., Painho, M., Eds.; Lecture Notes in Geoinformation and Cartography; Springer International Publishing: Cham, Switzerland, 2015; pp. 19–33.
22. Liu, X.; Andris, C.; Rahimi, S. Place niche and its regional variability: Measuring spatial context patterns for points of interest with representation learning. Comput. Environ. Urban Syst. **2019**, 75, 146–160.
23. Tao, H.; Wang, K.; Zhuo, L.; Li, X. Re-examining urban region and inferring regional function based on spatial-temporal interaction. Int. J. Digit. Earth **2019**, 12, 293–310.
24. Su, S.; Lei, C.; Li, A.; Pi, J.; Cai, Z. Coverage inequality and quality of volunteered geographic features in Chinese cities: Analyzing the associated local characteristics using geographically weighted regression. Appl. Geogr. **2017**, 78, 78–93.
25. Noulas, A.; Scellato, S.; Mascolo, C.; Pontil, M. Exploiting Semantic Annotations for Clustering Geographic Areas and Users in Location-based Social Networks. In Proceedings of the 2011 Workshop on the Social Mobile Web, Barcelona, Spain, 21 July 2011; AAAI: Menlo Park, CA, USA, 2011; Volume WS-11-02.
26. Zhou, X.; Zhang, L. Crowdsourcing functions of the living city from Twitter and Foursquare data. Cartogr. Geogr. Inf. Sci. **2016**, 43, 393–404.
27. Zhi, Y.; Li, H.; Wang, D.; Deng, M.; Wang, S.; Gao, J.; Duan, Z.; Liu, Y. Latent spatio-temporal activity structures: A new approach to inferring intra-urban functional regions via social media check-in data. Geo-Spat. Inf. Sci. **2016**, 19, 94–105.
28. MacQueen, J. Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability; Le Cam, L.M., Neyman, J., Eds.; University of California Press: Berkeley, CA, USA, 1967; Volume 1, pp. 281–297.
29. Assunção, R.M.; Neves, M.C.; Câmara, G.; Freitas, C.D.C. Efficient regionalization techniques for socio-economic geographical units using minimum spanning trees. Int. J. Geogr. Inf. Sci. **2006**, 20, 797–811.
30. Liu, Y.; Liu, X.; Gao, S.; Gong, L.; Kang, C.; Zhi, Y.; Chi, G.; Shi, L. Social Sensing: A New Approach to Understanding Our Socioeconomic Environments. Ann. Assoc. Am. Geogr. **2015**, 105, 512–530.
31. Cohn, A.; Gotts, N. The ‘Egg-Yolk’ Representation of Regions with Indeterminate Boundaries. In Geographic Objects with Indeterminate Boundaries; Burrough, P.A., Frank, A.U., Eds.; Taylor & Francis: London, UK, 1996; pp. 171–187.
32. Mai, G.; Janowicz, K.; Hu, Y.; Gao, S.; Zhu, R.; Yan, B.; McKenzie, G.; Uppal, A.; Regalia, B. Collections of Points of Interest: How to Name Them and Why it Matters. In Proceedings of the Spatial Big Data and Machine Learning in GIScience Workshop at GIScience 2018, Melbourne, Australia, 28 August 2018; Raubal, M., Wang, S., Guo, M., Jonietz, D., Kiefer, P., Eds.; ETH: Zurich, Switzerland, 2018; pp. 29–33.
33. Liu, Y.; Yuan, Y.; Gao, S. Modeling the Vagueness of Areal Geographic Objects: A Categorization System. ISPRS Int. J. Geo-Inf. **2019**, 8, 306.
34. Papadakis, E.; Baryannis, G.; Petutschnig, A.; Blaschke, T. Function-Based Search of Place Using Theoretical, Empirical and Probabilistic Patterns. ISPRS Int. J. Geo-Inf. **2019**, 8, 92.

**Figure 1.** Overview of the proposed framework fusing knowledge-based and data-driven approaches. LDA: latent Dirichlet allocation.

**Figure 8.** Results combining data-to-knowledge and knowledge-to-data fusion in the Denver metropolitan area.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Papadakis, E.; Gao, S.; Baryannis, G.
Combining Design Patterns and Topic Modeling to Discover Regions That Support Particular Functionality. *ISPRS Int. J. Geo-Inf.* **2019**, *8*, 385.
https://doi.org/10.3390/ijgi8090385
