This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Territorial Analysis Based on Data from the Distribution of Taxpayers in Ecuador: A Data Science Approach Using Open Data from the Tax Registry

by Orlando Mauricio Chuquin-Machangara *, Alex Joel Ajila-Masache, Gabriela Abigail Villalta-Jimbo, Mario Perez and Renato M. Toasa *
Maestría en Big Data y Ciencia de Datos, Universidad Tecnológica Israel, Quito 170516, Ecuador
* Authors to whom correspondence should be addressed.
Big Data Cogn. Comput. 2026, 10(6), 173; https://doi.org/10.3390/bdcc10060173
Submission received: 5 May 2026 / Revised: 16 May 2026 / Accepted: 21 May 2026 / Published: 29 May 2026

Abstract

Open fiscal data in Ecuador remains largely unexplored beyond basic descriptive reporting, despite its potential for territorial intelligence and fiscal planning. This study examines how taxpayers are distributed across Ecuador’s provinces and economic sectors by applying a Big Data pipeline built on Apache Spark 3.5, PostgreSQL 14/PostGIS 3.2, and Python 3.11 spatial libraries to the SRI Tax Registry, which comprises approximately 2.5 million records. The analysis combined K-Means and DBSCAN clustering with spatial autocorrelation methods, including Moran’s Index and LISA, to identify concentration patterns and territorial dependencies. The findings show that 68% of taxpayers are located in three provinces, namely Pichincha (34%), Guayas (24%), and Azuay (10%), with a spatial Gini coefficient of 0.61 reflecting considerable fiscal inequality across the country. A Global Moran’s Index of 0.49 (p < 0.001) confirms that neighboring provinces tend to share similar taxpayer densities, while LISA revealed five High–High clusters in major urban centers and six Low–Low clusters in the Amazon region and along the northern border. DBSCAN identified 27 spatial groupings, including secondary economic nuclei in cities such as Ambato, Riobamba, and Machala that autocorrelation models alone do not capture. The methodology is replicable, and the results give tax authorities and regional planners an empirically grounded, scalable framework for identifying territories with fiscal formalization gaps and for designing geographically targeted, place-based interventions to reduce territorial inequality in Ecuador and in comparable developing-country contexts.
Keywords: taxpayer distribution; spatial autocorrelation; K-Means clustering; DBSCAN; Moran’s Index; LISA; Big Data; open fiscal data; territorial analysis; Ecuador
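
For readers unfamiliar with the two summary statistics reported in the abstract, the following is a minimal sketch of how a Gini coefficient and a Global Moran's Index can be computed. It is not the authors' pipeline: the function names `gini` and `morans_i`, the four-region toy counts, and the binary chain-adjacency weights are all assumptions made for illustration, and the article's "spatial Gini" may be defined differently from the textbook formula used here.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Textbook Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum over the values in ascending order (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for values on regions linked by a spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    s0 = sum(sum(row) for row in weights)   # sum of all spatial weights
    denom = sum(d * d for d in z)
    if s0 == 0 or denom == 0:
        raise ValueError("weights and values must both be non-degenerate")
    # Cross-products of deviations, counted only where regions are neighbors.
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * num / denom


# HYPOTHETICAL toy data: taxpayer counts for four regions arranged in a row,
# where each region neighbors only the regions directly beside it.
toy_counts = [10.0, 8.0, 2.0, 1.0]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

if __name__ == "__main__":
    print("Gini:", round(gini(toy_counts), 3))
    print("Moran's I:", round(morans_i(toy_counts, toy_weights), 3))
```

In this toy layout the high counts sit next to each other and so do the low counts, so Moran's I comes out positive, which is the same qualitative pattern the abstract reports at the provincial level. Significance testing (the reported p-value) and the local LISA statistics would need a permutation procedure that this sketch omits.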
