The Development of the Municipal Registry of People with Diabetes in Porto Alegre, Brazil

Background/Objective: Diabetes registries that enhance surveillance and improve medical care are uncommon in low- and middle-income countries, where most of the diabetes burden lies. We aimed to describe the methodological and technical aspects adopted in the development of a municipal registry of people with diabetes using local and national Brazilian National Health System databases. Methods: We obtained data for July 2018 to June 2021 from eight databases covering primary care, specialty and emergency consultations, medication dispensing, outpatient exam management, hospitalizations, and deaths. We identified diabetes using the International Classification of Diseases (ICD), International Classification of Primary Care (ICPC), medications for diabetes, hospital codes for the treatment of diabetes complications, and exams for diabetes management. Results: After data processing and database merging using deterministic and probabilistic linkage, we identified 73,185 people with diabetes. Considering that 1.33 million people live in Porto Alegre, the registry captured 5.5% of the population. Conclusions: With additional data processing, the registry can reveal information on the treatment and outcomes of people with diabetes who are receiving publicly financed care in Porto Alegre. It will provide metrics for epidemiologic surveillance, such as the incidence, prevalence, rates, and trends of complications and causes of mortality. It will also enable healthcare providers to monitor the quality of care, identify inadequacies, and provide feedback as needed.


Introduction
Diabetes mellitus is a significant and escalating public health challenge, particularly in low- and middle-income countries (LMICs). Its burden is rapidly rising in most countries [1]. Projections indicate a global increase from 537 million in 2021 to 783 million people living with diabetes in 2045, with 94% of this increase occurring in LMICs [2]. In Brazil, the prevalence of diabetes increased by 24% from 2013 to 2019 [3], and diabetes is expected to rise from the sixth to the third leading cause of death by 2040 [4].
Enhancing information and communication technology is a pivotal strategy to face the challenges posed by the diabetes burden [1]. To that end, real-world data [4,5] obtained from health information systems have been harnessed to elucidate health problems and determinants of population health [6,7]. By integrating and pairing multiple health data sources, population registries are often essential in the secondary use of real-world data [8]. They yield new information for a given problem in a specific population by leveraging pre-existing information, in accordance with the Global Digital Health Strategy from the World Health Organization, encouraging the use of secondary health data to enhance the quality of healthcare and research effectiveness [9,10].
Population registries related to diabetes have been developed in high-income countries but are uncommon in LMICs [11,12]. To the best of our knowledge, functional population-based diabetes registries constructed with routinely collected health system data are absent in Latin America. In Brazil, information systems pertaining to primary care are now available to complement those already used for hospitalization and deaths [13,14]. Brazil's vast geographical expanse and regional specificities have led to municipal initiatives to solve local problems [15,16]. Since 2000, public health authorities in Porto Alegre, in southern Brazil, have implemented various information systems complementary to existing national ones to support clinical, regulatory, and health surveillance actions.
Considering the scarcity of population-based records on diabetes care and complications in LMICs and aiming to further improve municipal healthcare actions, we developed a municipal registry of people with diabetes in Porto Alegre, Brazil. Here, we document the construction of this registry and discuss its potential uses and the challenges ahead for its integration into epidemiological surveillance and clinical care.

Materials and Methods
Porto Alegre is the capital of Rio Grande do Sul, the southernmost state in Brazil, having approximately 1.33 million inhabitants in 2022 [17]. Public health provision in Porto Alegre is conducted by the Municipal Health Department (Secretaria Municipal de Saúde [SMS], Porto Alegre, Brazil), which administers local aspects of the Brazilian National Health System (Sistema Único de Saúde [SUS]) care through about 190 outpatient services, as well as 14 in-house and outsourced hospital services, in total comprising more than 3900 health professionals working in healthcare, patient flow management, and epidemiologic surveillance [18].
The information systems and their respective databases complement each other by covering diverse points of contact between individuals and healthcare services. These range from routine primary care appointments and other health procedures to the administration of medications in outpatient settings, specialized consultations, and diagnostic tests conducted under specific circumstances. They also encompass hospitalizations, less frequent emergency interventions, and, ultimately, terminal outcomes such as mortality, which is particularly relevant when the other databases failed to identify diabetes-related events for an individual during the period covered by the registry. A brief description of each data system follows. Only the primary author (Dal Moro, R), as an employee of the Municipal Health Department, had direct access to the data and constructed the necessary linkages.

Outpatient Medications (DIS)
We acquired data on outpatient medications from the Medication Dispensing System (DIS), developed in Porto Alegre and employed in public pharmacies and municipal-level public health system dispensaries since 2017. The DIS was designed to streamline the processes involved in the transport, storage, dispensation, and disposal of medications listed on the Municipal List of Medicines (Lista Municipal de Medicamentos, REMUME) [19]. The REMUME encompasses the essential medications and resources for diabetes care as defined by Brazilian federal law [20] and the Municipal Program for Distribution for Diabetes Medications and Supplies (Programa Municipal de Distribuição de Insumos para Diabetes, PMDID), a complementary action of pharmaceutical care in Porto Alegre that guarantees access to insulin and supplies for home-based capillary blood glucose monitoring [21].

Primary Care (e-SUS APS)
We gathered primary care encounter data from the Electronic Primary Care Health Record System, developed by the Brazilian Ministry of Health (e-SUS APS) and implemented in Porto Alegre in 2014. This system documents both spontaneous and scheduled encounters provided by primary healthcare facilities. By 2021, the city's primary care network had 142 health services.

Specialized Care and Outpatient Exams (GERCON Consultation and Examination Management Systems)
Data on specialized care were obtained from the Consultation and Examination Management System, an information system conceived and developed in Porto Alegre and utilized since 2016. This system facilitates requesting, regulating, scheduling, and confirming consultations in the specialized healthcare network. In January 2020, the system was expanded to include an outpatient exam module, incorporating a second database into the registry development process.

Hospitalizations (GERINT and SIH)
For hospitalization information, we obtained data from two sources. The first was the Hospitalization Management System (GERINT), a state-level integrated system introduced in 2017 to enhance patient flow management and access to hospital beds financed by the SUS in Rio Grande do Sul, including Porto Alegre. The second was the Hospital Billing System (SIH), developed by the Brazilian Ministry of Health to document the billing and financial resource transfer related to hospital admissions.

Urgent Care (SIHO)
Urgent care data were gathered from a local patient health record system (SIHO) used in Porto Alegre since the mid-2000s. This system records urgent and emergency encounters paid for by the SUS.

Mortality (SIM)
We extracted data on deaths from the Brazilian Mortality Information System (SIM), which the Brazilian Ministry of Health has developed to oversee national mortality data since 1975.

Data Linkage, Database Generation, and Data Management
We chose variables from the eight databases mentioned above to facilitate (a) linkage across healthcare datasets to enable the integration of exposures and outcomes, (b) elaboration of a demographic profile of participants, and (c) identification of diabetes and its complications. Subsequently, we extracted the target data from data servers of the SMS and PROCEMPA, the entity that develops and maintains Porto Alegre's municipal information systems. The databases were stored in a computing environment dedicated to data management. We also implemented data protection procedures for privacy and individual safety, including access control to the data through different privilege levels and a two-factor authentication process. Data usage has been performed strictly for scientific research purposes. For this stage, we operated the data through an exclusive relational database using PostgreSQL 13.0 in a DBeaver 21.0 integrated development environment.
Any of the following criteria identified a possible case of diabetes (Table 1): (a) a code for diabetes according to the International Classification of Diseases, version 10 (ICD-10) or the International Classification of Primary Care, version 2 (ICPC-2); (b) a medication prescribed and dispensed to control diabetes; or (c) a procedure code indicating treatment for acute or chronic complications of diabetes, according to the Brazilian public health code for procedures and health materials (Tabela de Procedimentos, Medicamentos e Órteses-Próteses-Materiais Especiais do Sistema Único de Saúde). For death records, we identified diabetes among both the underlying and the contributing causes. We identified people with diabetes separately in each database, considering the occurrence of events between July 2018 and June 2021. The earliest occurrence was selected when a person had two or more occurrences in the same database. Thus, we obtained eight intermediate databases containing a unique record to identify each person who met the criteria for diabetes. We joined these eight intermediate databases to create the municipal registry of people with diabetes.
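As an illustration of this per-database step, the following Python sketch flags coded events within the study window and keeps the earliest one per person. It is a minimal sketch, not the registry's actual procedure: the record layout, the field and function names, and the code sets (ICD-10 categories E10 to E14 and ICPC-2 codes T89 and T90) are assumptions made for the example, and the codes actually applied are those listed in Table 1. Medication and procedure criteria would follow the same pattern with their own code lists.

```python
from datetime import date

# Assumed code sets, for illustration only; the registry's actual lists are in Table 1.
ICD10_DIABETES_PREFIXES = ("E10", "E11", "E12", "E13", "E14")
ICPC2_DIABETES_CODES = {"T89", "T90"}

# Study window stated in the text: July 2018 to June 2021.
STUDY_START = date(2018, 7, 1)
STUDY_END = date(2021, 6, 30)


def is_diabetes_event(record):
    """Return True if the record carries a diabetes code dated within the study window."""
    if not (STUDY_START <= record["event_date"] <= STUDY_END):
        return False
    code = record["code"].upper().replace(".", "")
    if record["code_system"] == "ICD-10":
        return code.startswith(ICD10_DIABETES_PREFIXES)
    if record["code_system"] == "ICPC-2":
        return code in ICPC2_DIABETES_CODES
    return False


def earliest_event_per_person(records):
    """Build an intermediate database: one record per person, the earliest qualifying event."""
    earliest = {}
    for record in records:
        if not is_diabetes_event(record):
            continue
        person = record["person_id"]
        if person not in earliest or record["event_date"] < earliest[person]["event_date"]:
            earliest[person] = record
    return earliest


# Hypothetical records: A has two qualifying events, B has no diabetes code,
# and C has a diabetes code dated before the study window.
sample_records = [
    {"person_id": "A", "code_system": "ICD-10", "code": "E11.9", "event_date": date(2019, 5, 2)},
    {"person_id": "A", "code_system": "ICPC-2", "code": "T90", "event_date": date(2018, 9, 14)},
    {"person_id": "B", "code_system": "ICD-10", "code": "I10", "event_date": date(2020, 1, 8)},
    {"person_id": "C", "code_system": "ICD-10", "code": "E10", "event_date": date(2017, 3, 1)},
]

intermediate = earliest_event_per_person(sample_records)
```

In this hypothetical sample, only person A enters the intermediate database, with the September 2018 event retained as the earliest occurrence.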
To compare databases and deduplicate them when needed, we used the unique identification number as a deterministic matching key, except for the hospitalization billing database (SIH) and mortality information system (SIM). For these two databases, we applied the Jaro-Winkler distance algorithm [22] (R version 4.1.2, 64-bit, for Windows; RStudio 2021.09.1 for Windows; fedmatch package 2.0.4 for R). For comparison, we used the variables name and date of birth, accepting a minimum similarity value of 0.95 (on a scale from 0 to 1, where 1 indicates total similarity). When the same person was listed in more than one intermediate database, we registered the presence of diabetes based on the database containing the earliest occurrence.
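The probabilistic step can be illustrated with a minimal Python sketch of Jaro-Winkler similarity. It is not the fedmatch implementation used for the registry. The function names, the prefix scaling factor of 0.1, the record layout, and the rule of requiring an identical date of birth alongside a name similarity of at least 0.95 are assumptions made for illustration, since the text does not specify how the two variables were combined.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, from 0 (no similarity) to 1 (identical)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order in the two strings.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro-Winkler similarity: Jaro plus a bonus for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def is_probable_match(rec_a, rec_b, threshold=0.95):
    """Assumed rule: identical date of birth and name similarity at or above the threshold."""
    if rec_a["birth_date"] != rec_b["birth_date"]:
        return False
    name_a = rec_a["name"].strip().upper()
    name_b = rec_b["name"].strip().upper()
    return jaro_winkler(name_a, name_b) >= threshold


# Hypothetical records with a transposition typo in the name.
billing = {"name": "Martha", "birth_date": "1960-04-12"}
death = {"name": "MARHTA", "birth_date": "1960-04-12"}
linked = is_probable_match(billing, death)
```

With the classic test pair MARTHA and MARHTA, the Jaro similarity is about 0.944 and the Jaro-Winkler similarity about 0.961, so the pair clears a 0.95 threshold. A stricter or looser threshold trades missed links against false links, which is why the chosen cut-off matters for the share of records matched probabilistically.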

Statistical Methods
Descriptive statistical analyses were conducted, estimating the absolute and percentage frequencies for categorical variables and the means and standard deviations for numerical variables. The analyses were performed using R software version 4.1.2 (64-bit) for Windows and RStudio version 2021.09.1 for Windows.

Results
This version of the municipal registry of people with diabetes in Porto Alegre (July 2018-June 2021) contains the following data for each person: unique identifier number, date of birth, sex, diagnostic code or text (ICD, ICPC-2, medication, or procedure code), and the date diabetes was first registered during the study period in the health system databases.
This initial data processing identified 1,007,850 occurrences of diabetes from 8 databases. The deduplication process yielded 73,185 unique individuals with diabetes. Of these, 33,050 were observed in 1 database, 24,755 in 2 databases, and 15,380 in 3 or more databases. Regarding the identification sources, 37,865 individuals were identified based on antidiabetic medication dispensed; 26,237 on primary care consultations; 3437 from the mortality registry; 2478 through specialized consultations; 1460 via specific laboratory exams; 1297 from hospital admissions or billing; and 411 from urgent care consultations (Figure 1). Considering the estimated population of Porto Alegre in 2020 to be 1.33 million, this represents 5.5% of the population over the three-year period.
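The reported counts can be checked with simple arithmetic. The short Python sketch below uses only figures stated in this section; the variable names are illustrative.

```python
# People identified, by the source of first identification (figures from the text).
by_source = {
    "medication dispensing": 37865,
    "primary care": 26237,
    "mortality": 3437,
    "specialized consultations": 2478,
    "laboratory exams": 1460,
    "hospital admissions or billing": 1297,
    "urgent care": 411,
}

# People identified, by the number of databases in which they appear.
by_database_count = {"1": 33050, "2": 24755, "3 or more": 15380}

total_people = sum(by_source.values())
population = 1_330_000  # estimated population of Porto Alegre

# Registry coverage as a percentage of the municipal population.
coverage_pct = 100 * total_people / population

# Share of the registry contributed by each source, in percent.
share_by_source = {name: 100 * n / total_people for name, n in by_source.items()}

print(f"Total people: {total_people}")
print(f"Coverage: {coverage_pct:.1f}%")
```

Both breakdowns sum to the 73,185 unique individuals, and the coverage rounds to the 5.5% reported.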

J. Clin. Med. 2024, 13

The registry was designed to allow future processing and extraction to obtain additional information regarding preventive actions (e.g., ophthalmological evaluation, diabetic foot assessment, tracking of creatinine and microalbumin), smoking habits, control measures (HbA1c, glucose level, blood pressure, lipid profile), treatment profile (use of oral medications, insulin use, nutritional monitoring), complications (hypoglycemia, diabetic retinopathy, diabetic nephropathy, diabetic neuropathy, peripheral arterial disease, amputations, hepatic steatosis/fibrosis/cirrhosis, heart diseases, stroke), and both overall and specific mortality. Table 2 shows the feasibility of gathering this information for future analyses based on the 39 items recommended by the International Consortium for Health Outcomes Measurement (ICHOM) proposal for data to be included in population registries for diabetes [23]. Among the 39 proposed items, 31 (79.5%) can be gathered through further data processing and extraction. Of these, 19 items are currently fully feasible: 15 are immediately available for use, and 4 (body mass index, fasting glucose, blood pressure, and diabetic nephropathy) require further data processing (data engineering, extraction, or cleaning). Partial feasibility is defined for 12 (30.7%) items, since we may identify the occurrence but not the detail or result. Most (n = 9) are related to laboratory results, as we currently only have an indication that a test was requested. The Municipality is integrating with clinical laboratories to receive test results, which will then be available for incorporation into the registry. The other three items with partial feasibility (tobacco use and findings from ophthalmologic and foot examinations) will require improvement in the information systems to record more than the fact that an assessment was performed.
For the remaining eight, we have no primary data source (schooling, date of diagnosis, alcohol consumption, physical activity, lifestyle management, nutritional advice, perception of well-being, and depression score); their inclusion will depend on expanding current information systems.It is possible, for example, to obtain a diagnosis of depression with further data extraction.
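The item counts reported for the ICHOM set can be tallied as follows. This Python sketch uses only the counts stated in the text; the category labels are illustrative wording, not the column headings of Table 2.

```python
# Number of ICHOM items in each feasibility category, as reported in the text.
feasibility = {
    "fully feasible, immediately available": 15,
    "fully feasible, needs further processing": 4,
    "partially feasible, laboratory results pending": 9,
    "partially feasible, other examinations and tobacco use": 3,
    "no primary data source": 8,
}

total_items = sum(feasibility.values())

# Items that are fully or partially obtainable from the existing databases.
obtainable = total_items - feasibility["no primary data source"]
obtainable_pct = round(100 * obtainable / total_items, 1)

fully_feasible = sum(n for label, n in feasibility.items() if label.startswith("fully"))
partially_feasible = sum(n for label, n in feasibility.items() if label.startswith("partially"))

print(f"{obtainable} of {total_items} items obtainable ({obtainable_pct}%)")
```

The categories add up to the 39 proposed items, of which 19 are fully feasible, 12 partially feasible, and 8 currently without a data source.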

Discussion
The municipal registry of people with diabetes in Porto Alegre identified 73,185 cases in this first round of data processing (July 2018 to June 2021), almost all receiving care through the SUS. Most were ascertained through primary care encounters or the medication dispensing process. The registry permitted the organization of the available data to assess the burden of diabetes, its clinical management, and its complications.
An initial consideration pertains to the quality of the constructed registry, its comparability (adherence to definitions of diabetes used in registries elsewhere), temporal scope (availability of data throughout the analysis period), and completeness of the information gathered (here, both coverage of those with diabetes and the extent of relevant additional data) [11,24].
The identification strategy we used is comparable to those widely used [25-27], involving (a) diagnosis of diabetes according to international code standards, (b) use of medications specific (or nearly so) for the treatment of diabetes, and (c) procedures for the treatment of diabetes complications according to national codification systems. This process for identifying cases of diabetes has not yet undergone a formal validation study. However, our use of standardized approaches, considering that we only assessed structured or pre-coded data, should have produced consistent and valid information [24]. Previous studies have shown high consistency in the use of the international classification of diseases to identify cases of diabetes [28]. However, information on the use of medications can bring some degree of inconsistency [29], such as the use of metformin to treat prediabetes or other medical conditions. This fact indicates the relevance of future complementary investigations to assess the consistency and accuracy of the assessment of diabetes through recorded medication codes. Analyzing a sample of the original records against the established clinical gold standard investigated through chart review is a further step we plan to undertake to identify our rate of false positives. A capture-recapture analysis should provide an idea of the frequency of missed cases of diabetes.
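To make the planned capture-recapture analysis concrete, the following Python sketch applies a two-source estimate using Chapman's bias-corrected form of the Lincoln-Petersen estimator. The choice of estimator, the function name, and the counts are hypothetical and chosen only for illustration; they are not registry results, and the planned analysis may use a different, multi-source method. The estimate assumes that the two sources capture cases independently and that records are linked without error.

```python
def chapman_estimate(n_source_a, n_source_b, n_both):
    """Chapman's estimator of total population size from two overlapping sources.

    n_source_a: cases identified by source A
    n_source_b: cases identified by source B
    n_both:     cases identified by both sources
    """
    if n_both < 0 or n_both > min(n_source_a, n_source_b):
        raise ValueError("n_both must lie between 0 and the smaller source count")
    return (n_source_a + 1) * (n_source_b + 1) / (n_both + 1) - 1


# Hypothetical counts, not taken from the registry.
cases_in_a = 600
cases_in_b = 500
cases_in_both = 300

estimated_total = chapman_estimate(cases_in_a, cases_in_b, cases_in_both)
observed_total = cases_in_a + cases_in_b - cases_in_both
estimated_missed = estimated_total - observed_total

print(f"Observed: {observed_total}, estimated total: {estimated_total:.0f}, "
      f"estimated missed: {estimated_missed:.0f}")
```

In this hypothetical case, 800 distinct people are observed across the two sources and the estimator suggests roughly 1000 in total, so about one in five cases would have been missed by both sources.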
The registry's temporal scope is currently limited to a relatively short period, making it challenging to analyze temporal trends. We are increasing the years covered in the registry moving forward, thus augmenting the follow-up time necessary to mount an inception diabetes cohort, i.e., a cohort in which participants can be identified based on the care of their diabetes and not by the occurrence of diabetes complications.
In terms of completeness, three aspects merit mention. The first is the completeness and representativeness of the identified population. Except for the national mortality information system database, the databases were only generated from SUS-provided health services. We did not have access to data generated from private healthcare. In 2019, an estimated 49% of people living in Porto Alegre relied exclusively on the SUS for their healthcare, corresponding to about 652,000 people [30]. Many who did have private plans had ones with limited coverage and continued to use the SUS for other aspects of care. For example, very few private plans cover the cost of medications, and the SUS provides free essential diabetes medications, including insulin, at clinics and through private pharmacies.
The second is the completeness of the unique personal identification number in each database. A small percentage (5.4%) of records, those of the hospital billing and mortality databases, lacked this identifier, and information about the corresponding events was incorporated into the registry via probabilistic matching. We consider this an acceptable percentage for this first version of the registry. In the foreseeable future, these two national databases will likely switch to include the national unique identification number as a strategy for univocal identification of all public and private health service users [31-33].
The third aspect of completeness is related to data captured to characterize people with diabetes. Following the International Consortium for Health Outcomes Measurement (ICHOM) proposal for data to be included in population registries for diabetes [23], as shown in Table 2, the feasibility of using this registry for future analyses is good, as it permits access to 79.5% of the 39 proposed items.
Our experience provides important contributions to those seeking to undertake similar health system surveillance registries. From an operational perspective, our registry would not be possible without the development of information systems covering the broad range of activities of the SUS in Porto Alegre. Additionally, our access to a unique identifier in the databases simplified and accelerated the process of pairing and the construction of the registry, reducing the computational burden and effort associated with harmonizing the identification data, as already demonstrated in other population registries of diabetes [34,35]. From a strategic point of view, this report demonstrates the feasibility of using real-world secondary data to build new sets of information on the health status of people and populations in a geographically and temporally delimited middle-income country context. A discontinued initiative to create a Brazilian national hypertension and diabetes registry from primary data rather than secondary data like ours likely failed mainly due to the additional burden on primary care personnel of double data entry [36], which is not a problem in our approach. Furthermore, the creation of the registry was only possible through a collaborative effort involving the SUS's local managers and a local university's postgraduate program. This combination produced a critical mass of personnel sufficiently well-versed in diabetes, epidemiology, the workings of the SUS, and information technology. Figure 2 outlines the main steps initially taken to build this registry and the future data processing and extraction contemplated for specific needs and interests, notably involving items for which data gathering is entirely feasible.
Considering the continuous improvement and increased availability of data, a range of potential new uses can be outlined for the Porto Alegre registry. In the short term, it will be possible to build time series and measure the incidence and prevalence of diabetes, risk factors, acute and chronic complications, and mortality [37-39]. In the medium term, monitoring the quality of care and adherence to treatment protocols will be possible as soon as laboratory test results are available. These results will complement opportunities for quality monitoring based on the already available datasets on medications and the use of health services [40-42]. Identifying situations of insufficient or inadequate care and alerting professionals and users through digital solutions will also be possible [43-45]. In the medium to long term, it will be possible to pair the diabetes registry data with geospatial and socioeconomic data to identify areas of the city of greater epidemiological relevance for strengthening local health services and actions [46,47]. The information produced makes it possible to immediately characterize the population identified with diabetes and thus provide tools for surveillance and epidemiological monitoring of chronic non-communicable diseases at the local level. Finally, the registry's construction also opens up possibilities for elaborating additional linked studies on various other diseases and conditions of interest to public health.
It is important to note possible limitations. Many of these are inherent to observational and, especially, secondary data. As already pointed out, the quality of the information produced directly depends on the consistency and completeness of the primary data. Even using structured data, the variability in classification criteria among thousands of health professionals is a limiting factor that generates imprecision in the information. Another limiting aspect is the restriction of the data obtained, which, except for mortality, only comes from the public health system. Although the Brazilian health system provides universal coverage, it does not cover the more expensive medications except in specific cases (e.g., SGLT2i), and procedures (e.g., elective surgeries) frequently have long waiting lists, which may lead patients to pay for medications and procedures in the private sector. Data on the use of these medications and the existence of these procedures are not captured by our databases. These restrictions may prevent a direct extrapolation to the entire local population. As mentioned above, future capture-recapture analyses can estimate the size of the deficit caused by this and other reasons and permit extrapolation to provide estimates of the diabetes prevalence, incidence, and rates of complications. Since diabetes is a disease with slow progression and long duration, the registry's 3-year span currently limits the investigation of the incidence of outcomes. This limitation, however, will diminish over time. Finally, the sustainability of the registry will require further input of resources. In this regard, a combined effort of the Porto Alegre SMS and the Universidade Federal do Rio Grande do Sul is in motion to create a sound and sustainable foundation for future work.

Conclusions
The Porto Alegre registry of people with diabetes uniquely identified a relevant fraction of those who had care or events related to diabetes in the SUS in Porto Alegre. With the matched data across the diverse databases it provides, multiple research and epidemiological surveillance questions can be answered, notably those related to deaths and hospitalizations, which can provide information on diabetes complications. This experience can constitute a model for constructing data repositories capable of answering other questions regarding the problems and scenarios of health systems and services in Brazil and other low- and middle-income countries.

Figure 1. Municipal registry of people with diabetes in Porto Alegre, according to the origin and number of people identified.

Figure 2. Steps used in the construction of a diabetes registry in Porto Alegre, Brazil (2019-2021), with the perspective of building feasible additions for specific needs of diabetes surveillance and healthcare evaluation.

Table 1. Information systems, databases, main variables, and criteria applied to identify people with diabetes.
* Data available only since January 2020.

Table 2. Current feasibility and availability of items in the diabetes population registry of Porto Alegre recommended for inclusion by the modified International Consortium for Health Outcomes Measurement.