Open Government Data Use in the Brazilian States and Federal District Public Administrations

Abstract: This research investigates whether, why, and how open government data (OGD) is used and reused by Brazilian state and district public administrations. A new online questionnaire was developed and used to collect data from 26 of the 27 federation units between June and July 2021. The resulting dataset was cleaned and anonymized. It covers 158 parameters for each of the 26 federation units surveyed. This article describes the questionnaire metadata and the methods applied to collect and treat the data. The data file is divided into four sections: respondent profile (identifying the respondent and their workplace), OGD use/consumption, what OGD is used for by public administrations, and why OGD is used by public administrations (benefits, barriers, enablers, and drivers of OGD use/reuse). The results provide the state of play of OGD use/reuse in the federation units' administrations. Therefore, they could be used to inform open data policy and decision-making processes. Furthermore, they could be the starting point for discussing how OGD could better support the digital transformation in the public sector.


Introduction
Public institutions produce, collect, and aggregate vast amounts of data and publish it as open (government) data [1]. According to Open Definition, open data can be freely used, reused, modified, and shared by anyone for any purpose. Given its potential, open data provides opportunities for governments worldwide to implement some of their digital transformation processes [2].
Furthermore, open government data (OGD) has the potential to improve operational efficiency, support evidence-based and data-driven policymaking, and increase transparency, accountability, civil participation, and trust in government. Despite its benefits, OGD has not been extensively adopted in public sector organizations, particularly in developing countries [3]. The literature provides insights into the barriers to OGD adoption, sharing, use, and reuse. Through a global survey, Zuiderwijk and Reuver [4] identified seven barriers to OGD initiatives: functionality and support; inclusiveness; economy, policy, and process; data interpretation; data quality and resources; legislation and access; and sustainability. In another study, Crusoe and Melin [5] conducted a systematic literature review to investigate the OGD barriers. The reviewed studies focused on technical, organizational, and legal barrier types, while studies on open data usage and systems were less frequent. Forty-six barriers were mapped to an expanded OGD process (suitability, release, publish, use, and evaluation). The literature also offers insights into determinants that influence OGD adoption in public sector organizations [3].
However, the use of OGD by government is incipient [6], and studies addressing the use of OGD are scarce [3]. Therefore, this project investigates whether, how, and why open government data is used and reused by the Brazilian state and district public administrations. To this end, a survey was developed to collect information from digital government leaders in the public administrations of Brazilian federation units (FUs). The term "use" means data use, reuse, or consumption, as opposed to data adoption, release, or publication.
This research project was executed by the University of Minho (UM) and the United Nations University (UNU-EGOV), with the support of the Digital Transformation Group of States and Federal District (DF)-(GTD.GOV). The GTD.GOV is a national network that gathers specialists in digital transformation from state and district governments across the country. Its mission is to accelerate Brazilian states and district governments' digital transformation [7]. The group was created by the Brazilian Association of State Entities of Information Technology and Communication (ABEP-TIC) and the National Council of State Secretaries of Administration (CONSAD).
ABEP-TIC brings together all state information technology and communication companies in Brazil. It seeks to influence public policies in all spheres of government to promote and strengthen cooperation among its associates. Furthermore, ABEP-TIC fosters administrative modernization to improve the quality and productivity of state government services [8]. In addition, CONSAD congregates the secretaries of state for administration of all 26 Brazilian States and the federal district to exchange experiences and seek creative solutions to improve public management in Brazil [9].
Brazil has been relatively successful in opening data. Given the increasing deployment of digital services, the Brazilian public administrations have produced an increasing set of government data in open format. According to the State Basic Information Survey (Estadic 2019) [10], all federation units have a transparency portal and publish general administration data in a reuse-friendly format and other formats. For example, 84.0% (21) of the 25 FUs published expenditure information in a more reuse-friendly format, while only 18.5% (5) of the 27 did so for accountability of the Fiscal Responsibility Act (LRF).
In addition, a recent survey on digital transformation trends in Brazilian state governments and the federal district [11] indicated that 19 of the 26 Brazilian states have open data portals. However, only eight have legislation that allows the state public administration to share data among its agencies. Since OGD is already published and is an integral part of the digital transformation of public administrations, it is necessary to understand whether, for what, and why OGD is used. Moreover, if OGD is not used, it is necessary to understand the barriers and the factors that could facilitate and drive data use and reuse.
Results provide an overview of OGD use and could inform open data policy and decision-making processes in the states and district public administrations. In addition, they provide the basis for further discussion about how OGD can be used and reused to support the digital transformation of the public sector.
The remainder of this article is organized as follows. First, Section 2 presents the methods applied to collect and treat the data. Next, Section 3 describes the questionnaire. Finally, the results and conclusion are presented in Section 4.

Methods
This project (CEICSH 069/2021) was submitted to and approved by the Ethics Committee for Research in Social and Human Sciences (CEICSH) of the University of Minho (UM) on 22 June 2021. Personal data, such as IP address, emails, gender, and age, were collected. However, these data were anonymized to prevent individual respondent identification.
Our objective is to investigate whether, how, and why the public sector uses open government data. Therefore, given the exploratory, inductive nature of the work, we adopted the survey research method. This method collects data from a representative sample of individuals to describe behaviors, thoughts, and attitudes at a specific place and time. An instrument (questionnaire) composed of closed-ended and open-ended questions was developed as discussed in [12]. A systematic review of the literature, conducted according to [13,14], served to identify research gaps and support the creation of the survey.
From the literature review [15], we collected and systematized the categories of data used and what OGD is used for. We also gathered the OGD use benefits, barriers, drivers, and enablers (BBDE) reported in these studies. The following categories were synthesized to classify BBDE: political and social, legal and public policy, cultural, economic and financial, organizational and institutional, and technical and operational. Finally, questionnaire questions were developed based on the BBDE literature reported in [15], which adopted the abovementioned BBDE categorization. In summary, the review was the foundation of this survey and its online questionnaire. Table 1 presents additional information on the questionnaire creation and application.
Table 1. Additional information about the questionnaire.

Questionnaire Description
Application mode
Unsupervised online tool using LimeSurvey tool Version 3.25.6+201229. Respondents answered the questionnaire by themselves using the LimeSurvey software.

Research of the relevant literature
The uses, benefits, barriers, drivers, and enablers (BBDE) listed are derived from a systematic literature review [15].

Construction of the questionnaire
The new questionnaire consists of 39 questions divided into seven groups. Questions and grouping categories were derived from a review of mainly English literature. Then, the questionnaire was developed in the LimeSurvey software, which supports the creation of multilingual surveys. The questionnaire was created in English and then translated to Portuguese. Therefore, two versions of the same questionnaire exist, one in English and another in Portuguese. Metadata is available in both languages. Note that respondents answered the Portuguese version of the questionnaire. Therefore, data was collected in Portuguese. The English version is used to report results.

Pretest
The questionnaire in Portuguese was applied to a sample of five people working with OGD (faculty members in higher education and master level students in Brazil) between April and May 2021. As a result, questions were improved for language use and content. In addition, the logic flow was adjusted before the final application based on pretest results and suggestions. The English version was also adjusted to match the tested version.

Sampling method
Non-probabilistic [16]. The GTD.GOV indicated their focal points: Brazilian state and district administration secretaries, managers responsible for the open data area in the state administration, and state and district public managers who use OGD. The "snowball" strategy [17], which asks respondents to indicate additional participants, was also used.

Sample size
The GTD.GOV selected 49 state and district digital transformation officials, who were invited by email to contribute to the survey. A total of 61 responses were collected. Thirty responses were incomplete and were removed from the sample. Thirty-one responses were complete (the last page column equals 7, the number of pages of the questionnaire). All 27 federation units were invited to participate in this research, and 26 answered the questionnaire. The exception was the State of São Paulo, which declined to respond to the questionnaire. Consequently, the dataset does not contain any data for São Paulo. This survey collected data from several State secretariats and agencies, such as administration, planning, information technology, and internal control. Their responses offer the state perspective, as these bodies lead the State's digital transformation and are responsible for open data policy. In the case of states with more than one response, the record from the Secretariat of Administration or Planning was selected because these secretariats are responsible for the open data policy. They are also the loci of CONSAD focal points. Moreover, in the case of more than one complete response per respondent, only the most recent one (the newest date in the date last action column) was kept. Therefore, this sample contains 26 records, one response per federation unit, each representing the answer of the FU.
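The selection rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual procedure: the field names (`fu`, `respondent`, `secretariat`, `lastpage`, `last_action`), the keyword matching used to recognize Administration or Planning secretariats, and the sample records are all hypothetical.

```python
from datetime import datetime

TOTAL_PAGES = 7  # number of questionnaire pages stated in the text
PREFERRED_WORDS = ("administration", "planning")  # hypothetical matching rule


def is_preferred(record):
    """True if the record comes from an Administration or Planning secretariat."""
    name = record["secretariat"].lower()
    return any(word in name for word in PREFERRED_WORDS)


def select_records(responses):
    """Return one record per federation unit, keyed by a (hypothetical) FU code."""
    # Step 1: keep only complete responses (the last page was reached).
    complete = [r for r in responses if r["lastpage"] == TOTAL_PAGES]

    # Step 2: per respondent, keep only the most recent complete response.
    latest = {}
    for r in complete:
        kept = latest.get(r["respondent"])
        if kept is None or r["last_action"] > kept["last_action"]:
            latest[r["respondent"]] = r

    # Step 3: per federation unit, prefer Administration or Planning.
    # Ties keep the first record encountered (an arbitrary choice in this sketch).
    selected = {}
    for r in latest.values():
        kept = selected.get(r["fu"])
        if kept is None or (is_preferred(r) and not is_preferred(kept)):
            selected[r["fu"]] = r
    return selected


# Made-up records for illustration only; they are not taken from the dataset.
responses = [
    {"fu": "FU-A", "respondent": "r1", "secretariat": "Information Technology",
     "lastpage": 7, "last_action": datetime(2021, 6, 12)},
    {"fu": "FU-A", "respondent": "r2", "secretariat": "Secretariat of Administration",
     "lastpage": 7, "last_action": datetime(2021, 6, 15)},
    {"fu": "FU-A", "respondent": "r2", "secretariat": "Secretariat of Administration",
     "lastpage": 7, "last_action": datetime(2021, 6, 20)},
    {"fu": "FU-B", "respondent": "r3", "secretariat": "Internal Control",
     "lastpage": 3, "last_action": datetime(2021, 6, 18)},
    {"fu": "FU-B", "respondent": "r4", "secretariat": "Secretariat of Planning",
     "lastpage": 7, "last_action": datetime(2021, 7, 1)},
]

for fu, record in select_records(responses).items():
    print(fu, record["respondent"], record["last_action"].date())
```

The three steps are kept separate so that each rule from the text (completeness, most recent response per respondent, preferred secretariat per FU) can be checked or replaced on its own.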

Data Cleaning
After applying the questionnaire on the LimeSurvey platform, responses were exported and consolidated in an Excel spreadsheet in a tabular format. Data were collected in Portuguese. Data were cleaned and adjusted so that each response fits in a single row (fields that had been split during the import to Excel, because respondents used commas in text inputs, were regrouped). Metadata in Portuguese and English were also imported and added to the spreadsheet. Then, answers to closed-ended questions were translated to English and placed in a spreadsheet with English metadata. Answers to open-ended questions, which hold text entered by respondents, were not translated to preserve originality and avoid introducing researchers' biases. The dataset is available as a CSV file in an open format [18]. The dataset is composed of five CSV files and one Excel (XLSX) file, as detailed in the Supplementary Material.
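As an illustration of the regrouping step, the sketch below merges the pieces of a free-text answer that was split on commas back into one field. It assumes, hypothetically, that the expected number of columns is known and that exactly one free-text column per row was affected; the function name and sample row are invented for the example. The last lines show that Python's standard `csv` module quotes fields containing commas, which avoids the split in the first place.

```python
import csv
import io


def regroup_row(fields, expected_columns, text_index):
    """Merge the overflow fields of one split free-text column back together."""
    fields = list(fields)
    extra = len(fields) - expected_columns
    if extra < 0:
        raise ValueError("row has fewer fields than expected")
    if extra == 0:
        return fields
    # The free-text answer occupies text_index .. text_index + extra.
    merged = ",".join(fields[text_index:text_index + extra + 1])
    return fields[:text_index] + [merged] + fields[text_index + extra + 1:]


# Made-up row: the third column is a free-text answer that contained commas.
broken = ["12", "Sim", "Usamos dados de compras", " despesas", " e orçamento", "2021-06-15"]
fixed = regroup_row(broken, expected_columns=4, text_index=2)
print(fixed)

# Writing with the csv module quotes the comma-bearing field, so it survives a round trip.
buffer = io.StringIO()
csv.writer(buffer).writerow(fixed)
round_trip = next(csv.reader(io.StringIO(buffer.getvalue())))
print(round_trip == fixed)
```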

Data Description
This section describes the data collected according to the method presented in Section 2. For each column of the data table, it defines what data are contained, their format, and how to read and interpret them.
The aim of this research is to investigate whether, for what, and why the public sector uses/consumes OGD. To collect information, a survey invited Brazilian State and District government leaders to respond to a new online questionnaire composed of four analytical sections, as shown in Table 2. The questionnaire was applied between 10 June and 9 July 2021. The resulting dataset comprises 26 rows, one for each federation unit except the State of São Paulo, which was the only State that declined to participate in the survey. Thus, this dataset represents the perspective of the 26 participating federation units. Responses were exported to Excel in a tabular format. The dataset contents are summarized in Table 3.

Column naming convention
The questionnaire is composed of simple and multiple-choice questions. In addition, some of the questions include an extra field for text comments.
Example: "Q005. In which public context does your institution/agency operate? If municipal or regional, please inform it in the comments field"
QuestionId: Q005
QuestionLabel: "In which public context your institution/agency operates? If municipal or regional, please inform it in the comments field"
CommentLabel: "In which public context your institution/agency operates? If municipal or regional, please inform it in the comments field"
[Comment text]: Holds the text comment
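The naming convention can be made concrete with a small parser. This is a sketch under assumptions: it treats a column name as an identifier, a period, and a label, and treats a trailing bracketed marker starting with "Comment" as the sign of a comment column. The exact spelling of that marker in the dataset files is not confirmed here, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ColumnName(NamedTuple):
    question_id: str
    label: str
    is_comment: bool


# Identifier, period, label, and an optional trailing "[Comment ...]" marker.
_PATTERN = re.compile(
    r"^(?P<qid>\w+)\.\s*(?P<label>.*?)\s*(?P<comment>\[comment[^\]]*\])?$",
    re.IGNORECASE | re.DOTALL,
)


def parse_column_name(name: str) -> Optional[ColumnName]:
    """Split a column name into question id, label, and comment flag."""
    match = _PATTERN.match(name.strip())
    if match is None:
        return None  # the name does not follow the assumed convention
    return ColumnName(
        question_id=match.group("qid"),
        label=match.group("label"),
        is_comment=match.group("comment") is not None,
    )


question = parse_column_name(
    "Q005. In which public context your institution/agency operates?"
)
comment = parse_column_name(
    "Q005. In which public context your institution/agency operates? [Comment]"
)
print(question)
print(comment)
```

Returning `None` for names that do not match keeps the sketch honest about columns that may follow a different pattern.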

Dataset metadata description
Each column is described in the remainder of this section. As the metadata refers to the questionnaire questions, many column names are self-explanatory. Since the dataset is in tabular format, the following metadata is presented: column name, a short description (when needed), datatype, possible data values, and a mandatory/optional indicator. This information is also provided in the dataset in two sheets: Metadata_PT and Metadata_EN. The first sheet provides the metadata in Portuguese, and the second in English.
Metadata is presented in seven tables. Data generated and collected by the system are in Table 4. The remaining tables correspond to data collected using the questionnaire. The respondent profile is listed in Table 5. Table 6 contains data about whether open government data (OGD) are used and what they are used for. Tables 7-10 show, respectively, the questions related to OGD use benefits, barriers, enablers, and drivers. Columns are presented in the same order in which they appear in the dataset.
Table 4 presents information generated by the LimeSurvey software. All columns are mandatory. Note that the column ipaddr. IP address, which holds the IP address of the respondent's device, was anonymized.
Table 5 shows the questions of the respondent profile section. All columns are mandatory, except for the respondent's name, email, and Q005 [comment]. Due to the small size of this sample, it would be possible to identify respondents based on age and gender. Consequently, the following columns were anonymized: Q00. Name, Q001. Email, Q002. Age, and Q003. Gender.
Table 6 describes three blocks of questions: whether data are used, what data are used for, and why these data are not used. Each question corresponds to a column in the dataset. Q100 determines whether OGD is used. Then, if data are used, Q106, Q105, and Q104 explore what data are used for, the categories of data consumed, and the institutions that provide OGD. The order of these questions was reversed to improve the questionnaire logic flow. Lastly, if OGD is not used, Q150 and Q151 investigate the reasons for not using OGD.
Q155 implements the snowball sampling strategy. It asks respondents to provide the email of people who could collaborate with the study if they themselves could not. Q155. [Comment], a text field for the email of additional respondents, was anonymized as it may contain email addresses.
Table 7 presents the metadata of questions related to the benefits of using open government data (OGD) in the public sector. Benefits were grouped into three categories: political and social (BEPS), economic (BEE), and operational and technical (BEOT). Additionally, BEOTHER indicates the benefits suggested by the respondent. The last question in this table, Q160, investigates the negative impacts or effects of using OGD in the public sector.
Table 8 displays the metadata of questions on barriers to OGD use in the public sector. Barriers were categorized as cultural (BAC), economic and financial (BAEF), policy and legal (BAPL), organizational and institutional (BAOI), and operational and technical (BAOT). Additionally, BAOTHER indicates barriers suggested by the respondent.
The questions about enablers of OGD use in the public sector are presented in Table 9. They were grouped into the same categories listed for OGD use barriers, i.e., cultural (EC), economic and financial (EEF), policy and legal (EPL), organizational and institutional (EOI), and operational and technical (EOT). Additionally, EOTHER holds enablers suggested by the respondent.
Drivers of OGD use in the public sector are listed in Table 10. As with the previous questions, they were categorized into organizational and institutional (DOI), political and social (DPS), and operational and technical (DOT). Additionally, DOTHER holds drivers suggested by the respondent. The final considerations section (CF01) asks whether the respondent wants to comment on OGD use.
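The category codes listed above lend themselves to a simple lookup. The sketch below maps each code to its group and category and classifies a column identifier by its longest matching prefix, so that BEOTHER is not mistaken for BEOT. How the dataset forms full column identifiers from these codes (for example, a trailing number as in the hypothetical `BAPL03`) is an assumption.

```python
# Codes and categories as listed in the text for Tables 7-10.
CATEGORY_CODES = {
    "BEPS": ("benefit", "political and social"),
    "BEE": ("benefit", "economic"),
    "BEOT": ("benefit", "operational and technical"),
    "BEOTHER": ("benefit", "suggested by the respondent"),
    "BAC": ("barrier", "cultural"),
    "BAEF": ("barrier", "economic and financial"),
    "BAPL": ("barrier", "policy and legal"),
    "BAOI": ("barrier", "organizational and institutional"),
    "BAOT": ("barrier", "operational and technical"),
    "BAOTHER": ("barrier", "suggested by the respondent"),
    "EC": ("enabler", "cultural"),
    "EEF": ("enabler", "economic and financial"),
    "EPL": ("enabler", "policy and legal"),
    "EOI": ("enabler", "organizational and institutional"),
    "EOT": ("enabler", "operational and technical"),
    "EOTHER": ("enabler", "suggested by the respondent"),
    "DOI": ("driver", "organizational and institutional"),
    "DPS": ("driver", "political and social"),
    "DOT": ("driver", "operational and technical"),
    "DOTHER": ("driver", "suggested by the respondent"),
}

# Longest codes first, so "BEOTHER" is tested before its prefix "BEOT".
_CODES_BY_LENGTH = sorted(CATEGORY_CODES, key=len, reverse=True)


def classify(column_id):
    """Return (group, category) for a column id, or None if no code matches."""
    upper = column_id.upper()
    for code in _CODES_BY_LENGTH:
        if upper.startswith(code):
            return CATEGORY_CODES[code]
    return None


print(classify("BAPL03"))   # hypothetical identifier with a numeric suffix
print(classify("BEOTHER"))
print(classify("Q100"))
```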

Results, Limitations, and Conclusions
This research aims to answer whether, how, and why OGD is used in the Brazilian state and district public administrations. Figure 1 shows that 26 out of the 27 federation units participated in the survey. Only the State of São Paulo did not respond to the questionnaire. Moreover, the map indicates that 16 States use/reuse OGD. The respondent of one State (Minas Gerais) reported not knowing whether OGD was used. The remaining nine federation units reported that OGD was not used. Therefore, 61% of the States' administrations use OGD, 35% do not use it, and 4% do not know whether these data are used. Regarding how OGD is used, we conclude that data are mainly used to support decision-making, create/improve public services, and analyze data (create analyses, forecasts, estimates, simulations, models). The most popular OGD categories, mentioned by 63% of respondents, are procurement and bids, expenses, and budget. The most cited data sources were the state secretariats and the Brazilian Institute of Geography and Statistics (IBGE).
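For readers who want to reproduce the shares from the counts given above (16, 9, and 1 of 26 participating federation units), a minimal calculation follows. The dictionary keys are labels chosen for this sketch. To one decimal place the shares are 61.5%, 34.6%, and 3.8%, which the text reports as whole percentages.

```python
# Counts taken from the text; the key labels are illustrative.
counts = {"uses OGD": 16, "does not use OGD": 9, "does not know": 1}

total = sum(counts.values())  # 26 participating federation units
shares = {answer: 100 * n / total for answer, n in counts.items()}

for answer, share in shares.items():
    print(f"{answer}: {share:.1f}%")
```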
The question of why OGD is used was addressed in terms of what OGD use benefits, barriers, facilitators, and drivers are. The most prominent benefits reported were increased transparency, more informed citizens, increased efficiency of the administration, and a more informed decision-making process. The most significant OGD use barriers were "the administration and public managers do not know what open data are" and "the absence of an organizational culture favorable to open data". Regarding facilitating factors, the most relevant facilitators were "data were improved" and "the existence of a cooperative work culture". Finally, the most important drivers were "data are available in easy-to-use formats" and "external stakeholders (international bodies, other agencies, journalists) pressure the administration to use the open data".
It is worth noting that the reported results present a partial view of the use of OGD, as only one response per federation unit was considered. Thus, other secretariats and state agencies should be surveyed to acquire a broader view of OGD use and reuse in the Brazilian state and district administrations. However, public managers can use these results as a starting point to inform decision-making regarding open data policy and digital governance.
Furthermore, the usefulness of this dataset could be verified by developing indicators to measure its utilization rate using data collected by the data repository about downloads and reads.
Supplementary Materials: The following supporting information is available online at https://doi.org/10.34622/datarepositorium/UY7MFA, Table S1: Information about the dataset, Table S2: Dataset in Portuguese, Table S3: Metadata in Portuguese, Table S4: Dataset in English, Table S5: Metadata in English, and Table S6: Excel spreadsheet with all data and metadata.

Informed Consent Statement:
The following statement was presented to respondents to obtain their consent to participate in the survey: "Please read this consent form carefully before deciding to participate in the study. Time required: It takes around about 20 minutes to answer this questionnaire. Risks: The risks associated with participating in this study are minimal. Benefits: The benefits associated with this study are related to acquiring new knowledge and insights about the processes and procedures adopted by the Brazilian states and federal district public administrations in using open data to promote greater transparency and efficiency using these data. Confidentiality: The University of Minho (UM), based in R. 4710-057, Braga Portugal, under the General Data Protection Law of Brazil (LGPD), Law 13.709/2018, collects the personal data requested in this form to complement the analysis of how open government data are used by the public sector, and ensures that it will use this information exclusively for this purpose and after fulfilling the purposes the data will be erased. UM has adopted the best practices and organizational techniques to protect your personal data, as well as guarantee the exercise of your rights of access, rectification, and opposition, through the email: ikawashi@gmail.com. I authorize the use of the data provided herein in the context of the study of the use of public sector open government data that is being conducted by UM and UNU-EGOV in collaboration with the GTD.GOV. Voluntary Participation: Your participation in the study is entirely voluntary".

Conflicts of Interest:
The authors declare no conflict of interest.