Municipalities in the Czech Republic—Compilation of “a Universal” Dataset

There have been many changes in the spatial composition and formal delimitation of administrative boundaries of Czech municipalities over the past 25 years. Many municipalities have changed their official status: they split into more independent units, merged with existing ones, or had their boundaries formally redrawn due to advances in mapping technology. Such changes have made it almost impossible to analyze and visualize the temporal development of selected socioeconomic indicators in a spatially coherent and time-comparable way. In this data description, we present the evolution of a unique (geo)dataset comprising the administrative borders of Czech municipalities. Its uniqueness lies in temporally and topologically justified spatial data, resulting in a common division of administrative units at the LAU2 level that is valid from 1995 to 2019. Besides the topologically correct spatial representations of municipalities in Czechia, we also provide correspondence tables for each year in this period, which allow tabular statistics to be joined to the spatial data. The dataset is available as a base layer for further temporal and spatial analyses and visualization of various socioeconomic statistical data.


Summary
The primary motivation for compiling the presented datasets comes from the needs of a project (spatial differentiation and visualization of geodemographic processes, with a focus on households in an aging society in the Czech Republic) dealing with spatial differentiation and visualization of geodemographic processes in Czechia over the last 25 years. The project goal is to explore various geodemographic processes at a very detailed level, specifically at the level of local administrative units 2 (LAU2) within the Nomenclature of Territorial Units for Statistics (NUTS), commonly used in the European Union for statistical purposes [1]. While most national statistical data, especially those coming from censuses, are published for clearly defined administrative units [2], spatial analyses and visualization require these units in the form of spatial data. It is not complicated to obtain the current spatial data of administrative boundaries and the corresponding statistical indicators for a selected year. However, exploring trends over a longer period (e.g., five, ten, or twenty years) makes the preparation and harmonization of spatial and statistical data a very complex task. When we confronted our project vision with the actual state of data quality, we encountered key issues that we had to solve.

Table 1. Official number of municipalities in Czechia by year (1995-2019).

Year  Municipalities    Year  Municipalities
1995  6232              2008  6249
1996  6233              2009  6249
1997  6234              2010  6250
1998  6242              2011  6251
1999  6244              2012  6251
2000  6251              2013  6253
2001  6258              2014  6253
2002  6254              2015  6253
2003  6249              2016  6258
2004  6249              2017  6258
2005  6248              2018  6258
2006  6248              2019  6258
2007  6249              -     -

During data processing, we identified four broad problems, which many researchers share.
These problems may arise when dealing with administrative boundaries in general:
(1) Splitting features: administrative units were divided over time. Table 1 shows that the number of administrative units increased from 6232 in 1995 to 6258 in 2019, an absolute difference of 26 units. This number is small relative to the total of more than 6000 units; however, changes were detected in more than 70 cases over the period due to other discrepancies described below. This first problem, which requires identification and recoding of administrative units, is illustrated in Figure 1.
(2) Merging features: administrative units were merged over time. Although the changes in the absolute number of municipalities (Table 1) were not dramatic, we have to keep in mind that the division of administrative units (mentioned in point 1) was blurred by the merging of other ones. The merged administrative units therefore had to be found and the spatial data corrected in order to maintain a "common denominator" principle. Moreover, we had to select the ID code to be kept in the data. The second problem is illustrated in Figure 2.
(3) Re-indexing features: administrative units changed their ID codes. In several cases, municipalities changed their official status (due to changes in the systematic classification of settlement units) and had to be recoded (Figure 3).
(4) Topology mismatch: the spatial detail of administrative boundaries changed. As mentioned before, the spatial data were provided at two different levels of detail in terms of administrative boundary precision (Figure 4), which complicated the spatial treatment of the data. For instance, because the boundary topology did not correspond, some of the originally intended geographical information system (GIS) tools could not be applied (e.g., the "select by location" tool), and "spatial join" remained the only option. In general, any spatially based calculations, such as areal measurements or choropleth map creation (when the area is needed), would be inaccurate. Therefore, this problem had to be solved as well.
Data 2020, 5, x FOR PEER REVIEW 4 of 14
Although all of the problems mentioned above concerned approximately 70 municipalities, they could not be ignored. First, in several cases, they involved large towns and cities or military areas. Therefore, their exclusion from the dataset would diminish the quality of the data and consequent analysis and visualization. Second, the geodemographic dataset intended for joining with spatial data was obtained from the Czech Statistical Office, with extremely detailed characteristics about individual inhabitants. For that reason, all of the calculations and aggregations to administrative units had to be performed with 100% precision (in terms of total counts of inhabitants). Moreover, several specific "spatial" problems had to be treated individually (see details in Section 3.1).
To summarize the final output, we prepared a dataset consisting of the spatial representation of aggregated administrative units valid for a period from 1995 to 2019, and the non-spatial part in the form of the correspondence table, where "old" (former) and "new" (based on aggregation) ID codes of municipalities are listed. We work-titled this dataset the "universal" or "superlayer" (we will use "universal" for the rest of the manuscript).
The dataset allows users to take statistical data from any year in the period and link them with the spatial representation of administrative boundaries. The most significant advantage of the universal is that users can (1) compute derived indexes from any statistical data from 1995 onwards, and (2) analyze time-variability and trends of such data without further need of spatial data curation. An example combining both benefits is shown in Figure 5, which depicts the overall trend of the vital index [12] in Czech municipalities from 1995 to 2018. Without the universal layer, this would not be possible without losing valuable information about some municipal units (by excluding them from the map). Analogously, year-to-year comparisons can be performed simply by joining statistical data for the given years with the correspondence table and subsequently with the spatial data (see User Notes in Section 4).
Similarly, other phenomena (data and layers) can be treated in the way we present in this paper, e.g., areas of geological regions, climatic types, and catchment areas. Moreover, we used a retrospective approach, which was challenging because current archiving and metadata instructions and tutorials differ from those of 15-20 years ago (if there were any at the time) [13]. Therefore, the presented universal dataset can save other researchers, regardless of their topic, a significant part of the time they would otherwise spend on data processing.

Data Description
As indicated in the previous chapter, we created a unique dataset (universal) that is ready for use by other researchers exploring Czech statistical data in a geographical or spatial context. The dataset consists of two parts: spatial and non-spatial data. Both parts are needed to link the universal properly with any other statistical data. We keep this data description simple, since the most important methodological and technical details are elaborated in the Methods section of this paper.

The Spatial Part of the Dataset
All spatial data treatment was performed in the ArcGIS Pro environment; therefore, the primary data format was the Esri geodatabase. To expand the target user group, we created an open data portal (see description in Section 2.3 and methodological background in the Methods chapter), where the data can be downloaded in other formats (Esri shapefile, Geo JavaScript Object Notation (GeoJSON), TopoJSON, Keyhole Markup Language (KML)). All formats are compatible with standard GIS software tools. The resulting universal layer is in a vector format using polygons as the primary geometry representation of administrative boundaries. In total, the layer contains 6215 records, equal to the number of "common denominator" units representing administrative boundaries (Figure 6). The coordinate system set for the data is S-JTSK_Krovak_East_North (EPSG code 5514). Attributes are reduced to the most necessary ones:
• Shape geometry (Shape);
• Name of a municipality (Name);
• New ID code of a municipality (ID_code_n);
• Shape Length (Shape_Length);
• Shape Area (Shape_Area).
The spatial part of the dataset is a result of the geospatial analysis described in the Methods chapter. The initial provider of the administrative boundaries layer and their centroids is the Czech Office for Surveying, Mapping, and Cadastre (ČÚZK, https://www.cuzk.cz/en).
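Because the layer is stored in a projected coordinate system with metric units, an area attribute such as Shape_Area can in principle be cross-checked from polygon vertices. The following is only a minimal sketch of the planar shoelace formula; the function names and the sample coordinates are hypothetical, and this is not a description of how ArcGIS Pro computes the attribute.

```python
def ring_area(ring):
    """Planar area of a ring given as a list of (x, y) tuples (shoelace formula)."""
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_area(exterior, holes=()):
    """Area of a polygon: exterior ring minus any interior rings (holes)."""
    return ring_area(exterior) - sum(ring_area(h) for h in holes)


# Hypothetical 2 km x 1 km rectangle with coordinates in metres
# (negative values only mimic the look of S-JTSK coordinates).
rectangle = [
    (-500000.0, -1100000.0),
    (-498000.0, -1100000.0),
    (-498000.0, -1099000.0),
    (-500000.0, -1099000.0),
]
# Hypothetical 100 m x 100 m hole inside the rectangle.
hole = [
    (-499500.0, -1099600.0),
    (-499400.0, -1099600.0),
    (-499400.0, -1099500.0),
    (-499500.0, -1099500.0),
]

area_m2 = polygon_area(rectangle)
area_with_hole_m2 = polygon_area(rectangle, [hole])
print(area_m2, area_with_hole_m2)
```

Such a check is only meaningful on the projected coordinates; formats that store geographic coordinates in degrees would need reprojection first.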

Non-Spatial Part of the Dataset
The non-spatial part of the dataset is a correspondence table for correct ID code assignment. The table serves as a converter between the original municipal codes and the newly assigned ones resulting from the aggregation. The records in the table allow the user to transform any Czech statistical data at the municipal level (LAU2) from 1995 to 2019 into the resulting 6215 units that correspond with the spatial part of our universal dataset. Note that the 6215 features are a result of data curation and represent common denominator units; therefore, the number differs from the numbers in Table 1 (showing official municipality counts). Although the creation of the non-spatial part was more demanding than that of the spatial part, the final output is straightforward to use. The non-spatial part is in the form of a table. The table itself is in a single Excel workbook containing 25 sheets, each representing one year from 1995 to 2019 (in the case of CSV and TXT formats, individual files are created for each sheet/year). The number of records varies in each sheet based on the total number of former/original municipalities (see Table 1). Thus, one row in the table relates to one municipality (LAU2) unit. All sheets contain only two columns (Figure 7):
• Original code of a municipality as provided (ID_code_original);
• Newly assigned municipality code (ID_code_new).
The non-spatial part of the dataset resulted from time-consuming individual edits and validation of municipal ID codes in each year. The original ID code refers to the actual situation in a given year; the newly assigned ID code carries the information about the aggregated unit ("common denominator"). Original ID codes were provided by the Czech Statistical Office (CZSO, https://www.czso.cz/csu/czso/home).
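The intended use of the two-column correspondence table can be sketched as follows. This is a minimal illustration assuming the CSV variant of one yearly sheet; the municipality codes, the population figures, and the function names are hypothetical, and only the column names ID_code_original and ID_code_new come from the description above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical correspondence sheet for a single year.
CORRESPONDENCE_1995 = """ID_code_original,ID_code_new
500011,500011
500022,500011
500033,500033
"""

# Hypothetical statistics for the same year, keyed by the original code.
POPULATION_1995 = """ID_code_original,population
500011,1200
500022,300
500033,4500
"""


def load_mapping(text):
    """Read the correspondence table into {original code: new code}."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["ID_code_original"]: row["ID_code_new"] for row in reader}


def aggregate(stat_text, mapping, value_field):
    """Sum a statistic over the aggregated ("common denominator") units."""
    totals = defaultdict(int)
    unmatched = []  # original codes missing from the correspondence table
    for row in csv.DictReader(io.StringIO(stat_text)):
        code = row["ID_code_original"]
        if code not in mapping:
            unmatched.append(code)
            continue
        totals[mapping[code]] += int(row[value_field])
    return dict(totals), unmatched


mapping = load_mapping(CORRESPONDENCE_1995)
totals, unmatched = aggregate(POPULATION_1995, mapping, "population")
print(totals, unmatched)
```

The aggregated totals can then be joined to the spatial layer through its ID_code_n attribute. Summation suits counts such as population; rates or indexes would need to be recomputed from their aggregated components instead.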

Open Data Portal
Because the data were required to be as widely available as possible, the authors decided to follow an "open data" concept. According to The Open Knowledge Foundation [14], open data is "data that can be freely used, shared and built-on by anyone, anywhere, for any purpose".
Under this concept, data are published on the Internet with no technical or legislative restrictions for users. Data are published from the primary/original source but are made available to users in several formats and open standards, and they should be published to the maximum possible extent. Several tools are available for publishing open data; open data portals are the standardized solution for both publishing and sharing spatially based data. In general, open data portals are catalogues that support spatial data formats from a technological point of view; therefore, a portal is the best solution for publishing unique layers, such as the universal dataset, following the open data concept [15].
ArcGIS Hub (formerly ArcGIS Open Data) is a community platform designed by Esri to share open data with the general public. Esri, a worldwide leader in GIS, provides a comprehensive solution that is fully connected with the ArcGIS platform. It uses Esri Geospatial Cloud, which stores all created pages and all data (an advantage for organizations that do not want to manage data on their own servers). ArcGIS Hub is designed for sharing both spatial and non-spatial data, which can be visualized directly on the platform using maps, tables, graphs, and the like. The general public can download datasets or their filtered parts in various data formats. Besides datasets, the portal allows users to create maps or web applications, search inside ArcGIS platform content, and share content with other members. The portal includes a website for user-friendly access and a map browser. It can also be used for creating simple interactive map applications (map overviews).
After [...] and Web Coverage Services (WCS) for coverages. ArcGIS Hub allows users to set permissions for workgroup members only or for the public.

Methods
This chapter describes the main methodological steps taken during the creation of the universal dataset. The chapter is divided into three sections based on the nature of the data curation. A general overview of the data curation flow is depicted in Appendix A.

Treatment of the Data from a Spatial Perspective
Because the main goal was to compile a dataset for the analysis and visualization of geodemographic indicators over time, the study started with obtaining the spatial data of Czech municipalities (LAU2). As mentioned above, spatial data were missing for some of the earlier years of the observed period (1995-2019). Therefore, gap filling started with the year 1996, going one year backwards (1995) and then onwards through the following years. In doing so, we had to check the changes between years using the CZSO code list and manually change the administrative boundaries accordingly if municipalities merged. Where municipalities split, we took the split into account and recorded the ID codes in the correspondence table. This procedure was repeated consecutively until the year 2019. It ensured that the administrative boundaries (polygons) remained spatially constant throughout the observed period (1995-2019).
Additionally, we received the official data from ČÚZK from 2009 onwards in finer spatial detail (see differences in Figure 8a), which forced us to decide which spatial data to use. We decided to keep the former (2008 and earlier), more generalized administrative boundaries because the final visualizations were intended to display the whole of Czechia. The finer detail of administrative boundaries would otherwise cause additional cartographic problems in map-making (e.g., rendering errors in the final map due to the output map scale).
As mentioned in the Summary chapter, the spatial data obtained from ČÚZK were in a vector format as administrative boundaries (polygons) and their centroids (points with information on municipalities' ID codes). Therefore, it was necessary to link centroids with polygons to get the ID codes into the polygons for future connection with other statistical data. From the analytical perspective, the "spatial join" tool served this purpose. In general, the "spatial join" tool transfers attribute information from points to polygons based on their mutual geographical location. However, during the application of "spatial join", errors emerged when a point carrying an ID code did not fall within the right administrative boundary (see Figure 8). Although these problems occurred in a small number of municipalities, they had to be corrected to fulfil the requirement of 100% correct ID assignment for future matching with tabular data.
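The point-to-polygon logic behind such a join, and the way mismatches show up, can be sketched with a plain ray-casting test. This is a simplified illustration under stated assumptions, not the ArcGIS "spatial join" tool itself; the polygon names, coordinates, and ID codes are hypothetical, and points lying exactly on a boundary are not handled specially.

```python
def point_in_ring(point, ring):
    """Ray-casting test: True if the point lies inside the ring of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(polygons, points):
    """Collect the ID codes of all points that fall inside each polygon."""
    joined = {name: [] for name in polygons}
    for code, location in points.items():
        for name, ring in polygons.items():
            if point_in_ring(location, ring):
                joined[name].append(code)
    return joined


# Two hypothetical neighbouring units sharing the edge x = 10.
polygons = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
# The second centroid belongs to unit B but is misplaced inside unit A.
points = {"500011": (5, 5), "500022": (9, 4)}

joined = spatial_join(polygons, points)
# Units that did not receive exactly one code need manual inspection.
problems = {name: codes for name, codes in joined.items() if len(codes) != 1}
print(joined, problems)
```

In this toy case both units are flagged: one polygon receives two codes and the other none, which is the kind of mismatch that had to be corrected by hand.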
In summary, the errors were first checked visually, corrected manually, and then cross-checked (and corrected) again after the data validation process. At this point, it is crucial to note that no automatic GIS tools could be applied for such data adjustments, since automation was not able to handle these issues in the data sufficiently. Automatic tools (or semi-automatic combinations of GIS tools) are indeed beneficial for processing large datasets; however, they often leave some "outliers" unsolved (e.g., Figure 8b,c). In such cases, individual deviations in the data unfortunately had to be treated manually. Since we did not work with attribute data other than municipality IDs (qualitative information), we could not even apply any of the methods commonly deployed for the modifiable areal unit problem (MAUP) [16], e.g., proportional redistribution of values.
Once all of the municipal ID codes were contained in the polygon representation of administrative boundaries, we applied "spatial join" again for data validation to check the correct assignment of ID codes. In this procedure, the administrative boundaries (polygons) were transformed back into their centroids (points). As a result, 25 centroids (mostly lying on top of each other) were obtained for each municipality in Czechia. Consequently, we used "spatial join" again: the target feature was a layer containing the edited polygons of administrative boundaries based on the "common denominator" principle, joined with all 25 centroids carrying ID code attributes. However, this time, we applied a different merge rule for attributes (coincidentally, also called "join") in the advanced settings of the tool, which lists all joined attributes within one record of the target layer. This helped us to:
(1) Check the total number of ID codes: one municipality should contain 25 ID codes. If there were more or fewer ID codes, further data inspection and review was necessary. This validation step concerned quantity.
(2) Check whether there were two or more different ID codes: one municipality could have several ID codes, which indicated a split or merger of several municipalities. This validation step concerned quality.
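The two validation checks can be expressed compactly once the joined ID codes are available per unit. The sketch below assumes the codes have already been extracted into a dictionary; the unit labels, codes, and function name are hypothetical, and only the expected count of 25 yearly centroids comes from the text.

```python
EXPECTED_CODES = 25  # one centroid per year, 1995 to 2019 inclusive


def validate(joined_codes):
    """Run both checks on {unit: [ID code of each yearly centroid]}."""
    wrong_count = {}  # check (1): quantity
    changed = {}      # check (2): quality
    for unit, codes in joined_codes.items():
        if len(codes) != EXPECTED_CODES:
            wrong_count[unit] = len(codes)
        distinct = sorted(set(codes))
        if len(distinct) > 1:
            changed[unit] = distinct
    return wrong_count, changed


joined_codes = {
    "U1": ["500011"] * 25,                    # stable unit
    "U2": ["500022"] * 10 + ["500099"] * 15,  # code changed during the period
    "U3": ["500033"] * 24,                    # one centroid missing
    "U4": ["500044"] * 25 + ["500055"] * 20,  # extra centroids after a split
}

wrong_count, changed = validate(joined_codes)
print(wrong_count)
print(changed)
```

Units appearing in either result would be candidates for the manual inspection described above.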
All issues identified in spatial data processing were recorded and immediately taken into account in the non-spatial part of the universal dataset.

Treatment of the Data from a Non-Spatial Perspective
Since there were changes (mentioned in points 1 and 2 in the Summary section, and partially described in Section 3.1) emerging in different years, it was necessary to search for them individually in each year. Once the change was detected and identified, it has to be decided which ID code will be maintained. For change detection purposes, a combination of an official document on changes of administrative delimitation of Czechia fromČÚZK, official historical registers from CZSO, results from spatial data treatment, and other internet searches were used. Unfortunately, these sources were not mutually coherent, so individual changes had to be verified individually. In case of units division over time, the former ID code remained in one of those divided, and the newly established municipality received the ID code from the "common denominator" spatial unit with stable boundaries (fortunately, all the former ID code units existed throughout the whole period). Explained by the example-municipality A, divided into A1 and A2, while A1 kept its ID (based on its size or Data 2020, 5, 107 10 of 14 importance within the settlement system) and A2 was assigned with an ID from the former bigger unit (usually the same ID as A1); instead of keeping the new one. If two or more administrative units merged over a given period, recoding was done backwards (all municipalities forming a new, bigger unit, were assigned with the identical ID). In other words, the newly established administrative unit (and its spatial representation) is projected back in time. Again, in the same logic, this newly created unit acted as a "common denominator", therefore, kept in the final dataset.
Thus, for every year after a detected split or merger of an administrative unit, the former code replaced the new one, while the reference information about both codes was maintained. The correspondence table containing both codes is the final product of the non-spatial data preparation. This table allows linking any statistical table with the spatial part of the dataset (see more in the User Notes section).
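The recoding logic can be sketched in Python as follows. This is an illustration only, not the procedure used to build the published table: the codes "A", "A2", "B", and "C", the years, and both function names are hypothetical, and the recoding rules are assumed to have been verified manually beforehand.

```python
def resolve(code, recode):
    """Follow recoding rules until a stable 'common denominator' ID is reached."""
    seen = set()
    while code in recode and code not in seen:
        seen.add(code)  # guard against accidental circular rules
        code = recode[code]
    return code

def build_correspondence(codes_by_year, recode):
    """Return rows (year, original_id, universal_id) for every year."""
    rows = []
    for year in sorted(codes_by_year):
        for code in sorted(codes_by_year[year]):
            rows.append((year, code, resolve(code, recode)))
    return rows

# Hypothetical changes: A2 split from A in 2000, and C merged into B in 2010.
# The split unit A2 receives the former code A; the merged unit is projected
# back in time, so C is recoded to B in the years before the merger.
recode = {"A2": "A", "C": "B"}
codes_by_year = {
    1999: {"A", "B", "C"},
    2000: {"A", "A2", "B", "C"},
    2010: {"A", "A2", "B"},
}
rows = build_correspondence(codes_by_year, recode)
universal_ids = {universal for _, _, universal in rows}
```

Each row keeps both the original and the assigned code, which mirrors the reference information maintained in the correspondence table.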
Methodologically, we applied a combination of tools available in Microsoft Excel (e.g., the lookup function to search for differences, a contingency table to cross-check overall counts, and so on) and the programming language R (functions na.omit and setdiff) in order to find differences between spatial and tabular data automatically. This combination of non-spatial tools was chosen mainly for practical reasons. After the initial preparation of the spatial dataset (with the use of a geodatabase), the demographers asked us to deliver a list of municipalities in MS Excel, as they commonly work in such tabular environments. They then cross-checked the data against the statistical tables they possessed and sent back notes for corrections (highlighted in MS Excel). This iterative process was, therefore, easier to handle directly in Excel.
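The comparison of spatial and tabular ID codes can be approximated with set operations. The sketch below is a Python analogue of the na.omit and setdiff combination, not the R code we ran; the function name and the ID codes are hypothetical.

```python
def compare_ids(spatial_ids, tabular_ids):
    """Return IDs found only in the spatial data and only in the table."""
    # Drop missing values first (the role na.omit plays in R).
    spatial = {i for i in spatial_ids if i is not None}
    tabular = {i for i in tabular_ids if i is not None}
    # Set differences in both directions (the role setdiff plays in R).
    return sorted(spatial - tabular), sorted(tabular - spatial)

# Illustrative input with made-up codes.
spatial_ids = ["500011", "500022", None, "500033"]
tabular_ids = ["500011", "500033", "500044"]
only_spatial, only_tabular = compare_ids(spatial_ids, tabular_ids)
```

Codes reported in either list point to records that need manual verification against the official registers.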

Open-Data Portal Design
Sharing the universal layer via an open data portal is based on two steps: metadata and geometry preprocessing, and publishing. Metadata is structured information that is used to characterize, identify, and interpret each dataset. In the field of spatial data, metadata is essential for using the relevant data correctly [17]. Metadata should be part of every dataset and web service. In our case, the metadata comprise the title, description, spatial extent (bounding box), author, date of publishing, license and permission, original source, number of features, and attribute specification (title, format, count statistics). All metadata for the universal layer is available in a standardized format at URL: https://gislib.upol.cz/portal/sharing/rest/content/items/08a30f65288f40ccac2e31fc6ce6b908/info/metadata/metadata.xml.
Once appropriate metadata is in place, the publishing phase follows. Publishing is the formal and technical process of data publication. After publishing, the data are visualized and available for download within the open data concept. A crucial option is whether the data are shared with the public or only with team members. Within the ArcGIS platform, the user is asked which data formats should be offered for download. In our case, all possible options are available: native Esri spatial geodatabase as the original source, feature collection, CSV, KML, Shapefile, GeoJSON, XLSX, and standardized GeoServices API. All data, applications, and websites created through ArcGIS Open Data Portal are stored in the Esri Geospatial Cloud repository, which means that they remain available to administrators for further updates. The Open Data Portal of the Department of Geoinformatics, Palacký University Olomouc, is available at Supplementary Materials. It contains dozens of datasets divided by categories. Each dataset is available via a specific URL. The case study of the universal layer is available via Supplementary Materials.
The interface of a specific dataset contains spatial, tabular, and attribute segments (see Figure 9). The interactive map provides a general overview of phenomena. It is not a fully developed application with a high level of interactivity, and it does not meet all cartographic rules. It offers the basic functionality of web maps: zooming in and out, panning, and selecting attributes by click for data preview. The tabs allow the user to switch between spatial and tabular visualization of the whole dataset. Usually, the web design is tested with real users before publishing, e.g., by eye-tracking methods [18,19]; however, since the Esri platform is devoted to fast, effective, and easy publication of data, it does not allow advanced options to change the user interface.
Moreover, the interface implements advanced filtering. The second part includes all metadata, including descriptions and attributes. Buttons allowing download in the specified formats are fundamental from the users' point of view. Esri users interested in the dataset can simply use it for a custom project within the ArcGIS platform, directly upload this layer with the "Make web map" button, and create highly interactive web map applications.

User Notes
In this section, we present the basic principle of connecting the universal layer with any Czech statistical data. In order to join any Czech statistical data at the municipal level (LAU2) with the spatial part of our data for subsequent spatial analysis or visualization, it is necessary to recode the municipality IDs of the statistical indicator with the use of our correspondence table. There are other ways to combine Czech statistical data at the municipal level (LAU2) with our universal layer; however, we propose the following. First, the chosen Czech statistical data must be recoded according to the correspondence table, i.e., the municipal ID codes that changed over time are found and replaced with the newly assigned value from the aggregation process. Once these records are identified, the values of the selected indicator should be recalculated (usually a simple count is satisfactory). The recoding process is illustrated in Figure 10. As the last step, the recoded statistical data can be joined with the spatial part of the universal layer in a GIS environment (in ArcGIS Pro by using the "join" function). The whole process can also be performed in the reverse order, i.e., by first joining the correspondence table with the spatial data, with duplicated records kept, and then joining the Czech statistical data with the spatial data. The indicator values can then be calculated directly in the GIS environment. Finally, the statistical indicators themselves can be freely obtained from the Czech Statistical Office website: www.czso.cz.
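The recoding and recalculation steps described above can be sketched in Python as follows. The sketch assumes an additive indicator (e.g., population), so that the values of units sharing one universal ID are summed, and it assumes a simplified correspondence table keyed by year and original ID. The function name, the codes, and the values are hypothetical and do not reproduce the column layout of the published table.

```python
from collections import defaultdict

def recode_and_aggregate(stats, correspondence):
    """Recode municipal IDs and sum indicator values per universal ID.

    `stats` maps (year, original_id) to an indicator value;
    `correspondence` maps (year, original_id) to the universal ID.
    IDs absent from the table are assumed unchanged.
    """
    totals = defaultdict(int)
    for (year, original_id), value in stats.items():
        universal_id = correspondence.get((year, original_id), original_id)
        totals[(year, universal_id)] += value
    return dict(totals)

# Illustrative input: unit A2 split from A and is recoded back to A.
correspondence = {(2000, "A2"): "A"}
stats = {(2000, "A"): 100, (2000, "A2"): 40, (2000, "B"): 70}
recoded = recode_and_aggregate(stats, correspondence)
```

The resulting table has one record per universal ID and year and can therefore be joined one-to-one with the spatial part of the universal layer. Indicators that are not additive (e.g., rates or averages) would need to be recomputed from their components rather than summed.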