Internationalization in the Baltic Regional Accounts: A NUTS 3 Region Dataset

Abstract: Features of internationalization, such as trade, foreign direct investments, and international migration, are crucial for understanding the economic developments of small and open economies. However, studying internationalization at the country level may obscure significant heterogeneity in its relationship with economic growth and other economic and social outcomes. Regional accounts provide insights into the geography of internationalization, but collections of such disaggregated statistics are rarely provided by statistical bureaus. The purpose of this paper is twofold. First, we demonstrate how regional account data, including internationalization indicators, can be constructed to obtain consistent and homogeneous regional-level series using a combination of micro and macro data sources. Second, our aim is to foster spatial research on internationalization and the spatial economy in the Baltics by providing a comprehensive data collection of socio-economic variables at the NUTS 3 regional level over time. This collection encompasses trade, FDI, and migration, enabling the study of internationalization and other features of the Baltic economy. We present a series of key features, revealing noticeable correlation patterns between regional development and internationalization. Dataset: https://zenodo.org/records/10223416.


Summary
Internationalization plays an essential role in the economic development of open economies, but the spatial dimensions within each economy are less frequently assessed. Regional accounts may shed light on the domestic geographies of internationalization. Yet, they are often inaccessible and seldom contain internationalization variables. Despite the significant economic growth and regional internationalization over the last decades, few spatial studies on internationalization stand out for the Baltic states. Simultaneously, regional internationalization data, such as trade and foreign investment data, remain scarce [1].
This paper has two objectives. First, we aim to demonstrate how regional account data, including internationalization indicators, can be collected and processed. We pay particular attention to internationalization proxies, including trade, foreign direct investments, and migration, as these variables are typically less available in regional accounts. Second, to facilitate studies on internationalization across the Baltic regions, we provide descriptive statistics and a dataset containing the Baltic regional accounts, suitable for further investigations. We reveal some key correlation patterns concerning regional development and internationalization through descriptive statistics. Our objective here is not to determine the causality of the linkages but rather to identify patterns worthy of further investigation. Moreover, causal links may run in various directions, and many confounders may come into play.
We document the collection and processing of internationalization variables. In our dataset, internationalization is represented by trade (i.e., exports and imports), foreign direct investments (FDI) in equity stocks (both outgoing and incoming), and international migration (i.e., emigration and immigration). Among these, trade and migration constitute flows, while the changes in the FDI stocks correspond to the sum of FDI flows, excluding dividends and revaluation after dividends.
Moreover, the collected and processed dataset includes a wide range of variables. The data are sourced from various providers, with domestic firm data, Eurostat, Global Data Hub, and OECD being the most commonly used sources. To ensure comparability across countries, the variables are harmonized across international sources. Additionally, we utilize alternative sets of deflators to obtain fixed-price figures. For recent years, migration data were available, but adjustments to figures related to population changes have been necessary to achieve a more accurate representation. The collection and processing of this dataset are documented in this study, illustrating how such datasets may be collected.
Our contributions to domestic variables are mostly limited to data collection, data harmonization, handling of measurement errors, and filling of data gaps. Yet, some contributions to the processing of domestic data are worth mentioning explicitly here. Most notably, we estimate the fixed capital stock for all three countries, exploiting a combination of firm-level data and macro data. In a regional account setting, the fixed capital stock variable resembles the FDI and trade variables in that it is seldom reported but can be approximated with the help of firm data and macro data. Furthermore, our collection of deflators also allows for the calculation of volume figures.
Finally, we note that the harmonization of some data series required more extensive processing than others. For the Estonian population data, we adjust the figures for the years prior to 2015 for a data break between 2014 and 2015, such that the population figures for this period are comparable with those for the subsequent period. To address measurement errors in survey-based broadband data, we assume that broadband coverage cannot decrease and replace extreme outliers with their interpolated values.
The descriptive statistics presented in this paper depict close linkages between internationalization and economic development in many instances, but the correlation patterns are not always strong. It is interesting to see that Estonia's position as the most advanced Baltic economy is largely due to the dominant role of the capital region of Põhja-Eesti. Along the center-periphery dimension, the Baltic NUTS 3 regions (i.e., regions at the third level of Eurostat's nomenclature of territorial units for statistics) resemble each other across countries in many respects. Regarding overall economic activities, internationalization is more pronounced in most urban regions. Exports and international immigration involve the most considerable exceptions from this pattern, with Telšiai and Klaipėda being the Baltic regions with the highest exports per employee and Šiauliai being the Baltic region with the highest international immigration per capita. Overall, FDI and trade tend to be highly correlated with the indicators of economic performance, human development, and knowledge generation, whereas the correlations between migration and the same indicators are, for the most part, weaker.
To help facilitate further research on internationalization and other features of the Baltic spatial economy, a dataset with the regional Baltic accounts at the NUTS 3 level, inter alia including internationalization variables, accompanies this paper as Supplementary Materials. By exploiting a combination of firm-level data and macro data on FDI, trade, and fixed capital, we are able to approximate these variables across the Baltic regions. Our methodological demonstration exemplifies how different data possibilities call for different data collection and processing methodologies. While some data at the regional level can be collected from macro sources or estimated from micro sources, others must be approximated by a combination of variables from various data sources. To ensure comparability across countries, we have harmonized the collected figures with corresponding figures in macro statistics.
Admittedly, approximated variables and missing or unreliable observations will never be of the same quality as the true data. While some approximations were made, the data are still expected to be of relatively high quality. As long as the measurement errors in approximated elements are not systematically biased in the same direction, they will only add noise to the analysis. More data collection and processing resources can always contribute to higher data quality. Still, our data handling illustrates how decent data quality can be achieved with reasonable effort. Overall, we believe that our dataset is suited both for achieving a good overview of key socio-economic conditions in the Baltic region and for further econometric investigations into key relationships.
For future research, we hope for more studies on internationalization and the spatial economy, particularly in the Baltics and with the help of our dataset. We also welcome more contributions on how regional accounts data can easily be collected and processed.

Data Description
In this section, we document the content of our Baltic regional accounts. All variables cover at least 2007 to 2015 and mostly up to 2019, but some variables cover a longer period from 1991 to 2020.
We start by presenting the Baltic NUTS 3 regions and the applied industry classification. Then, we move on to documentation of the data content, listing each relevant variable.

Classifications for Regions and Industries
The Baltic countries contain 21 NUTS 3 regions, distributed over 10 in Lithuania, 6 in Latvia, and 5 in Estonia. In Figure 1, we provide a map of the Baltic NUTS regions, which serves as the foundation for our mapping.
Data 2023, 8, 181
In our study, we process data using the industry classification that Eurostat and the OECD apply to their regional statistics. This classification involves 11 industries and is rendered in Table 1. In addition to reporting employment and gross value added in basic prices with affiliated deflators for each industry, we exploit the industry classification in the estimation of some variables (i.e., outgoing FDI, exports, imports, and fixed capital).

Content of the Dataset
The content of our Baltic regional account dataset is listed in Table 2, including indications of measurement. Note that the internationalization figures and fixed capital are the most processed and estimated among the variables. The other variables can largely be collected from other statistical sources, although some data cleaning will be required. Further details about each variable of the dataset are provided in the metadata of the dataset.

Methods
In this section, we document our collection and processing of data for the Baltic regions. As data access varies, different collection and processing methodologies for each country are applied for some variables. We start by devoting our attention to the internationalization variables, as the data processing required for collecting these variables is relatively extensive. Then, we move on to domestic variables. In Appendix A, we provide technical notes on the estimation of variables missing at the regional level and on data cleaning.

Collection and Processing of Internationalization Variables
In this subsection, we document the collection and processing of internationalization variables. The internationalization variables in the Baltic regional accounts presented in this paper include trade (i.e., exports and imports), FDI in equity stocks (both outgoing and incoming), and international migration (i.e., emigration and immigration). Among these, trade and migration are reported as flow variables, while FDI is reported as stocks. The FDI stocks correspond to the sum of FDI flows, excluding dividends and revaluation after dividends. Where trade and FDI figures from international data sources are reported in USD, they are converted to EUR in our calculations using the annual exchange rates obtained from the central bank of Norway (Norges Bank).
The collection and processing of international migration data are summarized at the top of Table 3. For all countries, registered figures for emigration and immigration can be collected from the national statistical bureaus, except for Latvia from 2007 to 2010. In the latter case, the statistics are approximated based on the regional distribution of net migration in absolute values and national figures for international migration.

International emigration and immigration
Measured in number of people from 2006 to 2019

Estonia
International emigration and immigration figures at the regional level are obtained from Statistics Estonia. Unexplained population growth is ascribed to international immigration and emigration, when it is positive and negative, respectively. We make an exception for a data break between 2014 and 2015, where many people are moved in the registers from one region to another. We approximate this error by the national average of the positive and negative residuals in absolute value multiplied by the regional share of the residual with the same sign.

Latvia
International emigration and immigration figures at the regional level are obtained from the Official Statistics of Latvia for 2010 to 2018. For 2007 and 2010 only, we assume that the regional shares of both emigration and immigration correspond to the absolute regional value of population change divided by the total value of population change. Unexplained population growth is ascribed to international immigration when it is positive and international emigration when it is negative.

Lithuania
International emigration and immigration figures at the regional level are obtained from Statistics Lithuania. We ascribe unexplained population growth to international immigration or emigration, depending on whether the sign is positive or negative, respectively.

Exports and imports
Current and fixed 2010 prices, deflated with national price indexes for exports and imports, respectively. Measured in million euros from 2007 to 2019.

Estonia
We approximate the regional distribution of exports and imports with trade figures from Statistics Estonia's firm-market-product level commodity trade data, which are based on customs statistics, and firm-level services trade data from the Bank of Estonia. Regional trade is scaled annually to ensure that the aggregate trade figures match the national figures reported by the WTO for each year.

Latvia
We approximate the regional distribution of commodity trade with trade figures in firm data from the Central Statistical Bureau of Latvia's 'Complete report on activities'. Service trade figures are not available at the micro level. For service trade, we therefore assume that trade per gross value added is the same across regions within each relevant industry and year, where the relevant industries are construction and the service industries, excluding wholesale and retail trade. We obtain each industry's service trade at the national level from OECD's annual input-output matrices for transactions between Latvian industries and sectors, enabling us to estimate regional figures. Commodity trade and service trade are scaled separately on an annual basis to match the aggregate trade figures of the WTO, before being aggregated.

Lithuania
Regional direct export and national re-export are collected from Statistics Lithuania. To obtain total export figures for each region, we assume that the re-export follows the same regional distribution as the direct export. However, Statistics Lithuania only reports national import figures. To overcome this challenge, we assume that imports per gross value added are the same across regions within each industry and year. We obtain each industry's import at the national level from OECD's annual input-output matrices for transactions between Lithuanian industries and sectors, enabling us to estimate regional imports. Regional trade is scaled annually to match the aggregate trade figures reported by the WTO.

Incoming foreign direct investments
Equity stocks, including reinvestments of earnings, in current prices and fixed 2010 prices, deflated with price indexes for either the domestic product or total end use at the national or European Union level. Measured in million euros from 2007 to 2019.

Estonia
We approximate the regional distribution of incoming FDI with FDI figures in the enterprise register data.

Latvia
We approximate the regional distribution of outgoing FDI with the help of the Central Statistical Bureau of Latvia's 'Complete report on activities' and firm financial data provided by the State Revenue Service of the Republic of Latvia, applying two equally weighted proxies for the magnitude of each investment. These are total assets in subsidiaries and the product between total assets in subsidiaries and a dummy indicating outgoing FDI.

Population changes can be divided into natural population growth, domestic migration, and international migration. Except for a data break in Estonia between 2014 and 2015, population growth left unexplained by this decomposition is ascribed to international migration. This choice is made because positive residuals tend to be correlated with European migration flows (e.g., the Syrian refugee crisis and Brexit), and because international refugee flows and labor migration back and forth occasionally go unregistered.
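The residual rule for migration can be illustrated with a minimal Python sketch. It assumes that the unexplained population growth of a region is the total population change minus natural growth, net domestic migration, and registered net international migration; the function name, argument names, and figures are hypothetical and not taken from the dataset.

```python
def allocate_residual(pop_start, pop_end, births, deaths,
                      dom_in, dom_out, intl_in, intl_out):
    """Ascribe unexplained population growth to international migration.

    Returns adjusted (immigration, emigration) for one region and year.
    Illustrative only: names and inputs are assumptions, not dataset fields.
    """
    # Population change explained by the registered components.
    explained = (births - deaths) + (dom_in - dom_out) + (intl_in - intl_out)
    # Whatever remains of the observed change is the unexplained residual.
    residual = (pop_end - pop_start) - explained
    if residual > 0:
        # Positive residual: treated as unregistered immigration.
        intl_in += residual
    else:
        # Negative residual: treated as unregistered emigration.
        intl_out += -residual
    return intl_in, intl_out


# Hypothetical region: the population grows by 10, of which 4 is explained.
print(allocate_residual(1000, 1010, 12, 10, 5, 4, 3, 2))
```

The Estonian data break between 2014 and 2015 is handled separately in the paper and is not covered by this sketch.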
Further down in the table, we provide an overview of the compilation and processing of trade figures. For Estonia's total trade and Latvian commodity trade, we exploit firm-level data to estimate the distribution of exports and imports across regions, including both commodity and service trade. For Lithuania overall and Latvian service trade, regional statistics for direct export are used to proxy the regional distribution of total export. At the same time, industry-specific gross value added is exploited for industry-specific shares for total imports. To ensure comparability across countries, all trade figures are harmonized against the national statistics reported by the World Trade Organization (WTO).
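The harmonization against national totals amounts to a proportional rescaling of the regional estimates within each year. A minimal sketch, assuming regional estimates are held in a dictionary for one year (the function name, region labels, and figures are hypothetical):

```python
def scale_to_national(regional, national_total):
    """Rescale regional estimates so that they sum to the national figure.

    regional: {region: estimated value for one year}
    national_total: the national figure for the same year (e.g., from the WTO)
    """
    total = sum(regional.values())
    if total == 0:
        raise ValueError("regional estimates sum to zero; shares are undefined")
    factor = national_total / total
    # Every region keeps its share of the total; only the level changes.
    return {region: value * factor for region, value in regional.items()}


# Hypothetical example: firm data cover 40 units, the national figure is 80.
print(scale_to_national({"A": 30.0, "B": 10.0}, 80.0))
```

The rescaling preserves the regional shares implied by the micro data, so any under-coverage is assumed to be proportional across regions.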
We estimate incoming foreign direct investment (FDI) equity stocks for the Baltic regions by a wide range of approaches, as listed further in the notes below the table. For each country, the approach depends on the data availability. All FDI figures are harmonized with national statistics reported by the International Monetary Fund (IMF), extended with data from the Organization for Economic Co-operation and Development (OECD), to ensure comparability across countries.
We use firm-level data to proxy the regional distribution of FDI for Estonia and Latvia. For incoming FDI, the Estonian firm data include actual incoming FDI, making the regional distribution straightforward to estimate. For Latvia, we proxy the regional distribution by the product between a dummy for incoming FDI and the equity value. While the latter approach naturally results in less precise FDI figures, investigations of the Estonian data (which include complete ownership statistics) suggest that it is probably not a major issue. Moreover, most Estonian companies subject to FDI have relatively high foreign ownership shares, which are seldom below 50 percent.
Country-specific data collection approaches are also used for outgoing FDI (cf. the bottom of Table 3). For outgoing FDI from Estonia, we apply the product between a dummy for outgoing FDI and the value of assets in subsidiaries as regional weights. We have the same possibility for Latvia, but here the data for the outgoing FDI dummy are considerably incomplete in terms of coverage. Accordingly, we use the value of assets in subsidiaries for all firms as a second proxy for the Latvian regional distribution of outgoing FDI. When approximating the regional distribution of the outgoing Latvian FDI, we put equal weight on the two proxies.
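The equal weighting of the two Latvian proxies can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure: it assumes that each proxy is first converted into regional shares, that the two sets of shares are averaged, and that the result is applied to the national stock. The firm records and the national figure are invented.

```python
def shares(values):
    """Convert regional totals into shares that sum to one."""
    total = sum(values.values())
    if total == 0:
        raise ValueError("proxy sums to zero; shares are undefined")
    return {region: value / total for region, value in values.items()}


def outgoing_fdi_by_region(firms, national_stock):
    """Distribute a national outgoing FDI stock with two equally weighted proxies.

    firms: iterable of (region, assets_in_subsidiaries, has_outgoing_fdi)
    """
    proxy_all, proxy_flagged = {}, {}
    for region, assets, has_fdi in firms:
        # Proxy 1: assets in subsidiaries for all firms.
        proxy_all[region] = proxy_all.get(region, 0.0) + assets
        # Proxy 2: assets in subsidiaries times the outgoing FDI dummy.
        flagged = assets if has_fdi else 0.0
        proxy_flagged[region] = proxy_flagged.get(region, 0.0) + flagged
    s_all, s_flagged = shares(proxy_all), shares(proxy_flagged)
    # Equal weight on the two proxies, applied to the national stock.
    return {r: national_stock * 0.5 * (s_all[r] + s_flagged[r]) for r in proxy_all}


# Hypothetical firm records for two Latvian regions.
firms = [("Riga", 60.0, True), ("Riga", 20.0, False), ("Kurzeme", 20.0, True)]
print(outgoing_fdi_by_region(firms, 100.0))
```

Averaging shares rather than raw values keeps the two proxies on the same scale, which is one plausible reading of "equal weight".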
In the case of Lithuania, regional figures for incoming FDI are provided by Statistics Lithuania. For outgoing Lithuanian FDI, we combine industry-specific investment intensities from the OECD and the regional distribution of gross value added per industry to approximate the regional distribution.
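Several of the approximations above rest on the same assumption, namely that a variable per unit of gross value added is the same across regions within each industry. A minimal sketch of this allocation, with hypothetical industry labels and figures, distributes each national industry total by the region's share of that industry's gross value added:

```python
def allocate_by_industry_gva(national_by_industry, regional_gva):
    """Distribute national industry totals across regions by GVA shares.

    national_by_industry: {industry: national total of the variable}
    regional_gva: {region: {industry: gross value added}}
    Assumes the variable per unit of GVA is uniform within each industry.
    """
    # National GVA per industry, summed over the regions.
    national_gva = {}
    for gva in regional_gva.values():
        for industry, value in gva.items():
            national_gva[industry] = national_gva.get(industry, 0.0) + value

    allocated = {}
    for region, gva in regional_gva.items():
        allocated[region] = sum(
            national_by_industry.get(industry, 0.0) * value / national_gva[industry]
            for industry, value in gva.items()
            if national_gva[industry] > 0
        )
    return allocated


# Hypothetical figures for two regions and two industries.
national = {"manufacturing": 100.0, "services": 50.0}
gva = {
    "Vilnius": {"manufacturing": 20.0, "services": 40.0},
    "Kaunas": {"manufacturing": 30.0, "services": 10.0},
}
print(allocate_by_industry_gva(national, gva))
```

Multiplying a national intensity (the variable divided by national gross value added) by regional gross value added gives the same result; the share form is used here only because it makes the adding-up property explicit.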

Collection and Processing of Domestic Variables
In this subsection, we turn to the collection and processing of domestic variables. We will provide an overview of the primary processes involved in collecting and processing data, while measures to deal with particular issues with incomplete data are accounted for in Appendix B.
We start by presenting fixed capital figures. These figures are not directly accessible at the regional level from statistical sources and thus have to be estimated, as explained in detail in Table 4. For Estonia and Latvia, the regional distribution of fixed capital was approximated with the help of firm-level data from 2007 to 2019.

Fixed Capital Stock
Gross stocks at the end of each year, reported in current prices and fixed 2010 prices. Measured in million euros from 2000 to 2019.

Estonia
The gross fixed capital stocks in current prices are estimated from 2007 to 2019 as the sum of tangible and intangible fixed capital stock based on enterprise register data. We extend the time series further backwards to 2000 based on data from Eurostat, assuming that the development in fixed capital stock corresponds to the development in the quasi-stocks of fixed capital. These quasi-stocks are estimated assuming that the fixed capital per gross value added in current prices is the same by industry across the country for each of the 11 industries in our study. Figures are harmonized with the aggregate fixed capital stock figures at Eurostat. We calculate the gross fixed capital stock in fixed prices as the ratio between the fixed capital in current prices and the aggregate fixed capital deflator.

Latvia
The gross fixed capital stocks in current prices are estimated from 2007 to 2019 as the sum of tangible and intangible fixed capital stock based on data from the State Revenue Service of the Republic of Latvia. We extend the time series further backwards to 2000 based on data from Eurostat, assuming that the development in fixed capital stock corresponds to the development in quasi-stocks of fixed capital. These quasi-stocks are estimated assuming that the fixed capital per gross value added in current prices is the same by industry across the country for each of the 11 industries in our study. Figures are harmonized with the aggregate fixed capital stock figures at Eurostat. We calculate the gross fixed capital stock in fixed prices as the ratio between the fixed capital in current prices and the aggregate fixed capital deflator.

Lithuania
We estimate the regional distribution of fixed capital stock in Lithuania by applying and then taking the average of two alternative approaches from 2007 to 2019. First, we collect figures for gross investments in fixed capital at the regional level from Statistics Lithuania with data for three years before the initial year of our time series. We convert the gross investment figures to fixed price figures with the help of the fixed capital deflator. Next, we calculate quasi-stocks for fixed capital by summing the depreciated gross investments and applying linear depreciation with a rate of 25 percent. We let the fixed capital stocks reported at Eurostat correspond to the national aggregates and exploit the quasi-stocks to proxy the regional distribution of fixed capital. Second, we collect the fixed capital stocks in current prices for the 11 industries in our study from Eurostat and distribute them across regions, assuming that the fixed capital per gross value added in current prices is the same by industry across the country. In the period from 2000 to 2006, we only apply the latter approach, as regional investment series are unavailable. Figures are harmonized with the aggregate fixed capital stock figures at Eurostat. Volume figures are obtained by dividing the fixed capital stock in current prices by the fixed capital deflator.
In the same period, we have applied two equally weighted proxies for Lithuania to approximate the regional distribution. First, we have exploited Statistics Lithuania's regional series for gross investments. Second, we have combined the industry-specific regional distribution of gross value added with industry-specific ratios between fixed capital stock and gross value added at the national level. For all three countries, we have extrapolated the development in fixed capital back to 2000, again combining the industry-specific regional distribution of gross value added with industry-specific ratios between fixed capital stock and gross value added at the national level. Furthermore, the aggregate fixed capital deflator was used for deflation purposes, and the figures were harmonized with the aggregate fixed capital stock figures at Eurostat. For this purpose, we collect the fixed capital stocks in current prices for the 11 industries in our study from Eurostat and distribute them across the regions, assuming that the fixed capital per gross value added in current prices is the same by industry across the country.
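The quasi-stock calculation for Lithuania can be sketched in Python. The sketch assumes that linear depreciation at 25 percent means an investment retains 100, 75, 50, and 25 percent of its value at ages zero to three years and is written off thereafter, which is consistent with needing three years of investment data before the first year of the series. Whether the current year's investment enters at full value is an assumption, and all names and figures are hypothetical.

```python
import math


def quasi_stock(investments, year, rate=0.25):
    """Sum linearly depreciated gross investments up to and including `year`.

    investments: {year: gross investment in fixed prices}
    rate: annual linear depreciation rate (0.25 gives a four-year lifetime)
    """
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must lie in (0, 1]")
    lifetime = math.ceil(1.0 / rate)
    stock = 0.0
    for age in range(lifetime):
        # Share of the original investment value still in the stock.
        remaining = max(0.0, 1.0 - rate * age)
        stock += remaining * investments.get(year - age, 0.0)
    return stock


def regional_stocks(quasi_stocks, national_stock):
    """Use quasi-stocks as weights to distribute the national capital stock."""
    total = sum(quasi_stocks.values())
    return {r: national_stock * q / total for r, q in quasi_stocks.items()}


# Hypothetical region investing 100 per year from 2004 to 2007.
investments = {2004: 100.0, 2005: 100.0, 2006: 100.0, 2007: 100.0}
print(quasi_stock(investments, 2007))
print(regional_stocks({"Vilnius": 250.0, "Kaunas": 750.0}, 2000.0))
```

The quasi-stocks only determine the regional distribution; the level comes from the national stock, so the short lifetime does not affect the aggregate.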
In Table 5, we account for the data collection and processing of domestic variables, starting with employment figures. As there are some small inconsistencies and missing data in the labor market figures in OECD's Regional Database, we have chosen to apply the figures in Eurostat's regional statistics as a basis for our employment figures. Eurostat's employment figures are disaggregated into employees and self-employed people and into the 11 industries listed in Table 1. We exploit the age and gender distributions suggested by OECD's Regional Database to obtain employment figures across age groups and genders.
Table 5. Data collection and processing of domestic variables on labor participation, price indexes, value added, information production and sharing, demographics, the population's well-being, and physical settlement features.

Labor participation
Employment mid-year, reported for all ages with a further split into employment relationship (i.e., employees and self-employed people) and 11 industries, 15 to 64 years old divided into genders, 15 to 24 years old, 25 to 64 years old, and more than 65 years old
Overall figures, as well as figures for industry affiliation and employment relationship, are obtained from Eurostat's regional statistics. In addition, the age distribution is obtained from OECD's regional statistics.

Unemployment mid-year, reported for either gender or the age groups 15 to 24 years old and 25 to 64 years old, with further disaggregation into short-term and long-term unemployed people (i.e., unemployed for less and more than a year, respectively)
Obtained from OECD's regional statistics

Rate of young people not in employment, education or training, age 18 to 24 years old

Rate of early leavers from education and training, age 18 to 24 years old

Price indexes
Gross value added in basic prices deflators with 2010 as the base year, measured for 11 industries and at the national level
Industry-specific gross value added deflators are collected from the national accounts at Eurostat. Regional gross value added deflators are estimated as the ratios between current and fixed price figures at the regional level. In addition, deflated figures based on national deflators are included.

Gross value added in market prices deflator at the national level with 2010 as the base year
Derived at the aggregate level from the national accounts at Eurostat

Fixed capital deflator with 2010 as the base year, reflecting the price development in gross investments

End use deflators with 2010 as the base year, including deflators for total end use (sum of consumption, exports, and gross investments in inventories and fixed capital), and total and household consumption

Trade deflators for exports and imports with 2010 as the base year, measured at national and EU-27 levels

Total income deflator with 2010 as the base year
Derived at the aggregate level from the national accounts at OECD

Purchasing power parity for production and consumption with EU-27 of 2020 as a basis
Derived at the aggregate level from the national accounts at Eurostat

Exchange rate between Euro and US Dollar
Obtained from Norges Bank

Value added
Gross value added measured at basic prices, reported for 11 industries in current prices and fixed 2010 prices
Obtained from Eurostat's regional statistics. The fixed price figures are derived with the help of industry-specific gross value added deflators

Gross value added measured at market prices, reported in current prices
Obtained from Eurostat's regional statistics

Net product taxes, reported in current prices
Calculated as the difference between gross value added in market prices and basic prices. Measured in million euros from 2000 to 2019

Obtained from OECD's Regional Database. The broadband coverage data build on surveys, such that coverage in some instances appears, erroneously, to decline in the raw survey data. In case of an extreme drop in coverage for a region below the coverage in two or more preceding years, we interpolate the concerned extreme observations linearly. In case of a modest drop in coverage for a region that concerns a sequence of years, we assume that the broadband coverage corresponds to the average of the concerned observations

Demographics
Population at the beginning of the year, divided into gender and five-year age groups from 0 to 4 years to 80 years and above, as well as the population between 25 and 64 years with primary and lower secondary education, higher secondary and vocational education, and tertiary education
Obtained from OECD's Regional Database. In the case of Estonia, a data break occurred between 2014 and 2015, where many people were moved in the registers from one region to another. Hence, we removed this error term from the total population figures for the Estonian regions in all earlier years and adjusted the underlying population figures proportionally. Note that this error is approximated in relation to the calculation of international migration figures.

Natural population growth, divided into births and deaths
Obtained from Eurostat's regional statistics

Domestic migration, disaggregated into domestic immigration and domestic emigration
Obtained from OECD's Regional Database

Disposable household income measured in current prices, fixed 2010 prices, and purchasing power parities with European Union-27-2020 as a basis
Obtained from OECD's Regional Database. Fixed price and purchasing power-adjusted figures are derived with the help of household consumption deflators and the EU-27 consumer purchasing power parity, respectively

Variable Data Collection and Processing
Gross national income in current prices and purchasing power parities with European Union-27-2020 as a basis: National figures are obtained from Eurostat, while regional distributions are obtained from Global Data Hub's Human Development statistics. Fixed price and purchasing power-adjusted figures are derived with the help of income deflators and the EU-27 consumer purchasing power parity, respectively.
Active physicians, including practicing physicians and other physicians for whom the execution of their job requires a medical education: Obtained from OECD's Regional Database.
Hospital beds, including acute care beds, rehabilitative care beds, long-term care beds, and other beds in hospitals

Physical settlement features
Area measured in square kilometers, divided into land area and freshwater area: Obtained from Eurostat's regional statistics. The freshwater area is calculated as the difference between the total area and the land area.
Main rooms in dwellings, including bedrooms and living rooms: Obtained from OECD's Regional Database.
Degree days in terms of cooling degree days and heating degree days: Obtained from Eurostat's regional statistics.
Further down in the table, we present price data. Deflators and purchasing power parities are mainly collected from Eurostat, but Norges Bank and OECD data have also been used in the process. Gross value added deflators are derived at an industry-specific level, enabling us to estimate regional gross value added deflators. Otherwise, the deflators are collected at the national level. In addition, the dataset contains deflators and purchasing power parities measured with respect to the EU-27-2020 economy and exchange rates between EUR and USD.
The collection and processing of figures for gross value added are depicted in the table below. We collect current price figures for gross value added at basic prices and at market prices from Eurostat's regional statistics. The differences between these figures correspond to the net product taxes. Gross value added at basic prices is reported separately for 11 industries.
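As a minimal sketch of the two calculations described above, the Python snippet below derives net product taxes as the difference between gross value added at market prices and at basic prices, and builds a regional deflator from industry-specific deflators. The implicit aggregation rule (total current-price value divided by total fixed-price value), the function names, and all numbers are illustrative assumptions, not the procedure actually used for the dataset.

```python
def net_product_taxes(gva_market, gva_basic):
    """Net product taxes: GVA at market prices minus GVA at basic prices."""
    return gva_market - gva_basic


def regional_deflator(gva_current_by_industry, deflator_by_industry):
    """Implicit regional GVA deflator (base year = 1.0).

    Each industry's current-price GVA is deflated with its industry
    deflator; the regional deflator is total current-price GVA divided
    by total fixed-price GVA (an assumed aggregation rule).
    """
    total_current = sum(gva_current_by_industry.values())
    total_fixed = sum(
        value / deflator_by_industry[industry]
        for industry, value in gva_current_by_industry.items()
    )
    return total_current / total_fixed


# Hypothetical region with two industries (million euros, current prices).
gva = {"manufacturing": 120.0, "services": 300.0}
deflators = {"manufacturing": 1.20, "services": 1.50}

taxes = net_product_taxes(gva_market=470.0, gva_basic=420.0)
deflator = regional_deflator(gva, deflators)
```

Because regions differ in industry composition, two regions facing the same industry deflators can end up with different regional deflators under this rule.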
We also collect unemployment figures and rates associated with youth participation in education and labor markets from OECD's Regional Database. For the unemployment figures, we limit ourselves to the age group from 15 to 64 years, as the statistics seem unstable and unreliable outside this age group and depend relatively strongly on the local registration of the place of residence. The unemployment figures can be further disaggregated by gender or by a combination of age group and duration of unemployment.
Beyond schooling, other forms of information sharing and knowledge generation may play a significant role in economic development [2,3]. Accordingly, we include latent broadband coverage for households and three sets of variables concerning intellectual property rights in our dataset. The data are gathered from Eurostat's regional statistics and OECD's Regional Database. In the case of intellectual property rights, figures reported nationally without a regional attribution are spread across regions in line with the annual regional shares for the respective variable.
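The regional allocation of nationally reported intellectual property figures can be illustrated with a short Python sketch. It assumes that the regional shares are taken from the regionally attributed figures of the same variable in the same year; the function name, the region labels, and the counts are hypothetical.

```python
def allocate_unassigned(regional_values, unassigned_total):
    """Spread a nationally reported, unassigned amount across regions.

    Each region receives a part proportional to its share of the
    regionally attributed values in the same year (assumed rule).
    """
    reported_total = sum(regional_values.values())
    if reported_total == 0:
        raise ValueError("no regional values to derive shares from")
    return {
        region: value + unassigned_total * value / reported_total
        for region, value in regional_values.items()
    }


# Hypothetical counts for one year; 10 applications lack a region.
patents = {"Region A": 60.0, "Region B": 30.0, "Region C": 10.0}
adjusted = allocate_unassigned(patents, unassigned_total=10.0)
```

The allocation leaves the regional shares unchanged while making the regional figures sum to the national total.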
Demographic variables are listed next, including population levels and domestic population changes through births, deaths, and domestic migration, as well as life expectancy and particular causes of death. Population levels can be disaggregated by age group, gender, and educational profile. The data on life expectancy are collected from Global Data Hub's Human Development Database, while other data are gathered from Eurostat's regional statistics and OECD's Regional Database.
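The correction for the Estonian register break noted in the table (removing the error term from the regional total and adjusting the underlying figures proportionally) could be sketched as follows. The size of the error term, the subgroup labels, and the function name are hypothetical, and the actual approximation of the error term is not reproduced here.

```python
def correct_register_break(groups, error_term):
    """Remove a register error from one region-year and rescale subgroups.

    groups: population by subgroup (e.g., age and gender cells).
    error_term: people wrongly registered in the region before the break
        (negative if people were missing from the region).
    """
    total = sum(groups.values())
    corrected_total = total - error_term
    scale = corrected_total / total
    # Every subgroup is scaled by the same factor, so shares are preserved.
    return {group: count * scale for group, count in groups.items()}


# Hypothetical region-year before the break, with 5000 misplaced residents.
population = {"0-24": 30000.0, "25-64": 50000.0, "65+": 20000.0}
corrected = correct_register_break(population, error_term=5000.0)
```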
The dataset also includes variables related to the well-being of the population, as shown in Table 5. This includes the Human Development Indexes, as well as topically related variables concerning years of schooling, income, and health. In addition to OECD's Regional Database and Eurostat's regional statistics, Global Data Hub's Human Development Database constitutes the main source in this regard.
Finally, the dataset covers physical settlement features, as listed at the bottom of the table. This includes area, main rooms in dwellings, and degree days. The data are collected from Eurostat's regional statistics and OECD's Regional Database. Note that degree days include both cooling degree days and heating degree days as measures of energy needs in buildings related to temperature regulation. Similarly, the area may be disaggregated into land area and freshwater area.
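The treatment of extreme drops in the survey-based broadband coverage series, as described in the table above, can be illustrated with the following Python sketch. The 10 percentage point threshold that defines an "extreme" drop and the function name are assumptions made for illustration; the averaging rule for modest drops is not covered.

```python
def interpolate_extreme_drops(coverage, threshold=10.0):
    """Replace extreme drops with linear interpolation between neighbours.

    An observation is flagged when it lies more than `threshold`
    percentage points below both of the two most recent unflagged
    observations (an assumed operationalization of an extreme drop).
    """
    values = [float(v) for v in coverage]
    valid = list(range(min(2, len(values))))  # first two years kept as is
    flagged = []
    for i in range(2, len(values)):
        reference = min(values[valid[-1]], values[valid[-2]])
        if values[i] < reference - threshold:
            flagged.append(i)
        else:
            valid.append(i)
    for i in flagged:
        left = max(j for j in valid if j < i)
        later = [j for j in valid if j > i]
        if not later:
            continue  # no later unflagged observation: leave unchanged
        right = min(later)
        weight = (i - left) / (right - left)
        values[i] = values[left] + weight * (values[right] - values[left])
    return values


# Hypothetical coverage (percent of households) with one implausible drop.
smoothed = interpolate_extreme_drops([50, 55, 60, 30, 70])
```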

User Notes
In this section, we start by elaborating on the relevance of the dataset provided in this article. In this regard, we provide a brief literature review. Knowledge of internationalization, including trade, foreign direct investments, and international migration, is essential for a full understanding of the economic developments of small and open economies. Regional studies are important for our understanding of internationalization, as studying it at the country level may mask large heterogeneity in its relationship with economic growth and other economic and social outcomes.
Next, we use our regional accounts for the Baltic states to show some key descriptive statistics for the Baltic NUTS 3 regions, focusing on internationalization. We hope the section helps other researchers better understand some key internationalization patterns in the spatial Baltic economy and obtain an overview of the dataset that accompanies the paper. We start by providing an overview through summary statistics and a piecewise correlation matrix. Then, we turn to more details on internationalization, including trade, FDI, and international migration. Towards the end of the section, we address settlement, production, and intellectual property rights and how these covary with internationalization.
In the graphic representations, we focus on 2019 and the development since 2008, but we exceptionally investigate shorter time spans when the available data series are shorter. For stock variables reported at the beginning or the end of each year (e.g., FDI, fixed capital stock, and population), we calculate their annual averages in our presentation of the descriptive statistics. Further, note that all monetary variables in our presentation in this section are reported in current prices when only dealing with levels and in fixed 2010 prices otherwise. The sole exception occurs when we explicitly mention purchasing power parity, using EU-27 as a basis for measurement. For FDI, we use domestic GDP deflators for deflation, while national deflators are used to deflate exports and imports.
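The two conventions above can be sketched in a few lines of Python. The sketch assumes that the annual average of a stock variable is the simple mean of its beginning-of-year and end-of-year values and that deflators are indexed to 1.0 in 2010; both the function names and the numbers are hypothetical.

```python
def annual_average_stock(start_of_year, end_of_year):
    """Annual average of a stock variable (assumed: mean of both ends)."""
    return 0.5 * (start_of_year + end_of_year)


def to_fixed_prices(current_value, deflator):
    """Convert a current-price value to fixed prices (deflator: 2010 = 1.0)."""
    return current_value / deflator


# Hypothetical inward FDI stock for one region-year (million euros).
fdi_average = annual_average_stock(start_of_year=800.0, end_of_year=1000.0)
fdi_fixed_2010 = to_fixed_prices(fdi_average, deflator=1.2)
```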

Relevance of the Dataset
The economic and geographic strands of the literature have highlighted the critical role of internationalization factors in economic development, such as foreign direct investments [2][3][4][5][6], trade [7,8], and international migration [9,10]. These internationalization measures are typically available at the country level, with limited availability at the regional level [1]. Significant within-country regional disparities in economic development are, however, well documented, especially within the European Union [11,12], including the Baltics [13]. Regional breakdowns of major national aggregates in regional accounts, such as gross value added and employment, rarely incorporate internationalization variables.
Studying internationalization at the national level may obscure the significant variation in its relationship with economic growth and other economic and social outcomes, calling for more disaggregated data. This paper introduces an original dataset providing detailed regional accounts for the Baltic countries (Estonia, Latvia, and Lithuania) from 1990 to 2020 at the NUTS 3 regional level, including internationalization indicators for the 2007 to 2019 period. This exercise has two objectives. First, we aim to stimulate research on internationalization and the spatial economy in the Baltics by providing a large collection of socio-economic variables at the regional level, including trade, foreign direct investments (FDI), and migration. The construction of these variables relies on a joint effort to access administrative firm-level data in all three countries, allowing cross-country data processing in a harmonized framework.
Second, we seek to illustrate how regional account data may be collected and processed to reconstruct consistent and homogeneous regional-level series. This is particularly relevant for internationalization, as it constitutes an essential aspect of regional economies but is rarely captured by the national statistical bureaus' regional accounts.
In some countries, the national statistical bureaus publish regional accounts, which give an economic overview of the domestic regions. The methodology for processing regional accounts is essentially the same as for national accounts [14,15], but aspects of internationalization tend to receive limited attention. For instance, internationalization variables are typically omitted from regional accounts in Scandinavian countries and Finland. Moreover, the Baltic statistical bureaus do not offer detailed, immediately available regional accounts, although some data on migration are available. Yet, regional statistics can be reconstructed from various international sources, primarily Eurostat, OECD, and Global Data Hub. To generate internationalization data at the regional level, we start by collecting country-level data from these international data providers. We then combine them with firm-level data from the three Baltic countries' statistical bureaus. These firm-level datasets, which are not publicly available, contain information such as exports and imports as well as firms' geographical location. These micro-level data allow us to obtain aggregates at the regional level. However, they usually do not cover 100 percent of the firms operating in a given country, at least in the case of service trade, which motivates the triangulation with official national aggregate statistics.
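One simple way to combine the micro and macro sources described above is to aggregate firm-level values by region and then scale the regional totals so that they sum to the official national aggregate. The Python sketch below assumes such proportional scaling; the paper does not spell out the exact triangulation rule, and the firm records, region labels, and national total are hypothetical.

```python
from collections import defaultdict


def aggregate_by_region(firm_records):
    """Sum firm-level values (region, value) into regional totals."""
    totals = defaultdict(float)
    for region, value in firm_records:
        totals[region] += value
    return dict(totals)


def benchmark_to_national(regional_totals, national_total):
    """Scale regional totals proportionally to match the national figure."""
    micro_total = sum(regional_totals.values())
    factor = national_total / micro_total
    return {region: value * factor for region, value in regional_totals.items()}


# Hypothetical firm-level exports covering 80 of 100 million euros nationally.
firms = [("Region A", 40.0), ("Region A", 20.0), ("Region B", 20.0)]
regional = aggregate_by_region(firms)
benchmarked = benchmark_to_national(regional, national_total=100.0)
```

Proportional scaling implicitly assumes that the firms missing from the micro data are distributed across regions like the covered firms, which may not hold for every variable.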
A demonstration of how to prepare regional account variables, in particular internationalization variables, may be of interest to geographers and economists alike. This paper focuses on the Baltic states, which are an excellent case for this purpose. First, internationalization has been identified as an essential driver of the economic development of the Baltic states, which have flourished to become advanced economies since the collapse of the Soviet Union. Second, despite their relatively small size, large within-country convergence issues have been identified [16,17]. Third, the scientific literature has addressed several features of internationalization's role in the Baltic states' economic development, including migration, especially emigration and return migration (e.g., [18,19]), but more recently also inward migration from third countries [20], inward FDI patterns [18][19][20][21], outward FDI [22][23][24][25][26], and trade [27,28].
Some studies indicate that urban Baltic regions have experienced stronger economic development than rural regions [29][30][31]. Nevertheless, most studies do not address internationalization at the regional level, leaving the spatial heterogeneity of internationalization less well understood.
In addition to a detailed description of our approach, we perform a descriptive analysis providing new insights into internationalization in the Baltics. We document several statistical patterns visible when studying the regional accounts, encouraging other researchers to exploit our dataset for further investigation. Our descriptive analyses can be considered an initial investigation of the relationships between key conditions in Baltic society. Further investigations may be carried out in a wide range of directions, for instance, examining causal relations with conventional panel data econometrics and economic growth models or testing for spatial clustering and autocorrelation [32][33][34].

Summary Statistics
Our Baltic regional accounts include a wide range of variables. Summary statistics for some selected key variables are given in Table 6 below. In addition to internationalization, these capture the production economy, settlement patterns, and knowledge generation. Piecewise correlations between the selected key variables are listed in Table 7 (with some abbreviations compared to the table above). Evidently, there are relatively strong correlations between internationalization variables and many of the other variables, particularly in the case of trade and FDI. Yet, correlation patterns should not necessarily be interpreted as causal impacts from internationalization on the outcome variables in question, as correlation patterns may also be caused by confounders, reverse causality, or coincidence.
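For readers who want to reproduce such a correlation matrix, the sketch below computes a Pearson correlation using only the observations available for both variables. It assumes that the correlations in Table 7 are what is commonly called pairwise correlations, with missing values dropped pair by pair; the series are hypothetical.

```python
import math


def pairwise_correlation(x, y):
    """Pearson correlation using only observations present in both series."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


# Hypothetical regional series; the third region lacks export data.
exports_per_employed = [1.0, 2.0, None, 4.0]
gva_per_employed = [2.0, 4.0, 5.0, 8.0]
r = pairwise_correlation(exports_per_employed, gva_per_employed)
```

Because each pair of variables may rest on a different subset of regions and years, coefficients in such a matrix are not always based on the same sample.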

Internationalization
We now take a closer look at the internationalization variables in our dataset, including trade (i.e., exports and imports), FDI in equity stocks (both outward and inward), and international migration (both emigration and immigration).
In Figure 2a-f, we show the status of each internationalization variable in each Baltic NUTS 3 region in 2019, as well as the annual growth rates since 2007. In all single bar figures, Estonian, Latvian, and Lithuanian regions are represented by blue, red, and green bars, respectively. To make the figures comparable, we normalize them to per capita or per employed person terms.
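The normalization and growth calculations behind such figures can be sketched as follows. The sketch assumes that the annual growth rate is a compound rate between the first and the last year; the function names and the numbers are hypothetical.

```python
def per_capita(value, population):
    """Normalize a regional total by population (or by employed persons)."""
    return value / population


def annual_growth_rate(start_value, end_value, years):
    """Compound annual growth rate over `years` years (assumed definition)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical regional exports (million euros, fixed prices) and population.
exports_2007 = per_capita(1000.0, population=250000.0)
exports_2019 = per_capita(1840.0, population=230000.0)
growth = annual_growth_rate(exports_2007, exports_2019, years=2019 - 2007)
```

In this example, per capita exports double over twelve years, which corresponds to a compound annual growth rate of roughly 6 percent.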
Overall, there is a tendency for urban regions to be more internationalized than rural regions, in line with the literature on urban and international economics [35][36][37][38]. Furthermore, the rural Latvian regions (i.e., regions other than Riga and Pieriga, the region surrounding the Latvian capital city) seem to be less internationalized than Estonian and Lithuanian regions. Still, there is much heterogeneity associated with this picture.
In the case of exports, the disaggregation at the regional level uncovers a high level of heterogeneity across regions. Telšiai is the largest exporting region in the Baltics and can be considered an outlier, with relatively high levels per employed person compared to all other regions. This is mainly due to the oil refinery operating in Mažeikiai and its suppliers. Yet, other parts of the business sector contribute considerably as well, for instance, the seafood producer Vičiūnai Group, the dairy producer Žemaitijos Pienas, and a wide range of small and medium-sized firms within car parts, furniture, or metal processing.
Although exporting regions, in some cases, account for a considerable share of the value added content of exports, they often function as transit regions for international trade.
The Lithuanian port region Klaipėda holds the second highest export intensity, followed by the Estonian capital region, Põhja-Eesti, despite relatively few commodity industries in this region. Estonian regions are more homogeneous than those in the other two countries. In Latvia, Riga and Pieriga are above the average, but the three other regions are at the bottom of the ranking. In particular, Latgale is far below any other region. In terms of export growth over time, note that all regions progressed. Export growth in Riga is the slowest, in sharp contrast with Pieriga, the region with the largest growth. This is consistent with the rapid economic development of the municipalities in the vicinity of Riga. Regarding imports, Põhja-Eesti has the largest intensity, far above other Estonian regions. This concentration is even more pronounced in Latvia, where the vast majority of import activities come from the capital region and its neighborhood (i.e., Riga and Pieriga), with the four other Latvian regions lagging behind all the other Baltic regions. In contrast, Lithuanian regions show much more homogeneity.
For outward FDI, Põhja-Eesti is far above any other region. A very high level of outward FDI in this region is expected, given that Estonia has been one of the most active countries in terms of outward investments among the CEE countries [26]. However, the growth over time is much smaller than in other Estonian regions. In Latvia and Lithuania, capital regions are also the largest players in terms of outward FDI. Beyond these three regions, with perhaps the exception of Kaunas, outward FDI is scarce or even nonexistent in rural areas such as Latgale. Last, note that the annual growth rate is of the same magnitude across Lithuanian regions, contrary to the neighboring countries.
The situation is quite similar for inward FDI. In all three countries, aggregate inward FDI masks very large regional disparities, with capital regions concentrating the majority of investments. Again, Põhja-Eesti is the largest FDI recipient, but the change over time is rather small. The lowest levels are found in Latgale and the rural regions of Lithuania, except for the exporting region of Telšiai.
The last two panels of Figure 2 display international migration inflows and outflows, respectively. Regarding emigration, Põhja-Eesti is far ahead of the other Estonian regions. In Latvia, the situation is similar. Riga and Pieriga have seen large emigration per capita, much larger than in other Latvian regions. On the other hand, emigration in Lithuania appears more homogeneous throughout the territory. Regarding the annual growth rate, note that emigration in the Baltics almost doubled in the aftermath of the 2008 financial crisis [37].
Finally, the same spatial pattern is visible for immigration as well, being spatially balanced in Lithuania and concentrated in capital regions in both Estonia and Latvia. Comparing the countries, the immigration levels are highest in Lithuania and lowest in Latvia. Šiauliai is subject to the highest immigration, followed by Põhja-Eesti and Klaipėda.

Production Economy
As briefly discussed in our introduction, high economic performance is also often associated with a high degree of internationalization. We now present the essence of the Baltic regional economy and briefly address its internationalization linkages.
Status and activity levels in the Baltic regions in terms of gross value added and total employment are reported in Figure 3. It is worth noting that Põhja-Eesti (Northern Estonia), which includes the capital city of Tallinn and the surrounding Harju county, plays a more dominant role in Estonia than Riga does in Latvia and Vilnius does in Lithuania. This is partly due to how the regions are delimited; for instance, much of Riga's suburbia is encompassed by the surrounding region of Pieriga. Also, notice that Latvia has relatively large NUTS 3 regions compared to its neighbors in terms of population.
One may note that urban regions generally demonstrate higher growth rates than rural regions, linked to the concentration of economic activities in capital regions. An exception with regard to value added growth in fixed prices is Põhja-Eesti. As we have measured annual growth since 2008, this can be seen in relation to the global financial crisis, which hit the Estonian capital region relatively hard. In this region, employment displays a higher growth rate than gross value added in fixed prices. A key to explaining these negative annual growth rates may lie in the choice of the reference year, 2008, the year of the global financial crisis, which heavily impacted the Baltic states. In the aftermath of the crisis, there has also been some reshuffling in the allocation of industry activities between the capital regions and their neighboring regions.
To better understand the differences between the gross value added figures and employment figures, Figure 4 displays labor productivity, capital intensity, and industry composition figures. In addition, we report employment shares.
First, Latvia shows substantial internal disparities in labor productivity and capital intensity: Riga and Pieriga are among the top Baltic regions in these dimensions, but Latgale is the region with the lowest labor productivity and capital intensity. The Estonian regions also show very diverse outcomes, whereas Lithuanian regions are a bit more homogeneous. High labor productivity may not only be a symptom of high total factor productivity but may also be caused by high capital intensity. There is indeed such a positive correlation; regions with high capital intensity tend to have higher labor productivity. Furthermore, Riga and Telšiai have a negative annual growth rate of labor productivity, while the annual growth rate in Vilnius is only just positive. Yet, these regions have also had the lowest annual growth rates for fixed capital intensity. The negative value added growth rate for Riga can, in principle, be explained by the relocation of some of the high value added economic activities to Pieriga, where we can see positive growth rates.
Another potential explanation for this substantial variation in labor productivity and fixed capital intensity is differences in industrial and labor force structures. Figure 4c displays for each region the shares of gross value added per industry, while Figure 4d shows the share of unemployed and self-employed people.
Furthermore, we see that the frontier exporting region of Telšiai also holds the highest manufacturing share in the Baltics. In Estonia, the large share of the public sector in Põhja-Eesti (Northern Estonia) and Lõuna-Eesti (Southern Estonia) is to be expected: the former is the capital region, and the latter includes Tartu, the cultural and educational center of Estonia. On the other hand, Latgale, the Latvian region with the lowest degree of internationalization, also has a high share of employment in the public sector, highlighting the economic difficulties of this region.
Rural areas tend to have both relatively high unemployment and an industry structure in which industries with relatively high self-employment are common. Consequently, the figures tend to show that unemployment is relatively high in rural regions, where self-employed people represent a rather large portion of the labor force. Both the unemployment and self-employment rates are somewhat lower in Estonia than in Latvia and Lithuania.
Figure 5 depicts some selected correlation patterns between the production economy and internationalization. On the left part, the upper and lower graphs show the link between gross value added and export and import intensity, respectively. Regarding the latter, the most noticeable fact is the split in Estonia between Põhja-Eesti (visible in the top right corner) and the rest of the country. This split is much more salient than in the two other countries. Regarding exports per employed person, Lithuania stands out. Whereas the correlation is positive and linear for Latvia and Estonia, the correlation appears low in Lithuania.
In the two central graphs of Figure 5, we illustrate how fixed capital intensity correlates with trade. Here again, focusing on country-level data would hide stark within-country patterns. The correlation between capital intensity and trade is positive overall within each country but to different extents. The strong positive correlation between fixed capital and inward FDI stock highlights the role of FDI as an essential source of capital investments.
Finally, we also provide in the right panel of Figure 5a a description of the relationship between the unemployment rate and migration data, where within-country variations are visible. Although unemployment is usually considered a strong push factor of emigration, the correlation between the unemployment rate and international emigration is surprisingly weak in Latvia and Lithuania and even negative in Estonia. Nonetheless, these correlations do not imply causation. In the absence of emigration, one might think the unemployment rates would probably have been higher.

Settlement
We now move on to key features of settlement in the Baltic regions. In Figure 6, the upper panel displays the 2019 Human Development Index (HDI) for each region. The region with the highest HDI in each country is the capital region. Note that the relatively high education score in Latvia compared to the neighboring countries is due to the fact that the education system includes one extra year in Latvia.
Furthermore, the component with the largest within-country heterogeneity for all countries is income, with a very sharp contrast between capital regions and more rural regions, largely following the differences in population density. These huge income disparities are also clearly visible in the middle panel of Figure 6, representing gross income per capita. The three capital regions have nearly identical income per capita, but the poorest region in Latvia has a purchasing power parity income more than 20 percent smaller than the poorest region in Lithuania.
Next, we provide additional information on the population age structure, as one might think that the many regional differences observed above may be due to structural discrepancies. However, this does not seem to be the case, as no clear difference across regions is visible. Life expectancy is, however, much higher in Estonia than in the two other countries. Finally, the bottom panel of Figure 6 offers some insight into regional population dynamics. In all regions but Põhja-Eesti, the number of births is smaller than the number of deaths.
On the other hand, internal migration shows an interesting pattern: capital regions gain population in all three countries, mainly from the most rural areas. In Latvia, Riga itself has a negative internal migration balance, but the balance is mainly positive for the municipalities in the vicinity of the capital, captured by Pieriga. Domestic migration drives population change in Lithuania, while it has a more modest role in Estonia and the rural regions of Latvia.
As briefly discussed in the introduction, high productivity is also often associated with internationalization. Figure 7 shows that HDI is positively related to export and import volumes for all three Baltic states. Recalling that HDI includes a measure of how knowledgeable the workforce is, the positive correlation is in line with the recent finding that 'having a more educated workforce exerts a positive impact on the export intensity of firms in transition economies' [38]. The strong correlation between HDI and imports per employed person is due to HDI serving as a measure of human capital and imports per employed person being linked to per capita consumption levels.
In the central panel of Figure 7, the relationship between income per capita and both inward and outward FDI appears rather weak. Most notably, a group of Estonian data points stands apart from the others for both. Yet, it is surprisingly hard to see any pattern emerging.
Finally, regarding the correlation between international and domestic emigration per capita, Latvia shows a clear negative relationship: the larger the international emigration, the lower the domestic emigration. Such a pattern is not visible in Lithuania or Estonia.
Regarding immigration, the situation is very different. There is no clear relationship between domestic and international immigration per capita in Estonia or Latvia, and there is a clear positive correlation in Lithuania.

Knowledge Generation
The last group of variables in our dataset concerns knowledge generation, including educational profile, intellectual property rights, and latent broadband coverage for households. In Figure 8, we illustrate these variables for the Baltic regions. Note that some variables are reported for earlier years than 2019, as we do not have regional time series up to 2019 available for all variables.

Regarding the main patterns observed in the data, years of schooling are the highest in the capital regions and the largest cities. The same applies to the share of tertiary education. Concerning applications for intellectual property rights, a similar concentration in the capital regions can be noted.
In the case of Estonia, intellectual property rights are also concentrated in Lõuna-Eesti (Southern Estonia), which includes the country's educational center and second city, Tartu, and many of the high-tech companies. Moreover, Estonian regions demonstrate relatively high numbers. Across most Latvian and Lithuanian regions, similarly low levels can be seen. The latter indicates, among other things, that applications for intellectual property rights play a limited role in capturing the knowledge creation and diffusion activities in catching-up countries like the Baltic states.

Finally, concerning the last indicator, broadband coverage, all regions show more or less uniformly high numbers, ranging from nearly 75 percent to nearly 95 percent. The widest coverage is found in the Estonian regions, together with the urban regions of Kaunas, Riga, and Vilnius, as well as Zemgale. Moreover, the Latvian regions tend to have wider coverage than the Lithuanian regions. There is more cross-region variation in the growth rates. Moreover, convergence can be seen in many regions, as regions with lower levels tend to demonstrate higher growth rates.
In Figure 9, we consider how the internationalization variables correlate with the share of higher education among the adult population of working age, intellectual property rights, and years of schooling. The higher education shares are strongly associated with exports and imports (though some Lithuanian observations are not in line with that); such a correlation cannot be seen when looking at trade and average years of schooling, which points to the particular importance of tertiary education for internationalization. For intellectual property rights, we designed a composite index from 2007 to 2015, where applications for patents by fractional count by workplace, community designs, and European Union trademarks are weighted equally.
For each subindex, all observations are measured relative to the observation with the highest value. Despite the reservations mentioned above regarding the particular measures of knowledge creation, the strong positive correlation between the composite index for intellectual property rights and the inward FDI stock indicates the index's importance for capturing the endowments that attract FDI. To a lesser extent, some correlation can also be seen in the case of outward FDI. Still, the much higher outward investments by Estonia (especially Põhja-Eesti) somewhat blur the picture. Naturally, the picture is affected by how the foreign-owned and domestic companies compare in terms of innovation and knowledge-creation activities (e.g., [23] for Estonia).
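The construction of the composite index for intellectual property rights can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the region labels, and the input figures are hypothetical and not taken from the dataset, and the scaling is applied within a single cross-section, since the text does not specify whether the maximum is taken per year or over the whole 2007 to 2015 period.

```python
from typing import Dict


def max_scale(values: Dict[str, float]) -> Dict[str, float]:
    """Measure each observation relative to the observation with the highest value."""
    top = max(values.values())
    if top <= 0:
        raise ValueError("the maximum must be positive to scale by it")
    return {region: value / top for region, value in values.items()}


def composite_ipr_index(
    patents: Dict[str, float],
    designs: Dict[str, float],
    trademarks: Dict[str, float],
) -> Dict[str, float]:
    """Equal-weight average of the three max-scaled subindexes."""
    subindexes = [max_scale(patents), max_scale(designs), max_scale(trademarks)]
    return {
        region: sum(sub[region] for sub in subindexes) / len(subindexes)
        for region in patents
    }


# Hypothetical application counts for three illustrative regions (not actual data).
patents = {"R1": 40.0, "R2": 10.0, "R3": 0.0}
designs = {"R1": 20.0, "R2": 20.0, "R3": 5.0}
trademarks = {"R1": 100.0, "R2": 50.0, "R3": 25.0}

index = composite_ipr_index(patents, designs, trademarks)
```

In this toy example, the region that leads on all three subindexes obtains an index value of one, and every other region is placed on a scale between zero and one relative to the respective leaders.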

Appendix A.2 Data Cleaning
Beyond missing variables, missing or inaccurate observations often constitute a problem. Possible ways of handling common issues of this sort are reviewed in the following. For all the presented issues, the missing observations should also be adjusted such that the sum of the regional variable estimates equals the national average or the average of the regional aggregate with available data. To simplify our presentation, we will ignore this in our treatment here. Moreover, the presented formulas will only hold for the set of observations that are unreliable or missing.
In some cases, the regional data include a residual that is not ascribed to any region. A possible way to handle this is to distribute the residual across the regions. Let $x_{r,t}$ be an observation of variable $X_{r,t}$, while $\tilde{x}_{r,t}$ is the corresponding variable in the data source for region $r$ at time $t$. The residual that is not ascribed to any region at time $t$ is denoted $\tilde{x}_{0,t}$. We denote by $z_{r,t}$ the variable that proxies the distribution of $x_{r,t}$ at time $t$ in region $r$, with a national total of $z_t$. Alternatively, $z_t$ could represent the total for an aggregate of regions for which observations are available. Formally, the observations of the variable $X_{r,t}$, which are ascribed a share of the residual, become:

$$x_{r,t} = \tilde{x}_{r,t} + \tilde{x}_{0,t}\,\frac{z_{r,t}}{z_t} \tag{A3}$$

Note that for some variables, it could be reasonable to adjust the covariates that are used to approximate the regional distribution, for instance, by taking differences in population growth or developments in industry structure into account.
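The residual allocation in (A3) can be illustrated with a short sketch. The function name, the region labels, and all figures below are hypothetical and serve only to show the arithmetic; the proxy stands in for whichever covariate $z_{r,t}$ is deemed appropriate for the variable at hand.

```python
from typing import Dict


def distribute_residual(
    reported: Dict[str, float],
    residual: float,
    proxy: Dict[str, float],
) -> Dict[str, float]:
    """Add to each region a share of the unassigned residual, proportional to a proxy."""
    proxy_total = sum(proxy[region] for region in reported)  # corresponds to z_t
    if proxy_total <= 0:
        raise ValueError("the proxy total must be positive")
    return {
        region: reported[region] + residual * proxy[region] / proxy_total
        for region in reported
    }


# Hypothetical figures for one year: reported regional values, an unassigned
# residual, and a proxy variable (for example, employment) per region.
reported = {"R1": 60.0, "R2": 25.0, "R3": 10.0}
residual = 5.0
proxy = {"R1": 6.0, "R2": 3.0, "R3": 1.0}

adjusted = distribute_residual(reported, residual, proxy)
```

By construction, the adjusted regional values sum to the reported regional values plus the residual, so the national total is preserved.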
For other variables, there may be a break in the time series, such as the variable seemingly jumping from one year to the next due to different measurement approaches. In such cases, a possible way to handle the time break is to approximate the relative or absolute difference between the two data series. Let $t^*$ denote the time of the time break. We then have:

$$x_{r,t} = \tilde{x}_{r,t} - \tilde{x}_{0,t}\,\frac{z_{r,t^*}}{z_{t^*}} \tag{A4}$$

In other cases, the time break may be best handled by adjusting the scaling of the observations in the part of the time series where the observations suffer from noise from the time break. In such instances, a rescaling formula may be applied.
In some instances, regional observations in the middle of the data series are obviously unreliable or even missing. Such observations may be interpolated. Let $t_L$ and $t_H$ represent the times of the closest reliable earlier and later observations. In cases where the variable is assumed to follow a steady growth path and no accurate covariates to proxy the development are available, a geometric formula may be utilized:

$$x_{r,t} = x_{r,t_L}\left(\frac{x_{r,t_H}}{x_{r,t_L}}\right)^{\frac{t - t_L}{t_H - t_L}} \tag{A6}$$

The outcome variable may also be used as a proxy for $x_{r,t}$ in further estimations. Another approach would be to adjust $x_{r,t_H}$ and $x_{r,t_L}$ for developments in covariates, for which an arithmetic formulation is possible.
A minor share of the gross value added in Latvian public services is not distributed across regions. We distribute this gross value added according to the already reported regional shares for value added.
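The geometric interpolation between the closest reliable earlier and later observations can be sketched as follows. This is a minimal illustration that assumes the interpolated path passes through both reliable observations and grows at a constant rate in between; the function name and the example values are hypothetical.

```python
def geometric_interpolation(
    x_low: float, x_high: float, t_low: int, t_high: int, t: int
) -> float:
    """Interpolate x at time t, assuming constant growth between t_low and t_high."""
    if not t_low < t_high:
        raise ValueError("t_low must be earlier than t_high")
    if x_low <= 0 or x_high <= 0:
        raise ValueError("geometric interpolation requires positive observations")
    share = (t - t_low) / (t_high - t_low)  # elapsed share of the gap
    return x_low * (x_high / x_low) ** share


# Hypothetical series: reliable observations in 2010 and 2012, missing in 2011.
filled = {
    year: geometric_interpolation(100.0, 121.0, 2010, 2012, year)
    for year in (2010, 2011, 2012)
}
```

With these values, the implied annual growth factor is the square root of 1.21, so the interpolated 2011 figure lies at about 110, while the endpoints are reproduced unchanged.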

Estonia 2004
The same regional distribution as in 2005 is assumed, adjusted for regional population growth.

2018-2019
The same regional distribution as in 2017 is assumed, adjusted for regional population growth.

Central and Western Lithuania

2004-2013
The same regional distribution as in 2014 is assumed, adjusted for regional population growth.

Central and Western Lithuania 2016
The same regional distribution as in 2015 is assumed, adjusted for regional population growth in the age group from 18 to 24 years old.

Rate of early leavers from education or training
Latvia

2002-2005
The same regional distribution as in 2006 is assumed, adjusted for regional population growth in the age group from 18 to 24 years old.

Latgale and Vidzeme 2014
Only the joint rate is available for these regions in 2014. We interpolate the ratio between the rates linearly based on the observations in 2013 and 2015, adjusted for regional population growth in the age group from 18 to 24 years old.

Latvia 2000-2001
Assumes the same regional distribution in education shares for all regions, adjusted for development at the national level.

Domestic migration
Latvia 2000 to 2010
The regional ratios between domestic emigration and domestic immigration for these years are assumed to be the same as in 2011.
When transportation mortality is only reported at the national level, we assume that the regional change is the same as at the national level.

Intentional homicide
Estonia 2005 The same regional distribution as in 2006 is assumed.
Latvia 2011 The regional distribution is interpolated based on 2010 and 2012.

Panevėžys 2005-2017
Assumed to be of the same magnitude per inhabitant as in the rest of the country.

Active physicians
Panevėžys 2017
Assumed to have the same growth in active physicians per inhabitant as the rest of the country.

Hospital beds
Central and Western Lithuania

2005-2006
All subregions are assumed to have the same growth in hospital beds as Central and Western Lithuania.

Employment
Employment, employed and self-employed, trade, transportation and tourism, and information and communication Lääne-Eesti, Kesk-Eesti and Kirde-Eesti

2000-2018
For these regions and industries, Eurostat reports combined employment per either industry or region. To obtain regional industry figures, we assume proportional splits.
Latvia

2000-2006
For these industries, Eurostat reports combined gross value added per either industry or region. As we have regional variables per industry in 2005, we extrapolate the figures backward by assuming proportional regional and industry-specific growth rates.
Employment, employed and self-employed, finance, real estate, and business services
Employment, employed and self-employed, public sector and personal services

2000-2006
Employment, employed and self-employed, public sector 2007-2019
A minor share of the employment in Latvian public services is not distributed across regions. We distribute this employment according to the already reported regional employment shares.
Employment, 15 to 24 and 25 to 64 years old Panevėžys 1998-1999
The same labor force development as in the rest of the country is assumed, adjusted for regional population growth in the respective age group.

Figure 1. Map of the Baltic NUTS 3 regions.
Life expectancy at birth for newborns: obtained from Global Data Hub's Human Development Database.
Infant mortality, measured as deaths in the first year of life: obtained from OECD's Regional Database.
Transportation-related mortality, measured as deaths in traffic.
Intentional homicides in terms of unlawful homicides purposely inflicted.
The population's well-being: Human development index, divided into subindexes for education, health, and income (Global Data Hub's Human Development Database).
Years of schooling in mean for completed education and in expectation upon entering education.

Figure 2. From top to bottom: (a) exports per capita, (b) imports per capita, (c) outward FDI per capita, (d) inward FDI per capita, (e) international emigration per employment, and (f) international immigration per employment across Baltic NUTS 3 regions in 2019 with annual growth from 2008. Color codes in the single bar charts: Blue is Estonia, red is Latvia, and green is Lithuania. The purple triangle indicates annual growth.

Figure 3. From top to bottom: (a) gross value added and (b) total employment across Baltic NUTS 3 regions in 2019 with annual growth from 2008 (in fixed prices in the case of gross value added). Color codes in the single bar charts: Blue is Estonia, red is Latvia, and green is Lithuania. The purple triangle indicates annual growth.
employed people, Lithuania stands out. Whereas the correlation is positive and linear for Latvia and Estonia, the correlation appears low in Lithuania.
Data 2023, 8, x FOR PEER REVIEW 20

Figure 4. From top to bottom: (a) labor productivity and (b) capital intensity in 2019 with annual growth rates for fixed-price figures since 2008, (c) industry composition in terms of gross value added shares, and (d) unemployment rate of the labor force and self-employment rate of employed people in 2019 across the Baltic NUTS 3 regions. Color codes in the single bar charts: Blue is Estonia, red is Latvia, and green is Lithuania. The purple triangle indicates annual growth.

Figure 5. Correlation plots across Baltic NUTS 3 regions: between labor productivity and (a) exports per employed person (top left) and (b) imports per employed person (bottom left); between capital intensity and (c) outward FDI (top middle) and (d) inward FDI (bottom middle); and between the unemployment rate and (e) international emigration per capita (top right) and (f) international immigration per capita (bottom right). Color codes in the plot diagram: Blue is Estonia, red is Latvia, and green is Lithuania.

Figure 6. From top to bottom: (a) Human Development Indexes, (b) gross national income and land area per capita, (c) population composition and life expectancy, and (d) domestic population changes across Baltic NUTS 3 regions in 2019.

Figure 7. Correlation plots across Baltic NUTS 3 regions: between the Human Development Index and (a) exports per employed person (top left) and (b) imports per employed person (bottom left); between gross national income in current PPP prices and (c) outward FDI (top middle) and (d) inward FDI (bottom middle); and between international and domestic (e) emigration per capita (top right) and (f) immigration per capita (bottom right). Color codes in the plot diagram: Blue is Estonia, red is Latvia, and green is Lithuania.

Figure 8. From top to bottom: (a) years of schooling in mean and expectation in 2019, (b) education profile for the adult population of working age in 2019, (c) intellectual property rights in 2019, and (d) broadband coverage in 2019 with annual growth rates from 2008, across Baltic NUTS 3 regions. Color codes in the single bar charts: Blue is Estonia, red is Latvia, and green is Lithuania. The purple triangle indicates annual growth.

Figure 9. Correlation plots across Baltic NUTS 3 regions: between the share of the adult population of working age with higher education and (a) exports per employed person (top left) and (b) imports per employed person (bottom left); between the composite index for intellectual property rights and (c) outward FDI (top middle) and (d) inward FDI (bottom middle); and between average years of schooling and (e) international emigration per capita (top right) and (f) international immigration per capita (bottom right). Color codes in the plot diagram: Blue is Estonia, red is Latvia, and green is Lithuania.

Table 3. Data collection and processing of global migration, international trade, and foreign direct investment figures over countries.

Table 4. Data collection and processing of fixed capital stock.

Table 6. Summary statistics for selected key variables in the Baltic regional accounts.

Table 7. Pairwise correlation matrix for selected key variables in the Baltic regional accounts. All economic key figures are measured in fixed prices.
t = (t H − t) (t H − t L )

Table A1. Data gaps at the regional level for economic, education, and technology figures and corresponding special adaptations.

Table A1. Cont.
Populations per age group within the age groups from 70 and above are only reported at the national level, so we assume that the regional population growth in relative terms for this aggregate age group was the same as at the national level.
When infant mortality is only reported at the national level, we assume that the regional change is the same as at the national level.