Reliability and Integrity of Forest Sector Statistics—A Major Constraint to Effective Forest Policy in Russia

Abstract: Russia owns one-fifth of the world's forest-covered area but has never been the leader of the global forest sector, neither in gross output nor in relative productivity. The issues of the Russian forest sector have attracted research attention, but many topics remain largely unexplored in sectoral studies. We developed a novel approach to understanding the primary causes of the inefficiency of Russian forest policy through a qualitative assessment of the completeness and reliability of forest sector-related data. The main output of this paper is a thorough overview of the available data sources with an assessment of their quality, completeness and reliability. We show that the Russian official forest sector statistics provide only basic indicators for very short periods, and that the few available observations are often incomplete and inconsistent. Besides a critical analysis of the official statistics, we also highlight some known, but still underemployed, sources of information on the Russian forest sector: textual information from official public bodies and companies, accounting records, remote-sensing data, etc. Finally, we discuss possible ways to improve the data supply of the forest sector in Russia to support future decision-making. We are convinced that a prerequisite for the implementation of effective forest policy in Russia is a significant expansion and improvement of the volume and quality of statistics on the dynamics of Russian forests and the forest economy. Integration of existing and new data sources is necessary to achieve synergistic effects, both in deepening the understanding of key business processes in the industry and in solving the strategic tasks of its development.


Introduction
Digitalization of the modern economy requires new approaches to dealing with information, even in traditional sectors, such as forestry and forest industry [1]. The amount and quality of available information on every small detail of sector development become a competitiveness factor if there is a goal to establish and follow sustainable practices.
Nonetheless, this problem is not a primary focus of either academia or industry. The lack of forest sector statistics seems to be an important problem, not only for understanding specific country-level processes but also for international comparisons and policy implications. For instance, the widely used data sets from FAOSTAT, the database of the UN Food and Agriculture Organization, were shown to contain systematic errors in records on production, imports and exports of some forest products in many countries [1,2]. Some observations for basic indicators are missing, even for developed countries such as Canada and Japan [3]. The lack of robust data on different ecological and economic aspects of forest dynamics is of crucial importance for policymaking that aims to achieve the goals of sustainable development, as defined by the United Nations [4]. The interlinkages between data governance and forest governance also need further detailed investigation [5].
The quality and diversity of Russian official statistical information are also matters of concern for scholars due to a set of widely known issues that have been described qualitatively [6][7][8]. It is also worth mentioning that, due to a few similar institutional features, the Russian statistical system and its shortcomings resemble those of China [9][10][11][12], but they are still not covered by quantitative macro-analysis or by specific sector-level studies.
Russia owns one-fifth of the world's forest-covered area but has never been the leader of the global forest sector, neither in gross output nor in relative productivity. According to recent data from FAO, Russia accounted for only 6% of wood removals in 2018 (218.4 mln cub. m), behind the USA (11%), India (9%), China (9%) and Brazil (7%) [13]. According to official data, most Russian forests are managed (88.8% in 2018) (NIR 2019); however, Russia uses only 30% of its allowable cut, leaving a large amount of forest resources unused [14][15][16].
The reasons for the current state of the Russian forest sector are discussed in the literature [17]. First of all, the Russian government is poorly involved in sector regulation [18][19][20][21]. In recent decades, sector legislation has changed several times, but the changes never brought positive institutional development or the implementation of sustainable forest management practices. Most Russian commercial forests are governed under a predatory regime in which clear cuts are not balanced by appropriate reforestation and afforestation activities [18,19].
The Russian forest sector is unprofitable, even for the government. In 2016, the Russian federal budget spent RUB 59.5 B on forest management while raising only RUB 29.7 B in stumpage fees, a net loss of about 50% of expenditure [14]. The dynamics of this ratio show that the tendency has held for a very long period, which indicates that the situation is acceptable for policymakers.
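As a quick check of the 50% figure, the net loss can be expressed as a share of expenditure using only the two budget numbers quoted above (the choice of expenditure as the denominator is our reading of the text):

```latex
\[
\text{net loss share} = \frac{59.5 - 29.7}{59.5} = \frac{29.8}{59.5} \approx 0.50
\]
```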
Most researchers state that the main institutional source of the problems described above is the pronounced path-dependence derived from the resource-intensive and environmentally indifferent forest policy of the Soviet period [22]. These adverse initial conditions were aggravated by the fast centralization of forest management in the 2000s, a reform that has produced more negative than positive consequences [17,23].
Beyond the in-country forest sector agenda, there is also a growing interest in the potential impact of climate change on forestry [24][25][26][27]. This interest can only be satisfied if the necessary amount of data is available for large-scale calculations based on state-of-the-art models.
An important reason for the forest sector crisis in Russia is the deterioration of the forest science system after the collapse of the Soviet Union in the 1990s, when a few strong academic and industry research institutes ceased to exist or dramatically lost their potential. The main consequence is the lack of data and research results that are needed to understand the dynamics and future of forest economics in Russia. Many topics that could provide important knowledge for policymaking are still insufficiently covered by research. For example, although there are very high-quality empirical studies on the typology of the principal stakeholders in many European post-Soviet countries, the corresponding works on Russia are completely missing.
In our study, we introduce a novel approach to understand the in-depth nature and genesis of factors limiting the development of the Russian forest sector. The key assumption of our study is that one of the main reasons for the crisis in the Russian forest sector is the lack of adequate and high-quality statistics and studies on different aspects of its development.
This paper aims to give a systematic overview of the key data sources on the Russian forest sector and describe the openness and completeness of statistics on different topics that are important for policymaking aimed at sustainable forest management goals.

Materials and Methods
We suggest the following classification of the main sources of open data on forest management and forest economics in Russia, each of which is examined in its own section below: official public statistics, data on international trade, textual information, accounting records and remote sensing data.
It should be stressed that only digitalized data are in the focus of our research. This assumption restricts the period of observations to the years after the collapse of the Soviet Union because, for reasons that are not obvious, the "old" statistics (i.e., the statistics of the Soviet period, between 1922 and 1991) are absent from the open official publications. If needed, such information could be retrieved from the statistical books and reports of that time (e.g., [31]).
In 2011, Rosstat, the official public statistics body of the Russian Federation, launched the Unified Interagency Information and Statistical System (EMISS) project [32], which integrates the statistics provided by all the federal public authorities. The purpose of this project is the gradual integration of all official state statistics on a single platform that provides a unified interface for data access. We employ EMISS as the main interface to Rosstat data, as it contains more up-to-date statistics than the conventional Rosstat data books. Using EMISS is similar to using any modern open database with a user-friendly web interface and advanced capabilities for full-text and structured search, a spreadsheet-like view and multi-format output of the final data sample.
A usual approach to working with these data is to search for the needed statistical indicators by common keywords (such as forest or wood) or to filter the records by the corresponding public body (e.g., the Federal Forestry Agency, Rosleskhoz). For our research, we employed another method, making an indicator-by-indicator selection from the whole database (Section 3.1). At the first stage, we select all the indicators that are connected with the forest sector by applying the following routines:

1. Omitting indicators that contain fewer than three observation periods. Most of them were included in official statistical observation only to accomplish some tactical task, not to establish a new indicator that could produce a reliable time series.
2. Aggregating indicators with the same purpose, as there are multiple cases of slight changes in indicator titles over the course of observation.
3. Making minor groupings of related indicators, for the sake of simplicity and ease of presentation.
4. Omitting indicators that are not related to the main forest-related activities (such as Turnover of public food, i.e., the total turnover of canteen services for employees of forest industries).
5. Omitting indicators that can be calculated from others present in the dataset; e.g., if data on the total forested area are available and the total land area is also known, the indicator for relative forest cover is redundant and needs to be omitted.
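The first-stage routines can be illustrated with a minimal Python sketch. It is not the procedure actually used in this study, which was performed indicator by indicator; the record layout, the function names `normalize_title` and `select_indicators`, and the sample records are all hypothetical. Routine 3 (minor groupings) is left out because it relies on expert judgement, and the routines are applied in the order in which they are listed, which is an assumption.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lower-case a title and collapse punctuation/whitespace so renamed variants match."""
    return re.sub(r"\W+", " ", title.lower()).strip()


def select_indicators(records, irrelevant=(), derivable=(), min_periods=3):
    """Apply routines 1, 2, 4 and 5 to (title, years) pairs.

    Returns a dict mapping each normalized title to its sorted observation years.
    """
    skip = {normalize_title(t) for t in (*irrelevant, *derivable)}
    merged = defaultdict(set)
    for title, years in records:
        if len(set(years)) < min_periods:   # routine 1: too few observation periods
            continue
        key = normalize_title(title)
        if key in skip:                     # routines 4 and 5: unrelated or derivable
            continue
        merged[key].update(years)           # routine 2: merge slightly renamed titles
    return {key: sorted(years) for key, years in merged.items()}


# Made-up records for illustration only; these are not real EMISS entries.
records = [
    ("Forested area", [2009, 2010, 2011, 2012]),
    ("Forested area.", [2013, 2014, 2015]),
    ("Economic losses from pest outbreaks", [2018, 2019]),
    ("Turnover of public food", [2010, 2011, 2012]),
    ("Share of forested land", [2009, 2010, 2011]),
]
selected = select_indicators(
    records,
    irrelevant=["Turnover of public food"],
    derivable=["Share of forested land"],
)
print(selected)
```

In this toy input, the two spellings of the same indicator are merged into one series, while the two-year indicator, the unrelated indicator and the derivable indicator are dropped. Real titles change in less regular ways, so automatic matching of this kind would still need manual review.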
At the second stage, the indicators are clustered by the following topics: Companies: demography, Companies: business, Labor, Products and Prices, Lands and Growing Stock, Reforestation, Disturbances, Forest Protection, Forest Management.
The indicators may be observed at different levels: federal (or national), federal districts and regions. We do not consider the data for federal districts, for the following reasons: (a) these bodies are mostly political and have almost no involvement in economic activity, (b) the composition of these bodies is unstable and has changed several times during previous decades, (c) it is not evident how the implications of an analysis of federal district data could be applied to real problems, as these bodies do not have any decision-making authority.
The other data sources were studied and analyzed using classical descriptive methods (Sections 3.2-3.5).

Official Public Statistics
The EMISS database contains 7010 different indicators. The in-depth analysis was aimed at indicator selection using the approach described in the previous section, which led to 1044 indicators after the first stage. The table with raw data contains 18,477 filled cells with the following columns: title, units of measure, full description, observation period and section, link, and the official service responsible for the indicator.
After the second stage, only a few dozen aggregated indicators remain. For convenience, we split the list of reviewed statistical indicators into two separate tables: one for economic data (Table 1) and the second for the indicators on forest management (Table 2). It is evident that the Russian official forest sector statistics provide only basic indicators, and time series are still very short. For most indicators, only 10-15 years of observations are available, which is usually a very short period for time series modelling.
In recent years, monitoring of a new set of indicators has started, but it is not obvious whether they will be observed in the mid- or long-run perspective. For example, there are detailed data on different topics in forest health protection (such as the estimates of economic losses due to forest pest outbreaks), but only the values for the last few years are available, so it is not possible to use them as a calculation-ready time series. Some indicators are "orphaned", i.e., there are only one or two observations in previous years, and it seems that the indicator is not maintained anymore.
A random sample check and our accumulated previous experience show that a significant share of indicators also suffers from inconsistent and incomplete data. A complete check and a quantitative assessment of the share of missing observations could become a subject of further research, as this task is quite large (e.g., the full stack of data for only one indicator may exceed several thousand observations). Observations for some years are missing. In other cases, there is obvious incoherence, e.g., the indicator may double for only one year while staying almost stable during the other periods. In both cases, no explanations are provided, so it is reasonable to question at least part of the observations in such indicators.
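The two problems described above, missing years and a one-year doubling, could be flagged automatically. The following Python sketch is only an illustration under stated assumptions: a series is a dict of positive annual values, a "spike" is a value at least `factor` times both adjacent observations (2.0 matches the doubling mentioned in the text), and the function names and sample values are hypothetical.

```python
def missing_years(series):
    """Return the years absent between the first and last observation of {year: value}."""
    if not series:
        return []
    years = sorted(series)
    return [y for y in range(years[0], years[-1] + 1) if y not in series]


def one_year_spikes(series, factor=2.0):
    """Return years whose value is at least `factor` times both adjacent observations."""
    years = sorted(series)
    spikes = []
    for prev, cur, nxt in zip(years, years[1:], years[2:]):
        if series[cur] >= factor * series[prev] and series[cur] >= factor * series[nxt]:
            spikes.append(cur)
    return spikes


# Made-up values for illustration only.
series = {2010: 100, 2011: 102, 2012: 210, 2013: 101, 2015: 99}
gaps = missing_years(series)      # years with no observation
spikes = one_year_spikes(series)  # isolated one-year jumps
print(gaps, spikes)
```

A flagged year is not necessarily an error; it only marks an observation that deserves a manual look, since the official source provides no explanation either way.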
It is worth emphasizing that we did not find the direct use of these new indicators in the most important current publicly available policymaking documents of the forest authorities.
Despite the mentioned shortcomings, it is fair to emphasize that the statistics revealed during this study cover a few topics that are not usually discussed in the academic literature. Thus, there is potential to substantially extend the research agenda to these topics. First, this is true for the study of company-scale dynamics of the forest sector. The worldwide interest in this topic is growing [33][34][35], but it has not yet been touched upon in Russian studies. Second, the problem of forest pest management is also of great practical importance, especially for Siberian forests, but is only moderately covered in the literature.

Data on International Trade
Data on international trade of forest products are available through two major sources: FAOSTAT resource [36] and database of the Federal Customs Service of Russia (FTS) [37].
FAOSTAT is a commonly used, freely available international database, which provides, inter alia, aggregated macro-level data on trade flows of nine types of forest products between 245 countries since 1997. The primary source of FAO data is always the authorized national services, so the FAOSTAT data can usually also be retrieved from a local, native-language resource. However, the advantage of accessing the data through FAOSTAT is the user-friendly interface and the possibility of tracking all the necessary trade flows in one place, which is crucial for cross-country comparisons.
The high level of aggregation of these data makes them almost useless for specific research issues focused on a single country. In such cases, the data from local sources, such as FTS, are much more appropriate. The FTS database includes a large set of monthly export and import statistics, recorded in both physical and monetary terms. In addition to national data, there are also statistics on the foreign trade of Russian regions. All data are classified with the Harmonized Commodity Description and Coding System and are available for up to ten-digit codes. In comparison, the UN Comtrade database [38] is limited to six-digit codes. Using disaggregated statistics can help to examine the trade patterns more thoroughly with a focus on specific commodities. For example, eight-digit codes are necessary to separate wooden products such as furniture from those made of other materials such as metal or plastic. This approach was developed and applied to assess the competitiveness level of Russian forest industry products [15].
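The effect of code length can be shown with a small Python sketch that sums trade values by a prefix of the commodity code. It assumes that codes are stored as digit strings, so that a shorter code is a prefix of the longer ones nested under it; the function name `aggregate_by_prefix`, the ten-digit codes and the values are placeholders rather than real customs records.

```python
from collections import defaultdict


def aggregate_by_prefix(rows, digits):
    """Sum values of (code, value) pairs by the first `digits` characters of the code."""
    totals = defaultdict(float)
    for code, value in rows:
        totals[code[:digits]] += value
    return dict(totals)


# Placeholder ten-digit codes and values, for illustration only.
rows = [
    ("4407111001", 120.0),
    ("4407111009", 80.0),
    ("4407120000", 50.0),
]
by_ten = aggregate_by_prefix(rows, 10)   # full detail, as in the FTS database
by_six = aggregate_by_prefix(rows, 6)    # level available in UN Comtrade
print(by_ten)
print(by_six)
```

In this toy example, the first two lines remain distinct at ten digits but collapse into a single total at six digits, which is the kind of detail that is lost when only the more aggregated source is used.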

Textual Information
An important, but very underestimated, source of data on forest economics is the textual information that may be distilled from the official press releases of both public bodies and companies, as well as industry magazines, social media and public forums.
Since the 1990s, industry magazines have become a very influential communication platform for industry insiders, especially on the business side. The most important title, Lesprominform, publishes full issues in PDF format after a short embargo period (2-3 months). The website of the magazine (which is also present on the Facebook, VK and Instagram social media) is a major source of Russian forest sector news and updates. In addition to interviews and companies' press releases, it also publishes some of the latest statistics on different aspects of forest sector activity.
There is no evidence of any comprehensive analysis of the sources listed above, but this idea has good potential for retrieving new and independent data on the real situation inside the industry.

Accounting Records
In Russia, the Federal Tax Service (FNS) provides only general information on registered companies, but not their accounting records. The accounting records may be accessed through different web services for a reasonable price (Table 3).
Table 3 notes. Source: developed by authors. * As the official price list is not disclosed, the price estimate from an unreported source is used.
The main advantage of this data source is that it opens new directions of data analysis of forest sector companies, as almost all the services provide different extra information datasets in addition to the usual accounting records. Most of them summarize the list of arbitration cases, government contracts, media references and other information that could be useful for further studies.
All these systems are primarily developed for counterparty screening, so there are some limitations for scientific use. The most important limitation is that API access is too expensive (e.g., for the cheapest service, SBIS, it costs more than RUB 1 M ≈ USD 12,600 per 100,000 companies). As export of data into a spreadsheet format is not provided either, the only feasible way to sample companies is to manually save the necessary information from each company account. This is a major obstacle to building large samples (N > 100).

Remote Sensing Data
The openness of global satellite image data allows one to cross-check the official government data. As this is a very resource-intensive task, the corresponding studies are still sparse, although their number tends to grow [39][40][41].
Despite the growing number of remote-sensing-driven ecological forest studies in Russia [42][43][44][45][46], the economic aspects are hardly covered within this topic.
Nevertheless, remote sensing data can provide an invaluable source of primary observations for economic research. Such an interlinkage between the methods widely used in forest management and the aims of economic studies is needed to understand the real situation with forest use. Special attention should be paid to shadow logging, as this problem is of great importance in many remote logging areas.
A valuable source of preprocessed satellite data on tree cover dynamics since 2001 is the Global Forest Watch project [47]. Using these data will require time-consuming work, as they are not available in raw formats. Collecting and quantifying these data and linking them to regional- and firm-scale data may open strong prospects for research with policymaking outcomes.

Conclusions
We develop a novel approach to understanding the primary causes of inefficiency of the Russian forest policy through the qualitative assessment of completeness and reliability of forest sector-related data.
Our analysis showed that the Russian official forest sector statistics provide only basic indicators for very short time periods. Many indicators contain inconsistent and incomplete data sets. However, even the existing data are not fully employed, neither for academic research nor for policymaking. An important finding is that the indicators newly registered by the official statistics in recent years are not used in real policymaking workflows. The statistics of the Soviet period (pre-1991) are not digitalized but could be accessed in statistical books and reports of the industry research institutes that existed at that time. These data are much poorer in terms of the variety of indicators but seem to be more reliable.
A new data set could be compiled using the classification of official statistical indicators that we suggested in this paper. It aggregates all the data that one could extract from the official statistics. Most of these data have most likely never been used to analyze the practical issues of the forest sector in Russia, so it becomes possible to acquire new knowledge on different aspects of sector development.
A very important source of forest sector data is international trade statistics. The Russian Federal Customs Service (FTS) provides a large set of data that could be used to track trade flows of forest products at a high level of detail (up to ten-digit codes according to the Harmonized Commodity Description and Coding System).
Textual information distilled from official press releases and other open web sources (industry magazines, social media, public forums) is a data source with underestimated potential. As the quantity of such information is growing, the need to employ it for research will become more and more relevant.
Accounting records are widely used in other economic research but have never been employed to study the economic problems of the Russian forest sector. These data are not available for free, but the price is feasible.
Remote sensing data should also be widely employed for tracking the natural and anthropogenic forest dynamics. There are numerous sources of freely available data, but their processing requires time-intensive work.
We argue that a prerequisite for the implementation of effective forest policy in Russia is a significant expansion and improvement of the volume and quality of statistics on the dynamics of Russian forests and the forest economy. It seems to be quite a difficult task, which should involve the representatives of both academic and business communities. The practical use of the data sources analyzed above can contribute to the first steps towards this goal.
Funding: [...] 10.2020 entitled "Socioeconomic development of Asian Russia based on the synergy of transport accessibility, system knowledge about natural resource potential, expanding space of inter-regional interactions".
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Publicly available datasets were analyzed in this study. This data can be found here: https://www.fedstat.ru/.

Acknowledgments:
The authors express their deep gratitude to the three anonymous peer reviewers; their extensive comments helped to substantially improve the original manuscript.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations
The