Spatial-Temporal Characteristic Analysis of Ethnic Toponyms Based on Spatial Information Entropy at the Rural Level in Northeast China

As a symbol language, toponyms have inherited the unique local historical culture in the long process of historical development. As the birthplace of Manchu, there are many toponyms originated from multi-ethnic groups (e.g., Manchu, Mongol, Korean, Hui, and Xibe) in Northeast China which possess unique cultural connotations. This study aimed to (1) establish a spatial-temporal database of toponyms in Northeast China using a multi-source data set, and identify their ethnic types and origin times; and (2) explore the geographical distribution characteristics of ethnic toponyms and the evolution of rural settlements by comparing the spatial analysis and spatial information entropy methods. The results found that toponyms reflect not only the spatial distribution characteristics of the density and direction of ethnic groups, but also the migration law of rural settlements. Results also confirm that toponyms contain unique cultural connotations and provide a theoretical basis for the protection and promotion of the cultural connotations of toponyms. This research provides an entropic perspective and method for exploring the spatial-temporal evolutionary characteristics of ethnic groups and toponym mapping.


Introduction
Toponyms not only represent geographical locations, but also represent spatial entities. Spatial scope is defined by geographical coordinates. In different time scales, in addition to their original meanings, there are also historical, cultural, social, and national meanings endowed by local residents. As carriers of geographic information, toponyms interpret the evolution of geographical environments at a small scale and play important roles as mediums in the research field [1,2]. This has such significant value for studying the history of a region that Fudan University and Harvard University collaborated together on the establishment of the China Historical Geographic Information System (CHGIS) project which aims to create a set of basic geographic information for the study of continuous changes in Chinese historical periods, as well as providing a GIS data platform, time statistics, searching tools, and models for related researchers [3][4][5][6]. As the birthplace of Manchu, Northeast China is home to Mongol, Han, Korean, Xibe, and other ethnic groups. As symbols of precious cultural heritage, the toponyms not only preserve the abundant language materials and accumulate a unique cultural connotation to a certain extent but can also reflect the local cultural characteristics, can serve as "living fossils" of the study of history and culture, and describe the history of all ethnic groups' cultural blending and

Data Sources and Processing Methods
We developed a database of toponyms (including details of the origin and other characteristics) for Northeast China based on the National Database for Geographical Names of China [22], mainly including urban residential areas, rural residential areas, village committees, agricultural, forestry and pasture sites, industrial areas, public institutions, party, and government organizations. In addition, this database contains the Northeast China provinces, city, and county toponyms, the toponyms dictionary, statistical yearbook, and related historical documents. The borders of the county-level administrative divisions in Northeast China used in this paper were obtained from the county-level administrative boundaries provided by the National Geomatics Center of China [23]. Using database management software to extract keywords, it screened for China's Northeast toponyms of various etymologies, avoiding many Manchu, Mongol, Korean in autonomous' county and townships unrelated to Manchu, Mongol, Korean cultures of toponyms, and has extracted the real Manchu, Mongol, Korean, Hui, and Xibe related toponyms.
According to the following sequence of Chinese historical dynasties: Jin (1115-1234 CE); Yuan (1271-1368 CE); Ming (1368-1644 CE); Later Jin (1616-1636 CE); Qing (1636-1912 CE); Warlord Conflict Period (1913( -1930; Manchukuo Period (1931( -1945; and Eve of the Founding of PRC (1946( -1949; we extracted the timeline showing when each rural settlement appeared for all rural settlements. We note that toponyms in the Jin, Yuan, Ming, Later Jin, and Qing dynasties were based on the extraction of key words related to these dynasties. For toponyms in different areas, there may be a small overlap in the timeline. Among them, the Qing Dynasty was the period of great prosperity of the Manchu, and the expansion of rural settlements in the research area was extremely obvious. Therefore, for the Qing Dynasty, the rural settlements were further subdivided into the Early (1636-1795 CE) and Late (1796( -1912 Qing dynasties according to the emperor's year numbers. The Warlord Conflict Period was set at 1913 to 1930, the Manchukuo Period was set at 1931 to 1945, and the Eve of the Founding of PRC was set at 1946-1949. The classification of rural settlements by timeline is shown in Table 1.

Data Sources and Processing Methods
We developed a database of toponyms (including details of the origin and other characteristics) for Northeast China based on the National Database for Geographical Names of China [22], mainly including urban residential areas, rural residential areas, village committees, agricultural, forestry and pasture sites, industrial areas, public institutions, party, and government organizations. In addition, this database contains the Northeast China provinces, city, and county toponyms, the toponyms dictionary, statistical yearbook, and related historical documents. The borders of the county-level administrative divisions in Northeast China used in this paper were obtained from the county-level administrative boundaries provided by the National Geomatics Center of China [23]. Using database management software to extract keywords, it screened for China's Northeast toponyms of various etymologies, avoiding many Manchu, Mongol, Korean in autonomous' county and townships unrelated to Manchu, Mongol, Korean cultures of toponyms, and has extracted the real Manchu, Mongol, Korean, Hui, and Xibe related toponyms.
According to the following sequence of Chinese historical dynasties: Jin (1115-1234 CE); Yuan (1271-1368 CE); Ming (1368-1644 CE); Later Jin (1616-1636 CE); Qing (1636-1912 CE); Warlord Conflict Period (1913( -1930; Manchukuo Period (1931( -1945; and Eve of the Founding of PRC (1946( -1949; we extracted the timeline showing when each rural settlement appeared for all rural settlements. We note that toponyms in the Jin, Yuan, Ming, Later Jin, and Qing dynasties were based on the extraction of key words related to these dynasties. For toponyms in different areas, there may be a small overlap in the timeline. Among them, the Qing Dynasty was the period of great prosperity of the Manchu, and the expansion of rural settlements in the research area was extremely obvious. Therefore, for the Qing Dynasty, the rural settlements were further subdivided into the Early (  Subsequently, ArcGIS software was used to process the data, and the toponyms of rural settlements located in different timelines were classified and displayed to observe more intuitively the expansion and migration of rural settlements in Northeast China. There are 609,507 toponymy points in the database, 83,320 rural settlements, and 13,721 ethnic toponyms were extracted. The statistical results are shown in Table 2. After the projection (WGS_1984_Web_Mercator_Auxiliary_Sphere) conversion of the research area in ArcGIS, a grid size of 30 × 30 km was used to generate a grid, and a total of 3,277 grids were produced. We counted the number of toponymy points in each grid and the toponymy points related to ethnic groups, ranked and matched the color according to the number of toponymy points, filled the interior color of the grid according to the number of toponymy points in each grid, and filled the border color according to the number of toponymy points related to ethnic groups in each grid. The resulting diagram of layered color settings is shown in Figure 2.
area in ArcGIS, a grid size of 30 × 30 km was used to generate a grid, and a total of 3,277 grids were produced. We counted the number of toponymy points in each grid and the toponymy points related to ethnic groups, ranked and matched the color according to the number of toponymy points, filled the interior color of the grid according to the number of toponymy points in each grid, and filled the border color according to the number of toponymy points related to ethnic groups in each grid. The resulting diagram of layered color settings is shown in Figure 2.

Technical Processing Flow
The technical flow chart is shown in Figure 3. This study used GIS spatial analysis to calculate the spatial information entropy based on geographic information, including map information entropy, geometrical distribution, topological adjacency, and thematic map information. In general, these toponyms reflect China's northeast Manchu and the related time and space distribution characteristics of the nation. Therefore, analyzing the spatial distribution of ethnic toponyms influenced by different factors and exploring the historical background of the relationship between people and the environment is intended to reveal the profound influence of the modern northeast immigration of Manchu culture and explain the national fusion and the evolution of the migration process. The spatial distribution of ethnic minority toponyms was generated using the kernel density estimation method, and a correlation analysis was performed with the range and direction distribution of standard deviational ellipse. The characteristics of temporal and spatial changes of rural settlements in Northeast China were analyzed as were the factors affecting rural settlements expansion and migration. The spatial distribution characteristics of toponyms and cultural landscapes were analyzed from the perspective of geographical regions, and the ancient and modern maps were compared to reflect the migration and cultural development of ethnic groups in Northeast China, and the historical changes of rural settlements were analyzed from the perspective of history, indicating the expansion and migration path of rural settlements. Through the outcome of the resulting map, the historical landscape and cultural landscape of Northeast China are visually presented. expansion and migration. The spatial distribution characteristics of toponyms and cultural landscapes were analyzed from the perspective of geographical regions, and the ancient and modern maps were compared to reflect the migration and cultural development of ethnic groups in Northeast China, and the historical changes of rural settlements were analyzed from the perspective of history, indicating the expansion and migration path of rural settlements. Through the outcome of the resulting map, the historical landscape and cultural landscape of Northeast China are visually presented.

Kernel Density Estimation
As there are many data points in the study area and the spatial autocorrelation dominates the whole, a spatial interpolation model based on spatial autocorrelation was selected. With enough sample points, the error of the kernel density estimation (KDE) method was relatively small, and so it was deemed suitable to calculate the sample point density [24]. This method does not use the prior knowledge of data distribution and does not attach any assumptions to the data distribution; rather, it is a method for studying the characteristics of data distribution from the data samples themselves. Therefore, it is highly valued in the field of statistical theory and application. This method uses a smooth peak function (the "kernel") to fit the observed data points to simulate the real probability distribution curve; it is a non-parametric method used to estimate the probability density function. According to the sample point group of a single variable, the estimated value of its spatial smoothness

Kernel Density Estimation
As there are many data points in the study area and the spatial autocorrelation dominates the whole, a spatial interpolation model based on spatial autocorrelation was selected. With enough sample points, the error of the kernel density estimation (KDE) method was relatively small, and so it was deemed suitable to calculate the sample point density [24]. This method does not use the prior knowledge of data distribution and does not attach any assumptions to the data distribution; rather, it is a method for studying the characteristics of data distribution from the data samples themselves. Therefore, it is highly valued in the field of statistical theory and application. This method uses a smooth peak function (the "kernel") to fit the observed data points to simulate the real probability distribution curve; it is a non-parametric method used to estimate the probability density function. According to the sample point group of a single variable, the estimated value of its spatial smoothness was calculated. The KDE has n sample points with independent and identical distribution F, and its probability density function, f, and can by calculated by Equation (1): where K(.) is a pre-given inverted u-shaped function, called the kernel function; and h > 0 is a smoothing parameter, called bandwidth, used to define the size of the smoothing quantity, which is actually a radius of a circle centered on x; and K h (x) = 1 h K( x h ) is the scaled kernel. Kernel density analysis can calculate the density of the element in its surrounding neighborhood and the point element density around each output grid pixel. Each point is covered with a smooth surface. The surface value is the highest at the point location and gradually decreases as the distance from the point increases [25]. The surface value is zero at the location where the distance from the point is equal to the search radius. The volume of space enclosed by the surface and the plane below is equal to the population field value of this point, and the density of each output grid pixel is the sum of the values of all the core surfaces superimposed on the grid pixel center.

Standard Deviational Ellipse
The KDE mainly obtains the density distribution map of place points that have a distribution direction which also requires using the standard deviational ellipse (SDE) method, which can represent the direction and distribution of data [26]. The calculation principle uses the center of arithmetic mean to calculate the ellipse center. As shown in Equations (2) and (3), X i and Y i are the spatial position coordinates of each element, and SDE x and SDE y are the calculated ellipse centers. Then, the direction of the ellipse is determined taking the x-axis as the standard. Due north is 0 degrees. Finally, the length of the x,y-axis is determined, as shown in Equation (4): The long and short half-axes of the ellipse represent the data distribution direction and range, respectively. The shorter the axis, the more obvious the centripetal force. The center point represents the center position of the full set. Generally, the center point position is roughly consistent with the position of the arithmetic average, as long as the data variability are not great. When elements have a normal spatial distribution (that is, they are most dense at the center and gradually become sparse as they approach the periphery), the first standard deviational (default value) range can include the center of mass of the input elements, accounting for approximately 68% of the total.

Map Information Entropy
Entropy is a quantitative measure of the amount of information contained in a message. Shannon (1948) defined the concept of information entropy based on the probability theory of mathematical statistics, which was used to represent the degree of uncertainty eliminated after obtaining a piece of information and to measure the size of information [27]. The more uncertain the message, the more information it contains. This method is statistically rigorous. Sukhov, a professor in the former Soviet Union, initially portrayed Shannon's information entropy in cartography to assess map information [28]. N is the total number of symbols in the map, M is the total number of map symbol classes, the number of such symbols is k i , and the frequency of each symbol type is as in Equation (5). Taken in the information entropy calculation Equation (6), the map information of each type of symbols can be derived [29]: This narrow sense of information entropy is only based on mathematical statistics and cannot reflect the spatial distribution relationships in maps, because spatial information is not only simple statistical and topological information but also may contain geometrical and thematic elements. When designing quantitative measures of spatial information, the spatial position and distribution of map symbols should also be considered. Therefore, Li and Huang [30] used the Voronoi region of symbols to model the map symbol spatial distribution and then established a new quantitative measurement method for map spatial information. With the Voronoi diagram, discrete map elements are shown organically, which advances the geometrical information [31] (including the position of map elements, the information, such as quantity, size, and shape), the topological information (including Entropy 2020, 22, 393 9 of 21 the topological map elements, direction, distance, and distribution of information), and project of information (including the type of map elements, importance degree, and other information).
Liu Huimin et al. [32] slightly improved Li and Huang's geometrical information, reflecting the degree of uniform distribution continuity through the difference in point element distribution density, and then proposed geometrical distribution information and topological adjacency information. In addition, a Delaunay triangulation net was established based on the Voronoi adjacency relation of the point elements, and the edges of the triangulation net were subject to the statistical distance constraint from whole to local in turn, which interrupted the inconsistent edges in the triangulation net and obtained a series of subgraphs, each of which formed a spatial cluster, or cluster structure. By counting the number of point elements, distribution area, and distribution density of each cluster structure, the information of the cluster structure can be calculated.

Information Entropy Calculation Method
Geometrical distribution Information. Map information is generated from the diversity and difference of spatial elements and their distribution features. The distribution map of toponyms and points in this study is a point-like map, whose geometrical distribution features are shown as the degree of aggregation, and distribution uniformity which can be reflected by the distribution density differences of the point elements [33]. This point element distribution density is determined by the relationship between the distance and direction of adjacent points and is inversely proportional to the area of the corresponding region. The larger the area, the sparser the element distribution, the fewer elements there are overall, and the less corresponding information there is. The larger the regional area difference, the more uneven the point distribution and the richer the distribution form, providing more information. Thus, it can be seen that the geometrical information generated by the geometrical distribution characteristics of point elements was negatively correlated and positively correlated with the regional area of point elements and the size of their differences, respectively.
When calculating the spatial information content of a point-shaped thematic map, first, a constraint diagram is constructed for the point elements of the thematic map to obtain the Voronoi region corresponding to each point element, and then the Voronoi region area corresponding to each point element is calculated. The Voronoi region area is used to calculate the geometrical distribution information content of the map. Because the Delaunay triangulation method used by the tool is most suitable for data in a projected coordinate system, the geographic coordinate system is first transformed into a projected coordinate system. The irregular triangulations (TINs) conforming to the Delaunay criterion were divided into all the points [34]. The perpendicular bisectors of the triangle sides form the sides of a Voronoi polygon. The intersection of the bisectors determines where the Voronoi polygon folds.
The area is taken as the description index of the geometrical distribution characteristics, and the feature-based information calculation model is substituted to obtain the expression of the geometrical distribution information of the point-shaped thematic map as I G , as shown in Equation (7), where V area corresponds to the element of the i-th point; and V Areai and V Areamax are the average and maximum values of the area. The units of information obtained by logarithm base 2 are bits. The geometrical distribution information is generated by the distribution density of point elements and its difference: Topological Adjacency Information. The topological adjacency information of the map is calculated by using Voronoi order adjacency as a description index. In terms of the topological adjacency characteristics of the spot thematic graph, the topological adjacency of the graph is the main adjacency relation, which is generally expressed by the topological adjacency degree. The degree of topological adjacency reflects the element connectivity, and the difference in adjacency reflects the difference in spatial connectivity. The larger the difference, the greater the topological adjacency information [35].The degree of adjacency selects the first-order topological adjacency degree of the Voronoi region corresponding to the point element as the description index of the topological adjacency information measurement of the point element, and normalizes it by adding the information model based on the feature to obtain the point map topological adjacency information generated by the topological adjacency feature, as shown in Equation (8), where, t i is the first-order adjacency of Voronoi region corresponding to the i-th point element: Thematic Information. The thematic information of the map is related to the feature theme type. If all the neighbors [36] of a symbol are of the same theme type, the information of the symbol in the theme sense is very low [37]. However, if a symbol has a neighborhood of different theme types, it should be considered to have higher thematic information. We let the i-th map symbol have N i first-order adjacent symbols, which have M i theme types, where, the number of symbols of the i-th thematic type is N i . Then the frequency of the i-th neighbor symbol containing the j-th thematic attribute type is defined as Equation (9). The thematic information of the i-th map symbol is H i (TM), and the calculation formula is shown in Equation (10). If all the symbol neighbors are of the same theme type, the calculated result value is 1, so the thematic information is counted by 0:

Geographical Distribution of the Ethnic Toponyms
The KDE and SDE analysis results of the Manchu are shown in Figure 4a. Because there are a high number of toponyms related to Manchu in Northeast China, the distribution direction is not obvious. Therefore, rural settlements related only to Manchu were selected for analysis to better represent the directional characteristics of Manchu geographical distribution. The kernel density analysis of Manchu-related toponyms found that the highest density was located in the eastern part of Liaoning Province and the southern part of Jilin Province. The first SDE contained 68% of the data points, and the second SDE contained 95% of the data points. The analysis results of the two ellipses had the same direction which were distributed along the "wicker border" edge, which refers to the dyke ditches built in Northeast China in the second half of the 17th century. The Qing Dynasty (1636-1912 CE) saw the rise of the Manchu northeast as: "ancestral king's house," "dragon born land". To protect the "benefits of the sacred region" in this area from being destroyed by the North Koreans and to prevent the invasion of the foreign vassal Mongolia, they dredged the border trenches in the northeast area and planted willows along the trenches, which are called the willow edge [38]. Manchu rural residential areas are more densely distributed, the highest density is in the inside of the willow edge, and then gradually spreads to the outside. Their settlements were concentrated in a large area; so, the Manchus were the dominant ethnic group in the study area.
The Hui's KDE and SDE analysis results are shown in Figure 4d. Hui is distinguished from the previous three groups by the scattered distribution of the group across the study area. The Hui is external, and the overall distribution range is very wide but has no obvious distribution direction or expansion trend.
The KDE and SDE analysis results of the Xibe are shown in Figure 4e. The Xibe is a typical compact ethnic group in a small area, mainly distributed in Liaoning Province.

Expansion and Migration of Rural Settlements
According to the historical timeline, the core density of rural settlements between the former and the new dynasties was compared. If the density of the new dynasty was higher than that of the The KDE and SDE analysis results of Mongol are shown in Figure 4b. The Mongol are typical herding nomadic people, mainly distributed in Inner Mongolia but without obvious directionality. The distribution is extremely concentrated, and the maximum kernel density is the largest among the five ethnic groups, so its distribution is the densest.
The KDE and SDE analysis results of the Korean are shown in Figure 4c. The toponyms of the Korean are mainly distributed in the border between Jilin Province, Liaoning Province, and North Korea. Geographically, each Korean community is relatively small but dense, and several Korean communities are distributed in the Manchu community.
The Hui's KDE and SDE analysis results are shown in Figure 4d. Hui is distinguished from the previous three groups by the scattered distribution of the group across the study area. The Hui is external, and the overall distribution range is very wide but has no obvious distribution direction or expansion trend.
The KDE and SDE analysis results of the Xibe are shown in Figure 4e. The Xibe is a typical compact ethnic group in a small area, mainly distributed in Liaoning Province.

Expansion and Migration of Rural Settlements
According to the historical timeline, the core density of rural settlements between the former and the new dynasties was compared. If the density of the new dynasty was higher than that of the previous dynasty, it was regarded as the migration of the rural settlement center. In the legends of all the pictures in this section, the "Highest Density Region" represents the region with the largest kernel density according to the calculated results after applying kernel density analysis.
In 1115, the Jin Dynasty was founded as the capital of Huining prefecture (now Harbin, Heilongjiang Province), and its name was Dajin (1115-1234 CE). It was a feudal dynasty in Northern and Northeastern China, established by the Nvzhen. Through kernel density analysis, it can be concluded that the rural settlement center distribution was indeed located in Harbin and gradually spread to the southwest around the center. Rural settlements in the Yuan Dynasty (1271-1368 CE) were concentrated in Hebei Province and gradually spread to Liaoning. Compared with the Jin dynasty, it is evident that the rural settlements migrated, and the migration distance was large which was related to the fact that the Nvzhen were nomadic people at that time. Rural settlements in the Ming Dynasty (1368-1644 CE) were concentrated in Hebei Province, similar to those in the Yuan Dynasty but were larger in scope. In general, during the period from the Jin Dynasty (1115-1234 CE) to the Ming Dynasty (1368-1644 CE), it is evident that the rural settlements migrated southward as shown in Figure 5a.  In 1616, Nurhachi, as the founder of the Qing Dynasty, founded the Later Jin Dynasty (1616-1636 CE). His son Huangtaiji changed the country name to Qing Dynasty in 1636. According to the kernel density analysis, the rural settlements were mainly distributed in Liaoning, which is consistent with modern historical interpretations, because its capital was located in Hetuala (today's Fushun city, Liaoning Province). The rural settlements migrated again, and the density center migrated to Liaoning Province. In the Shunzhi years (1638-1661 CE), rural settlements were concentrated in Liaoning Province, similar to that in the Later Jin Dynasty but with a larger central area. Therefore, from the Ming Dynasty (1368-1644 CE) to the Early Qing Dynasty (1638-1795 CE), the rural settlements migrated northward. The migration map is shown in Figure 5b.
Under the rule of 11 emperors in the Qing Dynasty, rural settlements in the research area continued to expand. During the reign of Jiaqing years (1796-1820 CE), a new center of the rural settlement began to focus on Jilin province. The possible reasons being that the rural settlement in Liaoning Province was too dense, it did not have much forest land available to the west for the Mongolian territory and to the south and eastern sea. Therefore, it could only gradually expand northward. In addition, the wicker border, which was originally the boundary between Manchu and other nationalities, was abandoned in the 10th year of Xianfeng (1860 CE), and a large amount of land was unappropriated but available for farming. Therefore, the overall scope of rural settlements in the Late Qing Dynasty (1796-1912 CE) expanded significantly, gradually forming a new rural settlement gathering center. The expansion range from Early Qing Dynasty to Late Qing Dynasty is shown in Figure 5c.
During the Warlord Conflict Period (1913( -1930, the Feng clan warlords mainly concentrated in Liaoning and Jilin; therefore, to avoid the war, the rural settlement center moved north to Heilongjiang Province again. The migration route from the Late Qing Dynasty (1796-1912 CE) to the Warlord Conflict Period (1913( -1930 is shown in Figure 5d.
Because the capital of the "Manchukuo" (the puppet Manchukuo was not recognized by the national government, the communist party of China, or the international community) was established in Changchun, Jilin Province, the rural settlement center moved south to Jilin Province. The migration route from the Warlord Conflict Period (1913( -1930 to Manchukuo Period (1931( -1945 is shown in Figure 5e. Owing to the restoration of peace and stability during the founding of PRC (1949 CE), the rural settlements continued to move south to the more livable Liaoning Province. The migration route from the Manchukuo Period (1931( -1945 to the Eve of the Founding of PRC (1946( -1949 is shown in Figure 5f. This study analyzed the spatial distribution characteristics of toponyms and cultural landscapes from the perspective of geographical regions and breaks through the conventional framework of expressing spatial changes by administrative zoning. In GIS, the spatial smoothing method based on KDE and spatial interpolation technique reflects the spatial changes of toponyms and landscapes. The selection of toponyms with historical significance can reflect the temporal change of toponyms over long periods. Through a series of time series analyses, we can clearly see the expansion and migration of rural settlements in northeast China. In the Jin (1115-1234 CE) and Yuan (1271-1368 CE) dynasties, rural settlements were scattered, with few large settlement areas. In the Ming Dynasty (1368-1644 CE), rural settlements were concentrated in Hebei Province, and settlement areas gradually took shape. In the Later Jin (1616-1636 CE) and the Early Qing Dynasty (1636-1795 CE), there was an obvious migration of the rural settlements, and the density center moved northward to Liaoning province. During the Jiaqing (1796-1820 CE) period, rural settlements began to shift significantly for the second time, and the newly formed rural settlement centers began to be concentrated in Jilin Province. A possible reason for this was that rural settlements in Liaoning were already too dense, and there was not much land left for settling; therefore, they began to expand northward. During the Warlord conflict period (1913( -1930, the rural settlement scope expanded obviously, and the rural settlement center moved northward again. In the Manchukuo Period (1931( -1945, the rural settlement centers began to move back to southward due to the establishment of the Manchukuo. This trend continued until the Eve of the Founding of PRC (1946( -1949, referring to the historical population data set of the History Database of the Global Environment (HYDE) [39] as the evidence of population migration and using the 1950 CE rural data set as shown in Figure 6. northward. During the Warlord conflict period (1913( -1930, the rural settlement scope expanded obviously, and the rural settlement center moved northward again. In the Manchukuo Period (1931( -1945, the rural settlement centers began to move back to southward due to the establishment of the Manchukuo. This trend continued until the Eve of the Founding of PRC (1946( -1949, referring to the historical population data set of the History Database of the Global Environment (HYDE) [39] as the evidence of population migration and using the 1950 CE rural data set as shown in Figure 6. The data in the study area were divided into four levels: Zero to indicate no residents; Low, few residents; Medium, residents starting to gather with medium density; and High, large number of residents gathering with high density. As can be seen from Figure 6, up to 1950, the densely populated areas in the study area were Liaoning province and the central part of Jilin province, showing a relatively obvious banded distribution, which was consistent with the migration results.
The size of each grid in Figure 6 is 1 × 1 km, and the total population of the covered grids in each provincial administrative region is counted and compared with the number of toponyms in rural settlements, as shown in Table 3. It can be seen that there is a certain correlation between population and the total number of rural settlements. The data in the study area were divided into four levels: Zero to indicate no residents; Low, few residents; Medium, residents starting to gather with medium density; and High, large number of residents gathering with high density. As can be seen from Figure 6, up to 1950, the densely populated areas in the study area were Liaoning province and the central part of Jilin province, showing a relatively obvious banded distribution, which was consistent with the migration results.
The size of each grid in Figure 6 is 1 × 1 km, and the total population of the covered grids in each provincial administrative region is counted and compared with the number of toponyms in rural settlements, as shown in Table 3. It can be seen that there is a certain correlation between population and the total number of rural settlements.

Map Information Entropy
Traditional map information entropy is calculated for the toponymy data in the study area, which is more traditional, but ignores the specific form of map information transmission. The calculated results based on Equations (5) and (6) are shown in Table 4. The map information entropy (denoted as I M ) in Manchu toponyms account for the largest proportion, followed by Mongol and Korean. The larger the I M value, the more ethnic toponyms in the region. The results show that the Manchu were mainly distributed in Liaoning, Jilin and Heilongjiang provinces. The Mongol were mainly distributed in Inner Mongolia, while the Korean were mainly distributed in Jilin and Heilongjiang provinces. The Hui are evenly distributed in all provinces, while Xibe are mainly distributed in Liaoning Province.
The calculation of map information entropy is only able to obtain the location information of toponyms and cannot analyze the density of toponyms. It is also meaningful to study the number of rural settlements with respect to timeline. The I M of each dynasty and period are calculated based on Equations (5) and (6), and the results are shown in Table 5. The I M in Qing Dynasty accounted for the largest proportion, followed by Warlord Conflict Period and Ming Dynasty. This is consistent with the statistics in Section 2.2. The provinces of I M showed that most of the rural settlements in Liaoning, Jilin, and Inner Mongolia were formed during the Qing dynasty. Rural settlements in Heilongjiang were mainly formed during the Qing Dynasty and the Warlord Conflict Period, while those in Hebei were mainly formed during the Ming and Qing dynasties.

Geometrical Distribution Information
The toponyms of the Manchu, Mongol, Korean, Hui, and Xibe were generated into Voronoi polygons and are shown in Figure 7. The area of the polygon of each point located in the pair was counted, and the geometrical distribution information entropy (denoted as I G ) was calculated based on Equation (7) as shown in Table 6. The larger the regional area difference, the more uneven the point distribution and the richer the distribution form, thus, the more information. Therefore, it can be seen that Mongolian toponyms have the most obvious clustering characteristics, followed by Manchu, while the Hui, Xibe, and Korean have the same clustering results as KDE result. Therefore, when all neighbors of a polygon are polygons of the same theme type, the thematic information is 0. When the Nj value is the same, the smaller the nj value is, the more important the theme meaning contained in the polygon is, while the Ij value is smaller, that is, the Ij value of a polygon is negatively correlated with the information of the theme meaning contained in the polygon.
The hierarchical colormap of thematic information calculations for each polygon in the Voronoi region is shown in Figure 7.  It is also meaningful to study the aggregation of toponyms with respect to timeline. Building Voronoi polygons for each dynasty and period, the number of rural settlements and the calculated results of I G based on Equation (7) are shown in Table 7. The larger the area difference of each region, the more uneven the distribution of point position, which means, the richer the distribution form, so as the greater the information contained. The rural settlements in the Qing Dynasty accounted for 75.1% of the total rural settlements, and the geometrical distribution information contained in the Voronoi polygon was much larger than that in other dynasties and periods. Therefore, the distribution and aggregation characteristics of the Qing Dynasty were the most obvious indices and the overall trend of expansion. The geometrical distribution information contained in the Warlord conflict period, the Manchukuo Period, and the Eve of the Founding of PRC are similar to each other, indicating that the density of rural settlements in the region has tended to be stable at that time, but the mass migration was continuing due to the war and political policies. It can be seen that the calculation result of I G was the same as the KDE result.

Topological Adjacency Information
The degree of topology adjacency reflects the connectivity among elements, and the difference in adjacency reflects the difference in spatial connectivity. The calculation results of topology adjacency information entropy (denoted as I Topo ) based on Equation (8) are shown in Table 8. It can be seen that the adjacency of Mongol toponyms is the most obvious, followed by that of Manchu. Different from the calculated results of geometrical information distribution, the adjacency of the Korean is significantly different from that of the Hui and the Xibe, which is the same as the calculated results of the SDE.

Thematic Information
In this study, a constraint diagram was constructed for the toponymic points, and the Voronoi region corresponding to each point was obtained. For each polygon in the Voronoi region, the thematic information entropy contained in it was calculated according to Equation (11). The N j is the total number of polygon neighbors, and n j is the number of neighbors of the polygon's theme type.
Therefore, when all neighbors of a polygon are polygons of the same theme type, the thematic information is 0. When the N j value is the same, the smaller the n j value is, the more important the theme meaning contained in the polygon is, while the I j value is smaller, that is, the I j value of a polygon is negatively correlated with the information of the theme meaning contained in the polygon. The hierarchical colormap of thematic information calculations for each polygon in the Voronoi region is shown in Figure 7.
It can be clearly seen from the figure that the border area between Heilongjiang and Hulunbeier has a high thematic information entropy, and it can be inferred that this is the place where Mongolian and Manchu rural settlements meet, as well as the place where cultural integration occurs. On the contrary, the Inner Mongolia region, where Mongolian and Manchu gather, the wicker edge coastal region has low thematic significance; so, it can be inferred that they gather to a high degree and have low uncertainty.
In order to display the results of the thematic information more intuitively, all Voronoi polygons were counted according to the city area. The research area of this study was large, so it was feasible to calculate by city area. If the research area was small, it could also be calculated by county area or town area. By calculating the sum of the thematic information contained in all Voronoi polygons in each city, the diagram of layered color setting by the sum of the thematic information in the city map of the study area was shown in Figure 8. The greater the thematic information, the more obvious the phenomenon of multi-ethnic intermingling in this area, and the smaller the thematic information, the more obvious the phenomenon of single-ethnic settlement.
to calculate by city area. If the research area was small, it could also be calculated by county area or town area. By calculating the sum of the thematic information contained in all Voronoi polygons in each city, the diagram of layered color setting by the sum of the thematic information in the city map of the study area was shown in Figure 8. The greater the thematic information, the more obvious the phenomenon of multi-ethnic intermingling in this area, and the smaller the thematic information, the more obvious the phenomenon of single-ethnic settlement. If all the neighbors of a symbol are of the same theme type, then the symbol is of very little importance in the thematic significance. However, if a symbol has a neighborhood of different theme types, it should be considered to have higher thematic information. The calculated thematic information (denoted as ITh) results of all ethnic toponyms based on Equations (9) and (10) are shown in Table 9.  If all the neighbors of a symbol are of the same theme type, then the symbol is of very little importance in the thematic significance. However, if a symbol has a neighborhood of different theme types, it should be considered to have higher thematic information. The calculated thematic information (denoted as I Th ) results of all ethnic toponyms based on Equations (9) and (10) are shown in Table 9. The calculated results of geometrical distribution information and topological adjacency information were the largest for the Mongol, but the calculated results of thematic information show that the Manchu was the largest, so there were more residential areas of other nationalities around the rural residential areas of Manchu which had a higher thematic significance. The Mongol was so densely distributed that its thematic information decreased, while the Korean's thematic information increased.

Conclusions
Overall, the results demonstrated that, as a symbol language, toponyms have dynamically recorded changes in the human environment over its long historical development process and inherited the unique local historical culture. According to the spatial distribution of toponyms and the calculation results of spatial information entropy, it can be seen that Manchu occupied a dominant position among the ethnic groups in the study area, and the region with the largest kernel density was located in the eastern part of Liaoning Province and the southern part of Jilin Province, showing an obvious linear distribution. Mongolian is mainly distributed in Inner Mongolia, which is denser but less extensive than Manchu and has no obvious direction. The Korean is mainly distributed in the border region of Jilin, Liaoning, and North Korea, and its settlement scope is relatively small but also dense. As a kind of immigrant group, the Hui is scattered throughout the research area. The Xibe is a typical compact ethnic group in a small area, mainly distributed in Liaoning Province. According to the spatial distribution of rural settlements, the research results of rural settlement migration were consistent with the history of the study area. From the Ming Dynasty to the Warlord Conflict Period, there has been a continuous northward migration. After the Manchukuo Period and the Eve of the Founding of PRC, the rural settlement center began to return to the south, and there was a certain correlation between the density of toponyms and population density.
The results of this study are conducive to the protection and promotion of toponym culture and improves our understanding of the unique cultural information contained in toponyms. In the meantime, a large number of geographical data extracted from this study can be published on the China Historical Geographic Information System (CHGIS) to supplement the missing information in border areas and aid in solving problems of insufficient spatial coverage. Toponyms are repositories of the interaction between human history and natural change, with historical, philosophical, and scientific value. The toponyms remaining from previous dynasties should not be changed at will, and we should cherish and protect these ancient and meaningful toponyms. Toponyms are repositories of the interaction between human historical processes and natural changes, with historical, philosophical, and scientific values. In this study, the spatial information entropy is calculated based on the location information of toponyms, and the distribution and migration trajectory of ethnic groups and rural settlements can be inferred by combining the density and population information of toponyms. Of course, the distribution of topography, hydrology, and vegetation in the study area may also have certain influences on the distribution and migration of rural settlements. However, due to the space limitation and experiences, these parameters will be further investigated in the future. Moreover, this method can be applied to investigating a variety of spatial distribution phenomena and is likely to be applicable to other regions and ethnic groups, however we primarily encourage further study of this specific application.