Assessment of Heavy Metals in Agricultural Land: A Literature Review Based on Bibliometric Analysis

Abstract: Heavy metal pollution in agricultural soil has had a substantial negative influence on human life and environmental protection. Considerable attention has therefore been directed to the evaluation of heavy metals (EHM) in agricultural land, which serves to improve the environment and safeguard people's health. Based on 3759 publications collected from the Web of Science Core Collection™ (WoS), this paper aims to provide a comprehensive bibliometric overview and visualization of the EHM field. Influential authors, top institutions, and keywords are discussed in detail. The leading publications and main clusters of EHM are then analyzed through citation analysis and reference co-citation analysis to identify the main research topics. The main purpose of the paper is to help researchers interested in EHM identify current research hotspots and potential research opportunities.


Introduction
Land is the most basic material foundation for human survival and development, and cultivated land is the necessary carrier of agricultural production. A sufficient quantity and quality of cultivated land is the natural basis for crop production and the fundamental guarantee of food security. In recent years, against the background of urbanization, industrialization, and agricultural modernization, the shift of cultivated land and agricultural labor to non-agricultural uses has intensified and the proportion of agricultural products not consumed directly has increased, leading to agricultural ecological problems, especially heavy metal contamination of farmland [1,2]. Human daily life and industrial production discharge all kinds of waste, including solid waste, waste gas, and wastewater, that contain heavy metals which are difficult to degrade. Heavy metals accumulate in organisms through contact or the food chain, threatening both human beings and ecosystem health [3,4]. A substantial number of papers have shown that heavy metal pollution in agricultural land comes mainly from human activities, including mineral mining, chemical smelting, garbage treatment, urban wastewater, vehicle exhaust emissions, and pesticide and chemical fertilizer application [5,6]. The main route of human exposure to heavy metals in agricultural areas is the uptake of heavy metals through the integrated soil-crop system [7]. Within the research framework of heavy metals, the three key research directions are environmental characteristics, risk assessment, and remediation of soil polluted by heavy metals [8].

Data Source and Methods
The literature data source plays a key role in bibliometric research. In this paper, the original literature data were obtained from the Web of Science Core Collection™ (WoS). Since 1900, about 18,000 high-quality journals in various fields and 1.3 billion citations have been collected in the database [22]. The detailed literature search strategy, based on the advanced search function of the WoS database, is shown in Table 2.

Table 2. Literature search strategy.

Criteria | Details
TS | TS = (farmland* or "cultivated land" or "agricultural land" or "arable land" or cropland* or "agricultural soil*" or "grazing land*" or vineyard* or orchard* or plantation* or "paddy field*" or "dry land*" or "irrigated land*") AND TS = ("soil heavy metal*" or "heavy metal*" or "metallic pollution" or "toxic metal*" or "harmful metal*" or "hazardous metal*" or "heavy metal contaminant*" or "heavy metal pollutant*" or thallium or lead or copper or cadmium or chromium or mercury or arsenic) AND TS = (assessment or evaluation or estimation) (1)
Languages | "All languages"
Document types | "All document types"
Period | "2005-2020"
Database | "Web of Science Core Collection™"
(1) The asterisk (*) is a wildcard that matches any group of characters, including no character.
The search returned 3902 publications. After excluding literature published in 2021 and cleaning the original data by removing editorial material, book chapters, corrections, and letters, 3759 publications highly relevant to the EHM field were selected from the Web of Science Core Collection™ on 28 March 2021. These 3759 publications were academic articles, including literature reviews. Because literature reviews only summarize existing data, they could introduce noise into the research data and affect the accuracy of the analysis. However, literature reviews have contributed greatly to knowledge exchange and to the progress of the EHM field, so their role in the development of the field cannot be ignored.
Several visualization tools were then used to obtain results about the EHM domain. First, we used CiteSpace, a Java-based program developed by Chen [19] that is widely used in bibliometric analysis to detect and display recent trends and developments in a specific field. In this research, CiteSpace was applied to reveal and visualize the distribution characteristics of the subject categories, the cluster subject terms, and a timeline view of the reference clusters. Second, we used VOSviewer, a free bibliometric mapping program created by Van Eck and Waltman [23], to present a visual analysis of international collaboration, the collaboration network between key authors, and the keyword density visualization. Third, we used HistCite™, a network analysis program created by Garfield, Paris, and Stock [21], to visualize and examine citation relationships and to calculate the total local citation score (TLCS) and total global citation score (TGCS). In addition, to capture both the prestige and the popularity of references, Gephi was employed to compute the PageRank score of every reference in the citation network. Finally, the five clusters related to key co-cited references were discussed.

Present-Day Status of the Area of EHM
This section describes the current state of the EHM field in detail, including the development trend, the types of scholarly publications, the core journals, and the subject categories.
Figure 2 shows the increase in the number of citations and publications in the EHM field from 2005 to 2020. The total number of publications follows an exponential growth trend ($y = 5.0867e^{0.1207x}$; $R^2 = 0.91$). $R^2$, which ranges between 0 and 1, indicates how well the trend line fits the data, that is, how closely the values estimated by the trend line match the corresponding actual data. The closer the value is to 1, the more reliable the trend line; the closer it is to 0, the less reliable. From 2005 to 2015, the number of citations rose with fluctuations; from 2015 to 2018, it remained on a high plateau; after 2018, it decreased rapidly. Figure 3 shows that the 3759 publications fall into six types: meeting abstract, early access, data paper, proceedings paper, review, and article. The 3480 journal articles made up 93% of the publications, and the other types together amounted to only 7% of the total.
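An exponential trend line and its R² can be obtained with a least-squares fit on the logarithm of the counts. The sketch below is only an illustration under stated assumptions: the data points are synthetic values scattered around the reported curve (not the actual WoS counts), the x index 1..16 is a hypothetical encoding of the 16 years, the function names are made up for this example, and R² is computed on the log scale, which is one common convention and may differ from the one used for Figure 2.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on ln(y)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b

def r_squared_log(xs, ys, a, b):
    """Coefficient of determination of the fit, computed on ln(y)."""
    logs = [math.log(y) for y in ys]
    mean_l = sum(logs) / len(logs)
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    ss_res = sum((l - (math.log(a) + b * x)) ** 2 for x, l in zip(xs, logs))
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumption): points on the reported curve, perturbed by +/-10%.
xs = list(range(1, 17))
ys = [5.0867 * math.exp(0.1207 * x) * (1.1 if x % 2 == 0 else 0.9) for x in xs]

a, b = fit_exponential(xs, ys)
r2 = r_squared_log(xs, ys, a, b)
print(f"a = {a:.4f}, b = {b:.4f}, R^2 = {r2:.3f}")
```

An R² close to 1 on such data simply reflects that the perturbation is small relative to the overall growth.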
According to the collected data, 871 journals have published articles related to EHM topics. The top 20 journals in the EHM field are listed in Table 3. About 36% of the publications appeared in these top 20 journals, which means that about 36% of the publications are concentrated in roughly the top 2% of the journals. In terms of the number of published articles, Science of The Total Environment ranked first with 194 publications. The top five journals, which together account for 16.89% of the total publications, were Science of The Total Environment, Environmental Science and Pollution Research, Environmental Monitoring and Assessment, Chemosphere, and Environmental Pollution. Table 3 also gives the total local citation score (TLCS) and total global citation score (TGCS) of the top 20 journals in the EHM field. TLCS is the number of times a document is cited within the local data set (all documents exported after the keyword search in the WoS database). A very high TLCS therefore indicates that a document is important in its research field and is likely to be a pioneering article in the field. TGCS is the total number of citations of a document in the WoS database.

[Figure 2. Number of citations and number of publications in the EHM field, 2005-2020.]
In terms of the number of publications, Science of The Total Environment and Environmental Pollution were ranked first and fourth, respectively, and their TLCS and TGCS were both higher, indicating that the articles published in these two journals contributed greatly to knowledge exchange and progress in the EHM field. This paper also calculated the average number of citations per publication for each journal by dividing the TLCS column by the count column (A_TLCS) and dividing the TGCS column by the count column (A_TGCS). The average TLCS and TGCS of some journals with a large number of papers were not high, for example Environmental Research and Public Health and Human and Ecological Risk Assessment, which indicates that scholars pay more attention to authoritative, high-level journals in the field.
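The two averages are simple ratios of the citation totals to the publication count. The following minimal sketch shows the arithmetic; the journal names and all numbers are hypothetical placeholders, not values from Table 3.

```python
def average_citations(count, tlcs, tgcs):
    """Return (A_TLCS, A_TGCS): citations per publication for one journal."""
    if count <= 0:
        raise ValueError("count must be positive")
    return tlcs / count, tgcs / count

# Hypothetical rows: journal -> (count, TLCS, TGCS)
journals = {
    "Journal A": (100, 250, 3000),
    "Journal B": (40, 20, 400),
}

averages = {name: average_citations(*row) for name, row in journals.items()}
for name, (a_tlcs, a_tgcs) in averages.items():
    print(f"{name}: A_TLCS = {a_tlcs:.2f}, A_TGCS = {a_tgcs:.2f}")
```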
Co-occurrence analysis of subject categories is an effective way to find the disciplines associated with a specific knowledge field [24]. Figure 4 shows the co-occurrence network of the categories related to the EHM area, obtained with Pathfinder network scaling in CiteSpace. In Figure 4, the size of a circle (category) represents how often the 3759 publications are classified in that category, and each line represents the co-occurrence of two subject categories in the same publication. The top five categories in the area of EHM were Environmental Sciences and Ecology, Environmental Sciences, Agriculture, Engineering, and Water Resources. These results indicate that EHM is an interdisciplinary research field, studied mainly from the perspective of environmental sciences and ecology, environmental science, and agriculture. It can also be combined with other research topics that have great development potential, such as soil science and geology.

Bibliometric Analysis
This section presents the key authors, key institutions, and keywords that have made valuable contributions to the field of EHM, including the collaboration network between key authors and the co-occurrence network of keywords.

Influence of Authors
The number of publications alone is not necessarily an effective measure of an author's contribution. According to Price's Law [25], approximately 75% of scholars publish only one paper, while half of all papers are published by about one-tenth of the researchers. Identifying the key authors, who not only have a significant number of publications but also contribute more to the development of the discipline, therefore helps us to better understand this field. Price's Law, which has been widely applied, can be used to identify the core authors in a specific field [26]. It gives the minimum number of publications a scholar must have to count as a core author, and the threshold number of publications is calculated as follows:
$$TP_n = 0.749 \sqrt{N_{max}}$$
where $N_{max}$ is the number of publications by the most prolific author in the field and $TP_n$ is the threshold number of publications.
According to the data, 14,349 authors have published articles in the EHM field. The most prolific author was Yong Li, with 35 publications. The threshold for a core author is therefore 4.43 publications, and on this basis 291 authors can be considered core authors. Yong Li (35), Jing Li (32), Lan Wang (26), and Yang Liu (23) were among the top five most prolific authors.
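The threshold of 4.43 is consistent with the commonly used form of Price's Law, TP_n = 0.749 × sqrt(N_max), evaluated at N_max = 35. The short sketch below reproduces that arithmetic and shows how a core-author list could be filtered. The helper names are hypothetical, the two "Author" entries are invented for illustration, and treating "strictly above the threshold" as the cut-off is an assumption.

```python
import math

def price_threshold(n_max):
    """Threshold number of publications: 0.749 * sqrt(N_max)."""
    return 0.749 * math.sqrt(n_max)

def core_authors(pub_counts):
    """Return the sorted authors whose publication count exceeds the threshold."""
    threshold = price_threshold(max(pub_counts.values()))
    return sorted(name for name, n in pub_counts.items() if n > threshold)

# Counts for the four named authors come from the text; the rest are hypothetical.
pub_counts = {
    "Yong Li": 35, "Jing Li": 32, "Lan Wang": 26, "Yang Liu": 23,
    "Author X": 4, "Author Y": 1,
}

threshold = price_threshold(35)
core = core_authors(pub_counts)
print(f"threshold = {threshold:.2f}")  # 0.749 * sqrt(35)
print(core)
```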
In addition, Table 4 lists the top 20 authors by number of publications. In terms of TLCS, the contributions that have advanced the EHM field are mainly concentrated among the top five authors: Jianming Xu (144), Xingmei Liu (131), Jing Li (79), Yang Liu (74), and Bin Huang (71). Although Xingmei Liu is not a prolific author in terms of the number of publications, her standing in the EHM field should not be underestimated. Jianming Xu and Xingmei Liu are also the two authors with a TGCS above 500, which suggests that their work has inspired other disciplines as well. Table 4 further shows that Jianming Xu, although not among the most prolific authors in the EHM field, is a well-known researcher in environmental science, ranking seventh by global citations. This suggests that Jianming Xu has also contributed to the dissemination of knowledge from EHM to other domains.
With the VOSviewer tool, the collaboration among key researchers can be used to identify influential research groups in the EHM field. Figure 5 shows that the collaboration network can be divided into several groups, shown in different colors, according to collaboration intensity. The size of a circle reflects how frequently each author appears, and the lines connecting circles represent collaboration between core researchers. It is also worth noting that collaboration between core authors within one group may be weak, and that authors may also belong to other groups. For instance, Jing Li collaborated more with the key authors of the blue group, while cooperation with Yan Xu of the red group was weak.

Research Institutions Analysis
Based on the obtained data, the authors' institutional information was extracted with HistCite™ and ranked by number of publications. As shown in Table 5, the top 10 organizations in the EHM field were mainly concentrated in China. The Chinese Academy of Sciences ranked first with the largest number of published articles, 339 publications, followed by the University of Chinese Academy of Sciences (UCAS), Zhejiang University (ZU), China University of Geosciences (CUG), and Beijing Normal University (BNU), which together make up the top five.

Keyword Analysis
Using VOSviewer, 15,932 keywords were extracted from the 3759 publications. Figure 6 shows the co-occurrence network of keywords associated with the EHM domain. With the minimum number of occurrences of a keyword set to 10 in VOSviewer, 627 of the 15,932 keywords were retained for further analysis. The red keywords focus on the management and control measures of heavy metals, the green keywords on the enrichment of heavy metals, the blue keywords on the spatial distribution and source tracing of heavy metals, and the yellow keywords on the risk assessment of heavy metals.
Using the burst detection function of CiteSpace [27,28], 24 keywords with a burst index were obtained. In Figure 7, the keywords are ordered horizontally by the year in which their burst began. The left vertical axis gives the frequency of each keyword, corresponding to the height of the bars. The high-low (stock-style) chart corresponds to the right vertical axis and indicates the length of the burst period. The diameter of the circle around each keyword indicates the size of its burst index, which is used to identify research topics that grow significantly or decline rapidly within a short period. Research topics related to EHM have changed over time: from left to right in Figure 7, the burst periods of keywords become progressively shorter. Model, copper, toxicity, zinc, metal, growth, management, and China are keywords with a frequency above 100, which shows that scholars are very concerned with these aspects and are trying to explain the underlying mechanisms.
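The idea behind a keyword co-occurrence network with a minimum-occurrence filter can be sketched in a few lines. This is not VOSviewer's implementation, only an illustration of the counting step: the records are toy data, the function name is hypothetical, and the threshold is set to 2 here instead of the 10 used in the study so that the toy example keeps some keywords.

```python
from collections import Counter
from itertools import combinations

def keyword_cooccurrence(records, min_occurrences):
    """Count keyword frequencies and co-occurrence links between frequent keywords.

    records: list of keyword lists, one per publication.
    Returns (freq, links), where links maps a sorted keyword pair to the
    number of publications in which both keywords appear.
    """
    freq = Counter(kw for kws in records for kw in set(kws))
    kept = {kw for kw, n in freq.items() if n >= min_occurrences}
    links = Counter()
    for kws in records:
        present = sorted(set(kws) & kept)
        for pair in combinations(present, 2):
            links[pair] += 1
    return freq, links

# Toy records (assumption): keywords of four hypothetical publications.
records = [
    ["heavy metals", "risk assessment", "soil"],
    ["heavy metals", "soil", "cadmium"],
    ["risk assessment", "heavy metals"],
    ["cadmium", "rice"],
]

freq, links = keyword_cooccurrence(records, min_occurrences=2)
for pair, n in sorted(links.items()):
    print(pair, n)
```

In a tool such as VOSviewer, the resulting link counts would then feed the layout and clustering steps, which are not reproduced here.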

Citation Analysis and Reference Co-Citation Analysis
This section conducts citation analysis and reference co-citation analysis based on the 3759 publications and their references, using CiteSpace and HistCite™.

Citation Analysis
Citation analysis uses the citation frequency in the 3759-node network to gauge the recognition of a publication [29], typically through two indicators, TLCS and TGCS. The TLCS is the number of times a publication is cited by other publications within the 3759-node network, whereas the TGCS is the total number of citations received from all sources and research fields. The top 10 publications in the EHM field ranked by TLCS, together with their TGCS, are shown in Table 6. Among them, the publication with the highest local citation score was Wei and Yang (2010), published in Microchemical Journal. Citation analysis measures the popularity of a publication by the number of citations it receives in the EHM field, but it overlooks another important measure, "prestige", which is usually expressed by the number of times a publication is cited by other highly cited publications [40]. PageRank [41] is an efficient ranking algorithm for calculating the prestige of a node in a citation network; it was first used in Google's search engine to surface higher-quality websites to users. Because the network of links between web pages is analogous to the citation network of the 3759 publications, the PageRank algorithm can be applied to find publications with higher prestige [42,43]. In the 3759-node network, suppose paper A is cited by publications $T_1, \ldots, T_n$. The PageRank score $PR(A)$ of paper A can then be calculated as follows [41]:
$$PR(A) = \frac{1 - d}{N} + d \sum_{i=1}^{n} \frac{PR(T_i)}{C(T_i)}$$
where $d \in (0, 1)$ is the damping factor, which represents the fraction of random walks that continue to propagate along the citations [42,43]; $N$ is the total number of publications in the network; and $C(T_i)$ is the number of references cited by publication $T_i$. The damping factor is set to 0.85, consistent with the conventional PageRank algorithm [44], and the PageRank scores sum to 1. The scores can be calculated with an iterative algorithm, which corresponds to finding the principal eigenvector of the normalized citation matrix of the publications [43].
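The iterative calculation can be illustrated with a short power iteration on a toy citation network. The sketch assumes the normalized form PR(A) = (1 - d)/N + d Σ PR(T_i)/C(T_i), with C(T_i) taken as the number of references of T_i, and it spreads the score of papers that cite nothing evenly over all nodes so that the scores sum to 1. That treatment of such "dangling" papers is an assumption of this example, and the result is not claimed to match Gephi's output. The paper labels and the function name are hypothetical.

```python
def pagerank(citations, d=0.85, tol=1e-12, max_iter=1000):
    """Iteratively compute PageRank scores for a citation network.

    citations maps each paper to the list of papers it cites.
    """
    nodes = sorted(set(citations) | {t for refs in citations.values() for t in refs})
    n = len(nodes)
    pr = {p: 1.0 / n for p in nodes}
    for _ in range(max_iter):
        # Score held by papers with no outgoing citations (assumption: share it evenly).
        dangling = sum(pr[p] for p in nodes if not citations.get(p))
        new = {p: (1.0 - d) / n + d * dangling / n for p in nodes}
        # Each citing paper passes d * PR / C to every paper it cites.
        for p, refs in citations.items():
            for t in refs:
                new[t] += d * pr[p] / len(refs)
        delta = sum(abs(new[p] - pr[p]) for p in nodes)
        pr = new
        if delta < tol:
            break
    return pr

# Toy network (assumption): B cites A; C cites A and B; D cites B.
citations = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["B"]}
scores = pagerank(citations)
for paper, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(paper, round(score, 4))
```

In this toy network, A and B each receive two citations, so a plain citation count cannot separate them, whereas PageRank ranks A higher because one of its citations comes from B, which is itself cited.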
As shown in Table 7, the top 10 publications were selected according to the PageRank scores calculated by Gephi. The top three publications by PageRank score largely coincide with those ranked by TLCS. The PageRank algorithm identifies papers with both significant popularity and prestige in the EHM field. The TLCS of Yang (2019) [44] and Wan (2019) [45] were only 8 and 3, respectively, which indicates that they were not very popular in the EHM field; according to their PageRank scores, however, they rank among the top ten publications.

Reference Co-Citation Analysis
A co-citation map consists of a set of nodes (references) and a set of edges that represent the co-occurrence of those nodes in the reference lists of papers [49]. Reference co-citation thus means that two documents appear together in the reference list of another publication. Reference co-citation analysis plays a significant part in identifying the development and progression of a specific field [50]. The reference co-citation network in the EHM field was therefore constructed with CiteSpace, and five clusters were obtained. Potential research opportunities were analyzed for each cluster.
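The definition of co-citation translates directly into a pair count over reference lists. The sketch below shows only this counting step, on toy data with hypothetical reference labels; the clustering and labeling that CiteSpace performs on the resulting network are not reproduced.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(reference_lists):
    """Count how often each pair of references appears in the same reference list."""
    counts = Counter()
    for refs in reference_lists:
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts

# Toy data (assumption): reference lists of four hypothetical citing papers.
reference_lists = [
    ["R1", "R2", "R3"],
    ["R1", "R2"],
    ["R2", "R3"],
    ["R4"],
]

counts = cocitation_counts(reference_lists)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Each pair with a nonzero count becomes an edge of the co-citation map, weighted by the count.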
The first cluster is risk assessment (#0). Many studies have reported that heavy metal pollution in agricultural land has a great negative impact on food production, the food chain, and the health of the ecological environment [51]. Quantitatively evaluating the ecological risk in agricultural land and taking measures to control it is therefore an important research topic. Risk assessment is an important analytical method; it is essentially a function of hazard and exposure [52] and can be defined as the process of estimating the probability of adverse health effects of any given degree within a specific time. The results enable decision-makers to treat contaminated sites cost-effectively while protecting public interests and the ecological environment [53]. Scholars have used a variety of methods to assess the potential ecological risks associated with heavy metals, including the single pollution index [54], multivariate statistical analysis (MSA) [55], the risk assessment code (RAC) [56], the geoaccumulation index (Igeo) [57], the enrichment factor (EF) [58], the contamination factor (CF) [51], and the ecological risk index (RI) [53]. Moreover, the physicochemical properties of soil, including water content, trace elements, pH, clay specific gravity, and other indicators, affect the adsorption of heavy metals and their enrichment in plants and organisms through the food chain, which in turn has a negative impact on local health risk. It is therefore necessary to further study these aspects of heavy metals in agricultural land and their effects on human and environmental health [53].
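Several of the indices named above have simple closed forms in their commonly used definitions: CF = C / B, Igeo = log2(C / (1.5 B)), and RI = Σ T_r × CF, where C is the measured concentration, B the background value, and T_r a toxic response factor. The sketch below applies these textbook forms. The source does not state which variants the cited studies used, so this is an assumption. The concentrations and background values are hypothetical, and the toxic response factors are the values usually attributed to Hakanson.

```python
import math

# Toxic response factors commonly attributed to Hakanson (assumption for this sketch).
TOXIC_RESPONSE = {"Cd": 30, "Pb": 5, "Cu": 5, "Cr": 2}

def contamination_factor(measured, background):
    """CF: measured concentration relative to the background value."""
    return measured / background

def geoaccumulation_index(measured, background):
    """Igeo = log2(C / (1.5 * B)); 1.5 allows for natural background variation."""
    return math.log2(measured / (1.5 * background))

def ecological_risk_index(measured, background):
    """RI: sum over metals of toxic response factor times contamination factor."""
    return sum(
        TOXIC_RESPONSE[m] * contamination_factor(measured[m], background[m])
        for m in measured
    )

# Hypothetical soil concentrations and background values in mg/kg.
measured = {"Cd": 0.5, "Pb": 60.0, "Cu": 45.0, "Cr": 90.0}
background = {"Cd": 0.25, "Pb": 30.0, "Cu": 30.0, "Cr": 60.0}

igeo_pb = geoaccumulation_index(measured["Pb"], background["Pb"])
ri = ecological_risk_index(measured, background)
print(f"Igeo(Pb) = {igeo_pb:.3f}, RI = {ri:.1f}")
```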
The second cluster is source apportionment (#1). Soil is mainly exposed to heavy metals from natural sources and human factors, mainly from automobile exhaust, atmospheric deposition, livestock breeding, fertilizer and pesticide application, waste discharge, and sewage irrigation [59]. Man-made factors and natural factors will cause deposition of heavy metal contents in soil far beyond what nature can digest. In order to develop and apply reasonable and efficient heavy metal control measures and reduce the man-made emission of toxic metals in agricultural land, the source apportionment of heavy metal has become a hot spot and focus of research [58]. Principal component analysis (PCA), cluster analysis (CA), and correlation analysis are usually used to qualitatively determine the source of heavy metals in soils [60]. Receptor models, such as positive matrix factorization (PMF) and chemical mass balance are also used for quantitative analysis of heavy metal sources [61]. Traditional methods (such as PCA and CA) have limited ability to deal with the complex relationship between soil heavy metals and related variables, which has become the main limiting factor of soil heavy metal source identification [62]. In order to accurately interpret the complex relationship between target variables and environmental variables, machine learning methods are developed and applied to the field of soil pollutant source identification, including the classification regression tree (CART) and random forest (RF), which can not only deal with the linear relationship between soil heavy metals and environmental conditions, but also deal with the nonlinear and hierarchical relationship, indicating that they are superior to traditional methods (such as PCA and CA) in the field of soil pollution source identification [63]. 
In practice, categorical regression analysis [64] is particularly effective for identifying the factors that influence soil heavy metals when the data mix ordinal, numerical, and nominal variables. Although this method is not yet widely used in the field, it offers another perspective and technical support for finding the sources responsible for heavy metal pollution. Understanding the sources of heavy metals in agricultural land will help to provide effective support for heavy metal prevention and control [63].
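As a small illustration of the correlation analysis mentioned above as a qualitative screening tool, the following Python sketch computes pairwise Pearson coefficients between metal concentrations. It is a sketch on hypothetical data, not a method taken from the cited studies. A high coefficient between two metals is often read as a hint of a shared source, but it is not proof, and it says nothing about the nonlinear relationships that motivate the machine learning methods.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical concentrations (mg/kg) at five sampling sites.
data = {
    "Pb": [30.0, 45.0, 60.0, 75.0, 90.0],
    "Zn": [80.0, 110.0, 140.0, 170.0, 200.0],
    "Cr": [60.0, 50.0, 65.0, 55.0, 60.0],
}

# Print the coefficient for every unordered pair of metals.
metals = sorted(data)
for i, a in enumerate(metals):
    for b in metals[i + 1:]:
        print(f"r({a}, {b}) = {pearson(data[a], data[b]):.2f}")
```

In this made-up data set Pb and Zn rise together across sites while Cr varies independently, so a screening step of this kind would group Pb and Zn as candidates for a common source and treat Cr separately.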
The third cluster is spatial variability (#3). A growing body of evidence shows that it is critical to determine the spatial distribution of heavy metal pollutants, which helps scientists to identify high-risk areas and helps decision-makers to determine where remediation measures should be concentrated [65]. Heavy metal pollutants in soil are characterized by high spatial heterogeneity, which can be mapped using GIS (Geographic Information System) and geostatistics [66]. The most important feature of geostatistics is the unbiased estimation of variable values for spatial objects in unsampled areas [67]. In regional soil quality investigations based on GIS and multivariate statistical analysis [64,67,68], the two most commonly used approaches, Kriging and inverse distance weighting (IDW), both calculate soil property values from weights assigned to sample values at nearby locations [65]. IDW is one of the most commonly used spatial interpolation methods because of its fast execution, ease of use, and simple interpretation. Kriging, a geostatistical technique that originated from regionalized variable theory and was first introduced to the GIS field in the 1990s, is the most widely used interpolation method and is based on a model of random spatial change [69]. It can be used to estimate confidence intervals for derived values at unsampled locations [65]. IDW is easier and faster to implement than Kriging, but Kriging also provides useful information about the spatial structure of the data and the distribution of estimation error that IDW cannot provide [70].
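The weighting idea behind IDW can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual form in which each sample is weighted by the inverse of its distance raised to a power (2 here). The sample coordinates, concentrations, and the function name are hypothetical, and the sketch omits the variogram modelling that Kriging would require.

```python
import math


def idw(points, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    points: iterable of (x, y, value) samples; target: (x, y).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a sample: return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical Pb concentrations (mg/kg) at three sampled locations.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0), (0.0, 2.0, 50.0)]
print(f"Estimate at (1, 1): {idw(samples, (1.0, 1.0)):.1f} mg/kg")
```

Because the estimate is only a weighted average, it always lies between the smallest and largest sampled values and carries no error estimate, which is the limitation relative to Kriging noted above.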
The fourth cluster is health risk assessment (#4). The health risk assessment of each potentially toxic metal is usually based on quantifying the risk level, which can be expressed as carcinogenic or non-carcinogenic health risk [52]. Many studies indicate that heavy chemical industries not only enrich heavy metals in agricultural land but also significantly increase the cancer risk of nearby residents and affect infant births [71][72][73]. Humans can ingest metals directly from topsoil or indirectly through food processing, leading to nervous or digestive system disorders and carcinogenesis. Heavy metal pollution therefore poses a serious threat to human health, especially to children, who are more vulnerable than adults [31,74,75]. The potential health risk of heavy metals in soils and crops comes mainly from two sources: long-term exposure to contaminated soil particles and consumption of polluted crops [75]. The potential health risk can be expressed by the target hazard quotient (HQ) and the hazard index (HI) recommended by USEPA [76]. According to USEPA, non-carcinogenic effects may occur when HI > 1, whereas exposed individuals are unlikely to experience obvious adverse health effects when HI < 1 [76].
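The HQ and HI screening described above can be illustrated with a short Python sketch. It assumes the commonly used form in which HQ is the average daily dose divided by a reference dose and HI is the sum of the HQ values, and it covers the soil ingestion pathway only. The exposure parameters, reference doses, and concentrations below are placeholders chosen for illustration, not values taken from USEPA guidance or from the cited studies.

```python
def ingestion_dose(conc, ing_rate, exp_freq, exp_duration, body_weight, avg_time):
    """Average daily dose via soil ingestion, in mg/(kg*day).

    conc in mg/kg, ing_rate in mg/day, exp_freq in days/year,
    exp_duration in years, body_weight in kg, avg_time in days.
    The 1e-6 factor converts mg of ingested soil to kg.
    """
    return conc * ing_rate * exp_freq * exp_duration * 1e-6 / (body_weight * avg_time)


def hazard_quotient(dose, reference_dose):
    """HQ: ratio of the estimated dose to the reference dose."""
    return dose / reference_dose


def hazard_index(quotients):
    """HI: sum of hazard quotients across metals (or pathways)."""
    return sum(quotients)


# Placeholder inputs for a hypothetical child receptor.
soil_conc = {"Pb": 60.0, "Cd": 0.75}            # mg/kg
reference_dose = {"Pb": 3.5e-3, "Cd": 1.0e-3}   # mg/(kg*day), illustrative only

quotients = []
for metal, conc in soil_conc.items():
    dose = ingestion_dose(conc, ing_rate=200.0, exp_freq=350.0,
                          exp_duration=6.0, body_weight=15.0, avg_time=6.0 * 365.0)
    quotients.append(hazard_quotient(dose, reference_dose[metal]))

hi = hazard_index(quotients)
verdict = "non-carcinogenic effects may occur" if hi > 1 else "adverse effects unlikely"
print(f"HI = {hi:.3f}: {verdict}")
```

A fuller assessment would add the dermal and inhalation pathways and the dietary intake of contaminated crops before summing, since the text identifies crop consumption as one of the two main exposure sources.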
The fifth cluster is using reflectance spectroscopy (#11). Soil contaminated by heavy metals adversely affects the growth and reproduction of organisms, so monitoring and identifying the heavy metal content of agricultural land is very important for soil remediation and restoration [77]. Current analytical methods, including chromatography and atomic spectrometry, provide high-precision analysis of heavy metals in agricultural land but are considered costly and time-consuming [78]. Compared with these traditional methods, visible and near-infrared reflectance spectroscopy (VNIRS) is relatively fast and cost-effective and has been used to detect and estimate the content of heavy metals in soil [79]. Moreover, VNIRS can be deployed in the field or in the laboratory [80]. The sensing ability of visible and near-infrared spectral imaging equipment has improved greatly, enabling more accurate target detection and recognition, so that the complex optical characteristics of heavy metals in soil can be captured and their concentrations estimated [81]. Although high resolution yields very detailed spectral information, the numerous spectral bands may introduce optical complexity, including information unrelated to heavy metal elements [82]. Deep learning methods show state-of-the-art performance in image-based data processing and can reduce the complexity and dimensionality of the data. These characteristics give them great potential for the analysis of soil heavy metals by visible and near-infrared spectroscopy [83].

Discussion
Based on the bibliometric analysis, this section addresses current research gaps and likely future research opportunities in the EHM field. For those interested in EHM, it may be useful to grasp the research areas that are emerging. The EHM study categories, which are interdisciplinary (see Figure 4), were concentrated mainly in the top five categories (i.e., environmental sciences and ecology, environmental sciences, agriculture, engineering, water resources). Interdisciplinary research can therefore be promoted by further broadening the range of categories. For instance, Figure 4 shows that, although the category of public, environmental, and occupational health is marginal in the EHM field, it still has a significant impact on people's daily lives and future well-being. Public health might therefore be a promising research direction when studying EHM. For example, heavy metal pollution affects human health through the food chain, precipitating a significant environmental crisis. However, in recent years, only a limited number of publications have explored the pathways by which heavy metals affect human health [84]. Some articles have been published on the negative effects of heavy metals on human reproductive health and on possible prevention strategies, and they show that the decline of male fertility is closely related to environmental pollution caused by heavy metals [85,86]. Therefore, the most important part of dealing more efficiently with the health crisis caused by heavy metals is to account for their influence on, and transmission routes to, human beings in the practice of pollution control and prevention. In addition, some categories that are missing from the co-occurrence network of categories in Figure 4, such as sustainable science, could be studied in the future.
The long-term predatory exploitation of the natural environment has caused many major environmental disasters, such as the heavy metal pollution of soil. Therefore, finding the balance between economic development and environmental protection will be an important direction of future research.
More mobile soil particles, including street dust deposited from building materials, vehicle exhaust, and industrial emissions of airborne particles, pose a particular threat to humans because they are more mobile than normal soil particles and other kinds of sinks [87,88]. It appears that street dust has attracted wide attention in the EHM field [89][90][91][92]. However, judged by the keyword co-occurrence network in Figure 6 (bottom right), street dust is only a small circle, that is, it has so far carried little weight among researchers. There is therefore still substantial room for development in EHM research. For example, although some scholars have investigated the spatial distribution, pollution status, and sources of street dust [93,94], the scientific literature on trace metal concentrations in non-rural street dust or soil has focused on a few metals, such as Cr, Cu, Pb, Zn, and Ni, and a city-wide systematic analysis has not been completed [95]. At present, almost all studies on EHM contamination of street dust focus on large, developed cities and rarely on resource cities [96]. In the future, scholars can therefore expand the scope and scale of the study area, especially to the urban-rural fringe and to villages close to cities and major transportation arteries.
Some indices are widely used in soil pollution assessment. Recent studies have evaluated soil pollution using indices such as enrichment factors [96,97], the geoaccumulation index [98][99][100], the pollution load index [101][102][103], and the toxic risk index [104][105][106]. However, in the field of EHM, certain indices still need refinement. For instance, although enrichment factors have been widely employed in numerous fields, Figure 6 (right-hand side, in blue) shows that they have received hardly any attention in the EHM field. Nevertheless, such indices are crucially important for the assessment of heavy metals. Future research could investigate how to design systematic and accurate procedures for evaluating heavy metal pollution in soil. For example, artificial intelligence methods have a rigorous mathematical foundation and good generalization ability and are simple to implement, which reduces the influence of subjective human judgment on the results, so that the evaluation results agree more closely with actual conditions. In the future, evaluation work will shift from single evaluation to comprehensive or systematic evaluation, and from pollution degree to environmental risk and human health risk, thereby promoting the development of soil pollution evaluation.
In recent years, a growing number of studies have not only evaluated soil pollution but have also paid increasing attention to identifying pollution sources and estimating environmental risk. Heavy metal pollution in soil severely harms the harmonious development of humans and nature, and its environmental, social, and economic aspects should not be overlooked. Some studies have considered economic and social sustainability in EHM research [107]. However, as shown in Figure 8, risk assessment appears among the reference clusters and receives much attention. In addition, in today's society, which emphasizes sustainable development, environmental protection will play a crucial role in future development, especially in achieving the harmonious coexistence of economic development and the natural environment. Therefore, an ecological perspective on evaluating pollution risk should be considered in future research and in the practical application of soil pollution control.
From the perspective of sustainability, the health risk assessment method recommended by the United States Environmental Protection Agency (USEPA) might be a suitable way to identify further management measures for pollution control.

Conclusions
The purpose of this paper was to present the development path, research hotspots, and potential research directions of the EHM field using the bibliometric method. To this end, initial data were gathered from the Web of Science Core Collection TM (WoS), and 3759 related publications were obtained after data cleaning. Analyses of the current status were then carried out, covering publication types, leading journals, and the co-occurrence network of categories. To obtain further practical results, bibliometric analysis, citation analysis, and co-citation analysis were applied.
On the basis of the bibliometric analysis, the key conclusions are as follows: (1) The number of publications in the EHM domain increased gradually from 2005 to 2020; (2) 871 journals and proceedings published work on EHM, about 36% of the publications appeared in 2% of the journals, and Science of the Total Environment ranked first by number of published articles; (3) According to Price's law, 291 of the 14,349 authors involved were core authors. Moreover, the top 10 publications were selected through citation analysis on the basis of TLCS and PageRank scores; (4) Reference co-citation analysis of 145,053 references yielded five clusters, and the literature in each cluster may point to prospective directions for future research in the EHM domain. Specifically, the articles in Cluster 1 (risk assessment) can be further expanded to evaluate the state of heavy metal pollution and to provide feasible suggestions for its control. The academic achievements in Cluster 2 (source apportionment) can be extended to promote environmental protection through quantitative analysis of heavy metal pollution sources and through reduction of heavy metal emissions caused by human factors, including the amounts of fertilizer, pesticides, and fungicides applied. In Cluster 3 (spatial variability), several methods can be used to analyze the spatial distribution of heavy metals. According to the literature in Cluster 4 (health risk assessment), eating food crops polluted by heavy metals is the main food chain route by which humans come into contact with heavy metals. Finally, the literature in Cluster 5 (using reflectance spectroscopy) applies a faster, time-saving method to measure the content of heavy metals in soil.
However, some limitations remain to be resolved in future research. The data come only from the Web of Science Core Collection TM (WoS), which might bias the outcomes of the bibliometric analysis. Data sources and database ranges can be expanded to incorporate additional publications related to the EHM field, such as CNKI (China National Knowledge Infrastructure), ProQuest papers, and Google Scholar. In addition, although bibliometric analysis yields objective results for the EHM field, some of the root causes behind these results remain unexplained. In the future, research methods commonly used in the social sciences, such as face-to-face interviews and field surveys, can be used to help address this problem.

Institutional Review Board Statement:
Not applicable for studies not involving humans or animals.

Informed Consent Statement:
Not applicable for studies not involving humans or animals.
Data Availability Statement: All data included in this study are available from the corresponding author upon request.

Conflicts of Interest:
The authors declare no conflict of interest.