Sustainability, Big Data and Mathematical Techniques: A Bibliometric Review

Abstract: This article reviews international research published up to the first half of 2021 on sustainability, big data and the mathematical techniques used for its analysis. In addition, the spatial component of the works (city, region, nation and beyond) is studied, and the Sustainable Development Goals (SDGs) that have received the most attention are identified. A bibliometric analysis and a fractal cluster analysis were performed on the papers published in the Web of Science. The results show a continuous increase in the number of published articles and citations over the whole period, demonstrating a growing interest in this topic. China, the United States and India are the most productive countries and there are more papers at the regional level. The environmental dimension is the most studied and the social dimension the least. The mathematical techniques used in the empirical work are mainly regression analysis, neural networks and multi-criteria decision methods. SDG9 and SDG11 are the most worked on. The trend shows a convergence in recent years towards big data applied to supply chains, Industry 4.0 and the achievement of sustainable cities.


Introduction
Big data is a fundamental tool in many areas. The collection, storage and mining of large datasets for information can create significant value for the global economy, as well as improve the productivity and competitiveness of both businesses and the public sector [1]. The use of big data makes it possible to collect, store and analyse a large amount of data in order to establish patterns and obtain estimators that can make real-time decisions about a given target [2,3], improving its performance and obtaining benefits [4].
The use of big data is on the rise as more and more organisations have to handle large databases and because of the availability and access to data due to globalisation [5]. Big data enables many companies to gain a greater competitive advantage over their competitors [6][7][8] as well as to solve specific organisational management problems [9]. Big data is not only used in companies and organisations, but increasingly also in the public sector [10][11][12], as it makes it possible to address many issues such as efficiency, productivity or transparency, and to solve problems arising from this [13].
In recent years, climate change, environmental degradation or resource depletion have become growing problems in society [14,15]. Therefore, governmental institutions and many collectives have mobilised to protect the environment and create a more sustainable world [16,17]. In 2015, the United Nations adopted the 2030 Agenda, which enacted the 17 Sustainable Development Goals (SDGs), aiming to achieve a cleaner, fairer and more sustainable world by 2030 [18].
With the emergence of big data, the aim is, through data analysis, to anticipate events and prevent environmental catastrophes or reduce environmental degradation, while at the same time achieving sustainable development.
Many companies and institutions use sustainability indicators to make decisions, as they can be a barometer of socio-economic conditions to monitor different aspects of global risk [19]. In addition, companies gain insight into market risks and opportunities by assessing different sustainability indicators [20]. These indicators can serve as an assessment tool to diagnose risks and reduce complexities, although their application is not easy [21].
To measure sustainable development, it is necessary to quantify the phenomena that represent such progress through indicators [22]. These include simple ones such as Gross Domestic Product (GDP), which measures a country's economic development, and more complex ones such as immunisation against infectious diseases. The first set of indicators was published in 1996 in the form of a Driving-Force State-Response framework by the United Nations [23] to assess economic, social and environmental progress and provide the necessary information [24]. These indicators, which contribute to the achievement of the SDGs, can show the current state and evolution of the environment, as well as major concerns, which can help policy-makers to make important decisions based on the information provided [25]. The results measured through sustainable development indicators can help in the development of successful policies that lead to the reduction of environmental problems [26,27]. In this regard, Huovila et al. [28] conducted a comparative analysis of sustainability indicators for smart sustainable cities and concluded that the selection of the most appropriate indicators depends on factors such as the development stage of the city, the time scale of the assessment or the spatial scale. A growing body of work using big data and different mathematical models aims to establish ways to achieve sustainability [29][30][31][32].
The aim of this research is to carry out a review of the scientific production on the use of big data in sustainability using some kind of indicator to measure it, in order to show and analyse the intellectual, conceptual and social structures of the research, as well as its evolution and dynamic aspects.
Systematic literature reviews are very useful for researchers because they help them to go deeper into the field of study, to learn about trends, networks, works, authors with the most citations and to ask questions for future research. To these advantages can be added some risks such as "inflated" citations due to the abuse of self-citations, the need to define precise semantic search fields, etc. [33]. Bibliometric studies contemplate a formal and rigorous procedure that guarantees the quality of the results obtained by using increasingly sophisticated analysis techniques and software and increasingly systematic and complete compilations of scientific publications [34].
By means of bibliometric analysis, using the WoS (Web of Science) database, scientific production was quantified and its impact was measured, describing how certain fields of research are related and how they evolve over time. In addition, the works were classified according to their empirical or conceptual approach. For each empirical paper, the country where the practical application took place was recorded, and the paper was assigned to one of three levels, micro, meso or macro [35,36], depending on the spatial scope of the document. The mathematical technique used and the SDG worked on were also analysed, as well as the type of sustainability indicator used. Specifically, the research questions were posed in the following terms:
Q1: What are the main trends in scientific production on sustainability based on big data?
Q2: Which SDGs do the papers relate to?
Q3: At what spatial level were the empirical papers conducted?
Q4: What kind of mathematical techniques are applied?

Materials and Methods
The type of research carried out in this work is exploratory, descriptive and quantitative based on the techniques and tools of bibliometric analysis of the documents stored in the WoS bibliographic database. This is one of the world's most important databases of bibliographic references and citations of periodical publications. Its purpose is not to offer the text or abstracts (although they can be consulted) but to provide analysis tools that allow the scientific quality of publications to be assessed. One of the metrics it offers is Journal Citation Reports (JCR), the best-known quality indicator and one that is also evaluated by organizations that assess research activity. In WoS, only the highest-quality journals in each field are indexed [37].
The review process followed several distinct stages:
1. Time scale: as the importance afforded to sustainability has mainly occurred in the last decades, it was decided not to include time restrictions, obtaining all the documents up to 30 June 2021.
2. In the search of the existing literature in the WoS database, three words were always used, taking as a basis: "sustainab* and big data" to which were added: "indicator", "index", "assessment", "evaluation", "measurement". Finally, the results found in each of them were joined according to the following sentence: "sustainab* and big data and indicator" OR "sustainab* and big data and index" OR "sustainab* and big data and assessment" OR "sustainab* and big data and evaluation" OR "sustainab* and big data and measuring".
3. The required document type was article, also including review article. The result was n = 507 documents.
4. A preliminary reading of the abstract was initially carried out, followed by a more exhaustive reading of each article to verify that it met the established inclusion criteria.
From the search carried out, we eliminated those works (182 publications) that use some combination of the keywords in a tangential way, without giving them the strict sense that our research pursues. For example, papers that include the word sustainability, but have nothing to do with mathematical techniques for sustainability, were discarded. After this filtering, n = 325 documents were obtained, of which 34 are bibliometric studies (Figure 1). The distribution of keywords in the 325 papers analysed is shown in Figure 2. The most prominent keyword is big data, followed by sustainability, which is consistent with the search query. A new register was created to integrate the spatial dimension (micro, meso and macro) in the empirical documents:
• The term "micro-level" refers to the fact that the analysed document performs its empirical application in companies or places located in a small territory.
• The term "meso-level" has been used when the work has been carried out in several places at the same time or in industrial parks.
• The term "macro-level" refers to the fact that the document has a global or country-wide application.
It was also recorded which SDGs are being addressed, the type of indicator used (environmental, economic or social) and the mathematical techniques used. From the database completed with the new variables, we proceeded to answer the research objective of this work. This process has been divided into several phases leading to the analysis of how sustainability is being worked from big data and the mathematical techniques used for this purpose.
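The search expression described in stage 2 of the review process can be assembled programmatically. The following is a minimal Python sketch; the function name and the plain-string output are illustrative assumptions, and no WoS field tags are used because the text does not give any. The fifth term appears as "measurement" in the list of added words and as "measuring" in the joined expression; the sketch uses the former.

```python
# Base terms and the five measurement-related terms, as listed in the search stage.
BASE_TERMS = ["sustainab*", "big data"]
MEASURE_TERMS = ["indicator", "index", "assessment", "evaluation", "measurement"]


def build_query(base_terms, measure_terms):
    """Build one quoted three-term conjunction per measurement term, joined with OR.

    Hypothetical helper: it only formats a string and does not call any WoS service.
    """
    clauses = []
    for term in measure_terms:
        clause = " and ".join(base_terms + [term])
        clauses.append(f'"{clause}"')
    return " OR ".join(clauses)


query = build_query(BASE_TERMS, MEASURE_TERMS)
print(query)
```

Keeping the base terms and the measurement terms in separate lists makes it easy to rerun the search with a different set of measurement words.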

Results
The review presents a combined perspective of a bibliometric analysis together with the spatial level analysed (meso, micro, macro) and the SDG considered.

Number of Publications per Year
The number of publications per year is shown in Table 1. The first work using big data with sustainability indicators dates from 2015. The number of papers grew from 7 in 2015 to 101 in 2020, a trend that fits an exponential function ($R^2 = 0.9761$; see Figure 3). In the first half of 2021, 61 documents meeting the above requirements were found, and output is expected to continue at this rate, also following the exponential growth.
Over the study period, two distinct phases can be observed in the average number of citations per year: the first, until 2019, shows exponential growth, and the second shows a significant decrease until the middle of 2021 (the latest information collected is from June 2021).
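An exponential fit of the yearly publication counts, as reported above, can be sketched as follows. This is a minimal illustration under stated assumptions: the fit is done by ordinary least squares on the logarithm of the counts (the text does not say how the authors fitted the curve or on which scale $R^2$ was computed), and only the 2015 and 2020 counts (7 and 101) come from the text; the intermediate values are hypothetical placeholders for Table 1.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y).

    Returns (a, b, r_squared), with r_squared computed on the log scale.
    """
    n = len(xs)
    log_ys = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    b = sxy / sxx
    ln_a = mean_ly - b * mean_x
    ss_res = sum((ly - (ln_a + b * x)) ** 2 for x, ly in zip(xs, log_ys))
    ss_tot = sum((ly - mean_ly) ** 2 for ly in log_ys)
    return math.exp(ln_a), b, 1.0 - ss_res / ss_tot


# Years since 2015. The counts for 2016-2019 are HYPOTHETICAL placeholders;
# only 7 (2015) and 101 (2020) are taken from the text.
years = [0, 1, 2, 3, 4, 5]
papers = [7, 12, 20, 35, 60, 101]

a, b, r2 = fit_exponential(years, papers)
print(f"a = {a:.2f}, b = {b:.3f}, R^2 = {r2:.4f}")
```

Fitting on the log scale keeps the example dependency-free; a non-linear fit on the original scale would give a slightly different curve and a different $R^2$.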

Distribution of Publications by Knowledge Area
The distribution by areas of knowledge shows a concentration on topics related to environmental sciences, technology, engineering, computer science and economics. Table 2 lists the 10 most represented areas. Environmental Sciences Ecology stands out with 142 papers and Science Technology Other Topics with 121. An article may be associated with more than one area of knowledge, so there are overlaps between these areas. A measure of the overlap between each pair of knowledge areas has been calculated for the 10 most used. This index is defined as the number of publications included simultaneously in both areas divided by the number of publications belonging to at least one of them. The higher the value of the indicator, the more publications the two areas have in common. Figure 5 shows the value of the overlap measure for those areas, among the 10 included in Table 2, that have some publications in common. The areas of Remote Sensing and Construction Building Technology share 12 papers, with 15 and 12 publications, respectively, so their overlap is high. Energy Fuels (21 publications) and Construction Building Technology (12), as well as Environmental Sciences Ecology (142) and Science Technology Other Topics (121), also have many publications in common, with an indicator greater than 0.5 in both cases. However, Science Technology Other Topics and Computer Science share only 1 publication.
Figure 6 shows the top 10 institutions, according to WoS, with the highest number of articles published. The institutions that provide the most researchers are located in China (listed in WoS as Peoples R China), with Dalian University of Technology standing out with 15 publications, followed by the Chinese University of Hong Kong with 12 and Peking University, the University of Chinese Academy of Sciences and the University of Hong Kong with 11 each.
Only one is located in Spain, the Polytechnic University of Valencia, with 10 publications, the same number as Central China Normal University. Dalian University of Technology, which ranks first in this study, focuses on studies in the fields of science, engineering, technology, management science and economics. The Chinese University of Hong Kong, founded in 1963, is the oldest university institution in Hong Kong and ranks high in Science, Dentistry, Medicine, Engineering and Social Sciences in general. Peking University is known for emphasizing the teaching and research of the basic sciences and is especially focused on research. The fourth-ranked Chinese Academy of Sciences is a research centre focusing on fields such as environmental research, chemistry, engineering and computer science. Figure 7 shows, in different shades of blue, the countries in which there are researchers working on this topic. China stands out with 466 authors, followed by the United States with 144, the UK with 88 and India with 86. Considering the corresponding author of each of the 325 manuscripts analysed, 30.5% are researchers from China, 8.9% from the USA, 6.1% from India and 5.2% from Spain. The rest belong to 45 different countries.
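The overlap measure between knowledge areas defined earlier in this section (publications in both areas divided by publications in at least one of them) coincides with what is usually called the Jaccard index. A minimal Python sketch follows; the function names are illustrative, and the only data used are the Remote Sensing and Construction Building Technology counts quoted in the text.

```python
def overlap_index(in_a, in_b, in_both):
    """Publications in both areas divided by publications in at least one of them."""
    in_either = in_a + in_b - in_both  # inclusion-exclusion for the union
    return in_both / in_either


def overlap_from_sets(area_a, area_b):
    """Same measure computed from two sets of publication identifiers."""
    return len(area_a & area_b) / len(area_a | area_b)


# Remote Sensing (15 papers) and Construction Building Technology (12 papers)
# share 12 papers according to the text.
print(overlap_index(15, 12, 12))  # 0.8
```

With these counts, every Construction Building Technology paper is also a Remote Sensing paper, so the union has 15 publications and the index is 12/15.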

Productivity of Authors
Only 5.5% of the papers are signed by a single author (Figure 9), with three or four authors being the most frequent (almost 25% and 20.6%, respectively), and 5.2% of the documents are signed by more than seven authors. The average number of authors per article is 4.14. If the 17 articles with more than 7 authors are not taken into account, the average is 3.6. This value of the associativity index indicates that there is a tendency towards group work when it comes to publishing the results of research in the subject analysed.
Among the listed articles, 1263 different authors have been identified with a range of publications between 1 and 7 articles. The percentage of authors with only one article, the so-called transience index or percentage of occasional authors, is over 90%. The author with the highest number of publications is Ming-Lang Tseng from the Taiwan University of Wufong in Taiwan, with articles on sustainable supply chain management.
To study the behaviour of author productivity, we used Lotka's law of bibliometric quantification, which determines the productivity of authors based on a discrete probability distribution [59,60]. This law states that the number of authors publishing n papers is inversely proportional to the square of the number of papers, $A_n = A_1 / n^2$, where $A_n$ is the number of authors signing n papers and $A_1$ is the number of authors signing a single paper. In this way, it becomes known which are the elite authors in a discipline [61]. Figure 10 shows the frequency distribution of scientific productivity, showing that the level of productivity of the authors in this subject is small. As the simple count of publications of each researcher is an unreliable measure to assess their contribution to the scientific world [62], other indicators such as the fractional index (FI), the total number of citations received by the papers, the citations per article and the average number of citations per year are shown in Table 3. The fractional index takes into account the number of co-authors of each paper (the total FI per paper is 1, which is shared among all authors under the assumption that each contributed equally). Of the authors with the most papers, it is still Ming-Lang Tseng who has the highest fractional index (1.4), although Rick Edgeman has an index of 1.50; he does not appear in Table 3 because he has only two publications.
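Both Lotka's law and the fractional index described above reduce to short calculations. The sketch below is illustrative only: the function names are assumptions, and the inputs (1000 single-paper authors, and an author with three papers signed by 2, 4 and 4 authors) are hypothetical and not taken from the study's data.

```python
def lotka_expected(single_paper_authors, max_papers):
    """Expected number of authors with n papers under A_n = A_1 / n**2."""
    return {n: single_paper_authors / n**2 for n in range(1, max_papers + 1)}


def fractional_index(authors_per_paper):
    """Sum of 1/k over an author's papers, where k is the number of authors of each paper.

    Each paper contributes a total of 1, shared equally among its authors.
    """
    return sum(1.0 / k for k in authors_per_paper)


# HYPOTHETICAL inputs for illustration.
expected = lotka_expected(1000, 7)
print(expected[2])                  # 250.0
print(fractional_index([2, 4, 4]))  # 1.0
```

Comparing the observed author counts per productivity level with `lotka_expected` is one simple way to see how closely a corpus follows the inverse-square form.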
The number of citations received by authors is an indicator of the impact of their work in the scientific community, although this depends not only on its quality, but also on other factors such as the prestige of the author, the topicality of the subject or the journal of publication, which will condition the dissemination of the work. Donald Huisingh (with three articles) is the author with the most citations, both in total and per article and per year. Ming-Lang Tseng occupies the second position in the total number of citations with 7 documents and is in lower positions in citations per article and per year (his first publication in this selected database is from 2015).
In order to simultaneously take into account the number of papers, total citations and citations per paper, the h-index has been calculated, which is not affected by having an excess of uncited papers or papers with many citations [63]. Ming-Lang Tseng stands out with an index equal to 6 and almost all the other authors in the top 10 have an index of 3 or 4. To distinguish between authors with a similar h-index, the g-index has been calculated, which is the highest rank position at which the accumulated number of citations is equal to or greater than the square of that position. This gives value to highly cited articles that would not have been recognized by the h-index because "once a paper belongs to the top h papers, its subsequent citations no longer 'count'" [64]. With the exception of Ming-Lang Tseng, with a g-index equal to 7, the other authors have the same relevance. To correct for the fact that researchers who have published later are disadvantaged in the calculation of the h-index, the m-index has been calculated. This is the result of dividing the h-index by the number of years that have passed since their first publication. In this case, Sachin S. Kamble (m-index = 1.5) is the most productive, followed by Ming-Lang Tseng.
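The three indices described above can be computed from a list of per-paper citation counts. The following Python sketch uses hypothetical citation data; the g-index is implemented in the common variant that is capped at the number of papers, which is an assumption, since the text does not say which variant was used.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g**2 citations.

    This variant cannot exceed the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    g, accumulated = 0, 0
    for position, cites in enumerate(ranked, start=1):
        accumulated += cites
        if accumulated >= position**2:
            g = position
    return g


def m_index(citations, years_since_first_paper):
    """h-index divided by the number of years since the first publication."""
    return h_index(citations) / years_since_first_paper


# HYPOTHETICAL citation counts for one author with seven papers.
cites = [30, 12, 8, 5, 3, 1, 0]
print(h_index(cites), g_index(cites), m_index(cites, 4))
```

In this example the single paper with 30 citations raises the g-index well above the h-index, which is the effect the g-index is designed to capture.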
However, the paper by El-Kassar and Singh [65] has the highest number of citations (127), followed by Yigitcanlar et al. [66], with 114; these two papers also have the highest average number of citations in the period analysed (Table 4). The first paper presents a holistic model that shows the relationship between green innovation and the factors that make it possible to cope with technological advances and increase business performance. The second paper reviews the literature on whether smart cities can be achieved without becoming fully sustainable.

Table 4. Ranking of the 10 most cited papers.
Article Citations Average
Green innovation and organizational performance: The influence of big data and the moderating role of management commitment and HR practices

In the study of direct correlations between the authors listed in this database, a group of seven authors stands out (Figure 11), who assess the sustainability of various business activities, focusing on efficiency. The group shown in green studies the sustainability of 4.0 companies and the blue group studies sustainable agriculture supply chain performance. The analysis of co-authorships between authors again shows three clusters (Figure 12). The smaller the distance between two authors in the visualisation, the stronger their relationship. Stronger links are also represented by thicker lines.

Distribution by Journal
The papers are published in 166 different journals, and the 10 journals that publish the most account for more than 40% of the papers. Table 5 shows the ranking of the journals with at least 3 articles indexed in the WoS database, the area in which each of them works, the impact factor in 2020 and the quartile in which they are included. Most of these journals belong to the publishers Elsevier and MDPI.
The distribution by journals shows that Sustainability has the greatest number of articles published (58), grouping its papers in the areas of Environmental Studies, Green and Sustainable Science and Technology and Environmental Sciences. Its impact factor in 2020 is 3.251 and it is in the second quartile. It is followed by Journal of Cleaner Production (32) and Sustainable Cities and Society (10), which are located in the first quartile of their areas and their impact factors are, respectively, 7.246 and 7.587.
The impact factor in 2020 of the journal Resources, Conservation & Recycling in the areas of Engineering and Environmental Sciences (10.204) stands out; the journal has 6 papers that relate to sustainability, big data, indices and mathematical models. On the other hand, the journal Technological Forecasting and Social Change published the most cited paper, but it is not the journal with the highest impact factor.
If, in addition to the number of publications, the total number of citations and the h-index, g-index and m-index are taken into account (Table 6), the journal with the greatest relevance in the subject analysed is the Journal of Cleaner Production. This coincides with the thematic distribution presented above, in which the areas of environmental sciences are the most relevant in this field.
Bradford's law of dispersion of scientific literature describes a quantitative relationship between journals and the scientific articles contained in a bibliography dealing with a given topic. It allows us to select the journals that are not only the most productive but also the most relevant to cover the area of knowledge being analysed [75,76]. The core of the most productive journals is represented in Figure 13, in which zone 1, covering 33% of the publications, comprises only five journals, which fully coincide with the first journals in Table 6.
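The zoning behind Bradford's law, in which journals ranked by output are split into zones that each hold roughly a third of the papers, can be sketched as follows. This is an illustrative simplification: the function name, the rule for assigning a journal that straddles a zone boundary, and the journal counts are all assumptions, not the procedure or data used for Figure 13.

```python
def bradford_zones(papers_per_journal, zones=3):
    """Split journals, ranked by output, into zones with roughly equal shares of papers.

    A journal is assigned according to the cumulative number of papers published
    by the journals ranked ahead of it (an assumed boundary rule).
    """
    ranked = sorted(papers_per_journal.items(), key=lambda item: item[1], reverse=True)
    total = sum(papers_per_journal.values())
    result = [[] for _ in range(zones)]
    cumulative = 0
    for journal, count in ranked:
        zone = min(cumulative * zones // total, zones - 1)
        result[zone].append(journal)
        cumulative += count
    return result


# HYPOTHETICAL journals and paper counts (90 papers in total).
counts = {"A": 30, "B": 20, "C": 10, "D": 10, "E": 5, "F": 5, "G": 5, "H": 5}
core, middle, periphery = bradford_zones(counts)
print(core, middle, periphery)
```

In the hypothetical data each zone holds 30 papers, but the number of journals needed grows from one to two to five, which is the dispersion pattern the law describes.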

According to Sustainability Indicator and SDGs
In the review, the papers were classified according to whether they were empirical, theoretical or bibliometric. Almost 70% are empirical, just over 20% are theoretical and 10% are bibliometric.
Without taking into account bibliometric works, environmental sustainability is mainly studied (91.3%) either exclusively (43.6%), jointly with economic sustainability (10.8%), with social sustainability (2.1%) or with all three areas (34.8%). The least analysed type of sustainability is social sustainability (Figure 14). The SDG that has received the most attention (Figure 15) is SDG9 (build resilient infrastructure, promote inclusive and sustainable industrialisation and foster innovation), followed by SDG11 (make cities inclusive, safe, resilient and sustainable) and further behind, SDG13 (take urgent action to combat climate change and its impacts), SDG6 (ensure access to water and sanitation for all) and SDG7 (ensure access to affordable, reliable, sustainable and modern energy). SDG1, SDG2, SDG5 and SDG10 have not been addressed and little attention has been paid to SDG16. Theoretical work has focused more on SDG9 (30.9%), SDG8 (14.7%) and SDG11 (14.7%).
In the empirical studies, sustainability indicators refer primarily to SDG11 (28.1%) and SDG9 (25.5%). Among the mathematical techniques used to analyse sustainability based on big data, different multidimensional regression techniques stand out, appearing in 13% of the studies. Neural networks are used in 12% and, in the same proportion, the analysis of different indicators (financial, lifecycle, entropy, tourism, mobility, risk, resilience, etc., depending on the topic analysed in the respective papers). Fuzzy methodology, applied to the Delphi method, the entropy weight method or the decision-making trial and evaluation laboratory (DEMATEL) method, has been used in 7% of the works. Multi-criteria decision-making (in many cases with TOPSIS) is used in 11.4% of the papers. Factor analysis, efficiency analysis and Geographic Information Systems are present in 6.5% of the empirical papers analysed. Various optimisation techniques such as cost minimisation or profit maximisation are found in 5.4% of the papers and predictive techniques in 5.9%. Qualitative analyses are also used, such as sentiment analysis (9.2%).

According to Spatial Level
The result of the WoS search, restricted to the empirical works to which a spatial level was assigned, comprises 223 documents: 35% are developed in China, 7.5% are applied globally, 6.2% in India, 5.7% in the USA and the rest in different European and Asian countries. There is a predominance of Asian and North American organizations that support and fund research. The micro-level accounts for about 23% of the papers, and the meso- and macro-levels for about 43% and 34%, respectively (Figure 17).
Relating the spatial component to the type of sustainability studied in each paper (Figure 18), at the macro-level, there is a predominance of papers that analyse the three types of sustainability together (47%) and those that work only on environmental sustainability (35%). Social sustainability on its own is analysed only at this spatial level. At the meso-level, most of the papers consider only the environmental part (59%), although there is also a high percentage of articles that analyse all three types of sustainability (39%). At the micro-level, the papers are more evenly distributed according to the type of sustainability addressed, with those analysing the environmental component once again standing out (32%).
Most of the SDGs have been studied at all spatial levels. At the macro-level, SDG9 and SDG11 have the same weight. However, at the other levels, SDG11 stands out over SDG9. At the meso-level, 21.3% of the works apply to SDG9 and 32.4% to SDG11. At the micro-level, 24.2% are dedicated to SDG9 and 27.4% to SDG11. There are SDGs that are only analysed at the macro-level (SDG16, SDG17), and only SDG4 and SDG14 have not been analysed for large countries (Figure 19).
The mathematical techniques most used in empirical work at the macro-level are neural networks and those of a qualitative nature, such as sentiment analysis. To a lesser extent, regression and Geographic Information System techniques are also applied.
At the meso-level, the most commonly used techniques are multi-criteria decision-making methods, although there are also studies where regression techniques and factor analysis are used. At the micro-level, there is a predominance of works that analyse different indicators related to the subject matter of each one of them (generally of an environmental nature) and carry out analyses of the efficiency and effectiveness of specific companies.
The articles that work with neural networks and multi-criteria decision-making techniques are more inclined to analyse SDG9. The analysis of different indicators or regression techniques are more applied in SDG11.

Discussion and Conclusions
Achieving sustainability is one of the most complex challenges facing our planet. With the emergence of big data, which makes it possible to process and extract information from large amounts of data, and the development of mathematical tools, ways are being sought to achieve the desired sustainability as soon as possible.
In this context, it is very useful to know the existing scientific production on the use of big data to solve problems, both theoretical and empirical, in making a product, a company, a city or a country sustainable by means of indicators and mathematical techniques. At the same time, and in line with the United Nations' 2030 Agenda, it is important to see the relationship between the documents found and the 17 Sustainable Development Goals in order to find out which SDGs most attract researchers. In this way, it will be possible to identify the main lines of research on this topic and provide possible suggestions for future research.
The bibliographic database chosen to locate the scientific production on this topic was the Web of Science, as it is one of the most important databases in the world for bibliographic references and citations of periodical publications. Once the documents obtained with the chosen keywords had been screened, 325 articles were analysed. Of these, 34 are bibliometric studies, 68 are theoretical works and the remaining 223 are empirical.
The number of publications has grown exponentially, which shows the enormous importance of these lines of work among researchers. The areas of knowledge that deal with this subject are related to environmental sciences, technology, engineering, computer science and economics.
The most productive research institutions are located in China, with Dalian University of Technology standing out with publications that focus mainly on studies in the field of environmental sciences and engineering. China is also home to the largest number of scientific researchers, followed by the United States.
It was found that almost all publications have more than one author (only 5.5% are single-authored), although more than 90% of the authors are occasional authors with only one signed article. The author with the highest number of publications is Ming-Lang Tseng, with papers on sustainable supply chain management. Donald Huisingh (with three documents) is the author with the most citations, both in total and per document and per year. Ming-Lang Tseng ranks second in the total number of citations and is surpassed by other authors when the average number of citations per article and per year is taken into account. Among the 10 authors with the most publications on the topic analysed, Ming-Lang Tseng has a higher h- and g-index than the others and, according to the m-index, is the second most productive behind Sachin S. Kamble.
The papers with the highest number of citations and the most citations on average per year are El-Kassar and Singh [65] and Yigitcanlar et al. [66]. The former studies how green innovation relates to business performance and the latter reviews the literature on whether smart cities can be achieved without becoming fully sustainable.
Sustainability, Journal of Cleaner Production and Sustainable Cities and Society are, in that order, the journals with the highest number of published papers and, together with ISPRS International Journal of Geo-Information and IEEE Access, cover 33% of the publications. Of these journals, the Journal of Cleaner Production is the most relevant in the subject studied, taking into account not only the number of publications, but also the total number of citations and the h-, g- and m-indices.
The environmental part of sustainability has received the most attention from researchers, followed by the economic and, to a lesser extent, the social. The greatest concern is with building resilient infrastructure and fostering sustainable industrialisation and innovation (SDG9) and with making cities inclusive, safe, resilient and sustainable (SDG11). None of the selected papers addresses SDG1, SDG2, SDG5 or SDG10.
In the empirical papers, the mathematical techniques used to analyse sustainability based on big data include multidimensional regression, neural networks and the analysis of different indicators depending on the sector in which the research is applied. Methodologies based on fuzzy analysis, multi-criteria decision-making, factor analysis, efficiency analysis, Geographic Information Systems and optimisation techniques are also used together with qualitative analysis, such as sentiment analysis.
When taking into account the spatial level at which empirical studies are carried out, meso-level studies predominate, followed by macro- and micro-level studies. At the macro-level, the focus is more on analysing the three spheres of sustainability together, with the works that only take into account the environmental dimension also being important. At the meso-level, there are more publications on the environmental part, followed by the three types of sustainability. At the micro-level, the papers are more evenly distributed according to the type of sustainability dealt with, with those that analyse the environmental component once again standing out (32%). Analysing only the social part is performed at the macro-level, with studies of mobile applications and social networks in particular.
Most of the SDGs have been studied for all spatial levels. At the micro- and meso-levels, SDG11 stands out against SDG9, while at the large country level, both SDGs have been studied in the same proportion. There are SDGs that are only analysed at the macro-level (SDG16, SDG17), and SDG4 and SDG14 have not been studied for large countries.
At the macro-level, neural networks, qualitative techniques, regression and Geographic Information System are more commonly used. At the meso-level, the most commonly used techniques are multi-criteria decision-making methods, and at the micro-level, different indicators are analysed, mainly environmental indicators, and company efficiency analyses are carried out.
The results show that big data applied to supply chains, Industry 4.0 and the achievement of sustainable cities are the most important topics of current trends. One of the most interesting topics is the study of innovative solutions for supply chains within production, with the aim of achieving environmentally, economically and socially viable production.
Among the main contributions of this research is that different branches of research in the field of sustainability analysis with big data have been identified and that, according to the analysis of the results, the growing role of environmental issues and Sustainable Development Goals 9 and 11 stands out.
Regarding the limitations of this research, it should be noted, firstly, that this study has been restricted to the database WoS. Secondly, only articles have been analysed. It would be interesting to consider a broader line of research that would include other databases such as Scopus and Google Scholar and other types of publications such as books or conference proceedings.