Big Data as a Tool to Monitor and Deter Environmental Offenders in the Global South: A Multiple Case Study

While prior research has looked at big data’s role in strengthening the environmental justice movement, scholars rarely examine the contexts, mechanisms and processes associated with the use of big data in monitoring and deterring environmental offenders, especially in the Global South. As such, this research aims to fill this gap through the use of multiple case studies of environmental offenders’ engagement in illegal deforestation, as well as legal deforestation followed by fire. Specifically, we have chosen four cases from three economies in the Global South: Indonesia, Peru and Brazil. We demonstrate how the data utilized by environmental activists in these four cases qualify as true forms of big data, as the activists have searched and aggregated data from various sources and employed them to achieve their goals. The article shows how big data from various sources, mainly from satellite imagery, can help discern the true extent of environmental destruction caused by various offenders and present convincing evidence. The article also discusses how a rich satellite imagery archive is suitable for analyzing chronological events in order to establish a cause-effect chain. In all of the cases studied, such evidentiary provisions have been used by environmental activists to oblige policy makers to take necessary actions to counter environmental offenses.


Introduction
For any enterprise, data has increasingly become one of the most important commodities. New technologies and data processing systems have allowed large data sets, called "big data", to be captured and analyzed. These data sets allow us to find correlations and associations that we could not have identified through conventional approaches. Environmental security is one critical area being reshaped by big data. New approaches offer the best available knowledge for academics, companies and policymakers, helping them to make well-informed choices, refine their approach to resource use and promote environmental sustainability. We define environmental sustainability as

Literature Review
In the context of the Global South, the term big data is defined as "data sets that can provide insights into human well-being, which satisfy at least one of the following characteristics compared to data sets that have been traditionally used in developmental issues: (a) are of higher volume, (b) are of wider variety, and (c) enable users to make decisions and act faster" [7].
Prior research indicates that big data holds great promise for environmental monitoring, protection and planning in the Global South [1,8]. There is some evidence that big data-related tools have made it easier to detect and quantify challenges such as deforestation, desertification and climate change, which was not possible until a few years ago [9]. For instance, the Global Forest Watch (GFW) is a platform launched by the World Resources Institute for mapping big data related to forests in near-real time. It updates data and images every few weeks, and daily in the case of fire alerts. Using big data, the cloud and crowdsourcing, it helps to study changes in tree cover with the help of indicators related to deforestation, harvesting of tree plantations, fire damage and forest die-off from disease and pests. As such, analysts, policymakers, conservationists and others can use it to track progress on efforts to conserve forests [10].
Moreover, prior researchers have also suggested that there are various barriers to be taken into account when utilizing big data for environmental monitoring, protection and planning in the Global South [1,8,11,12]. The types of data available for the majority of developing economies are limited in many ways (e.g., related to Tweets in Indonesia and mobile money transfer data in Kenya).
Data unavailability thus remains a major challenge, which, according to Boyd and Crawford (2012) [11], has led to a new form of digital divide. While appropriate analysis of big data may provide valuable insights and information for key policy areas, great care must be taken to ensure that data quality standards are satisfied and appropriate methodological steps have been taken. For instance, the use of Twitter API data has been criticized on the grounds that it suffers from questionable quality and serious methodological challenges such as samples of unknown representativeness, a lack of one-to-one correspondence between accounts and users, and the proliferation of Tweets created by bots [11,12].
Many of the preceding researchers in this field have also emphasized the importance of appropriate strategies to adequately harness the various dimensions of big data. With regard to volume, for instance, Boyd and Crawford (2012:663) [11] note that big data is a "poor term" and argue that big data "is less about data that is big than it is about a capacity to search, aggregate, and cross-reference large data sets". It would thus be interesting to observe how actors such as environmental activists search and aggregate data from various sources and use them as a tool to monitor and deter environmental offenders. More specifically, research that considers the key aspects and characteristics of big data and can therefore help achieve such a goal is relevant for the Global South. This has been an important research gap in the literature on big data and the environment.

Methods
Our theory has been derived from having studied multiple cases [1,13-17]. We selected only cases for which sufficient information could be obtained from secondary sources. Note that archival data is among a variety of recognized data sources for case studies [14,18]. Following Eisenhardt [13,14], we selected four cases by combining two approaches: the extreme method and the diverse method [19]. More specifically, our process started with the extreme case method and evolved over time in order to implement a variety of requirements and recommendations.
In the extreme case method, cases with extreme values on the independent (X = nature and extent of the environmental damage caused by human beings) or dependent variable (Y = source and amount of data and use of big data as a tool to monitor and deter environmental offenders) of interest are selected [19].
In the cases we selected, satellite data have been used as a main source, which are considered to be big data [1]. While satellites also provide structured data, they are an especially important source of rich unstructured data such as images that can be used to monitor and deter environmental offenders. For instance, it was reported that in 2012, a Russian weather satellite was able to take 121-megapixel (MP) images of Earth, which was more than 10 times the resolution size of the images that high-end smartphones could take at that time [20].
The cases analyzed in this article are extreme in the sense that they are among the most considerable man-made environmental disasters in history (in terms of X = nature and extent of the environmental damage caused by human beings). In particular, prior researchers have suggested that best practice models are good candidates for case research [13]. In this case, if researchers have some idea about other factors that might affect Y (the outcome of interest), other case selection methods can be pursued [19]. We utilized a diverse case method to select man-made environmental disasters. A key goal was to achieve a maximum possible variance along relevant dimensions [19]. The idea in this method was to select cases to represent full ranges of values characterizing X, Y or some relationships between them [19].
As for man-made environmental disasters, two main themes emerged: (a) in some cases, the offenders involved were only responsible for deforestation, but in other cases they also burned the deforested area to clear it for planting a new crop; (b) the offenders belong either to primary economic sectors such as fishing, agriculture and mining, or to secondary economic sectors such as manufacturing. In order to achieve diversity, we selected cases with different combinations of (a) and (b) (Table 1).

Case 1: Amazon Deforestation Caused by Mining in Brazil
Researchers used time series data from the Program for the Estimation of Deforestation in the Brazilian Amazon (PRODES) at a 1-km resolution [21]. They found that Amazon forest losses due to mining occurred up to 70 km from mining lease boundaries. They estimated that 11,670 km² (roughly 4500 square miles) of such deforestation occurred during 2005-2015, which represented 9% of all Amazon forest loss during the period. Put differently, during 2005-2015 mining activities caused about 10% of the deforestation in the Brazilian Amazon, which is higher than past assessments of 1-2%. A reason for the difference was that past studies primarily looked at the mines and ignored ancillary developments such as the infrastructure accompanying the mines. The findings regarding mining-led deforestation underscored the importance of immediate actions by companies and the Brazilian government, especially in light of the Presidential Decree of 23 August 2017 [22,23].
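The area figures above can be cross-checked with simple arithmetic. The following Python sketch is an illustration only and is not part of the analysis in [21]: the conversion factor is the standard one (1 square mile = 2.589988 km²), and the implied total forest loss is a back-of-the-envelope value that we derive here from the reported 9% share, not a figure reported by the original researchers.

```python
# Cross-check of the area figures quoted in Case 1 (illustrative only).
KM2_PER_SQ_MILE = 2.589988  # standard conversion: one square mile in km^2


def km2_to_sq_miles(area_km2: float) -> float:
    """Convert an area from square kilometres to square miles."""
    return area_km2 / KM2_PER_SQ_MILE


mining_loss_km2 = 11_670  # mining-related deforestation, 2005-2015 (from the text)
reported_share = 0.09     # share of all Amazon forest loss (from the text)

mining_loss_sq_miles = km2_to_sq_miles(mining_loss_km2)

# Derived assumption: total loss implied by the two reported numbers.
implied_total_loss_km2 = mining_loss_km2 / reported_share

print(f"Mining-related loss: {mining_loss_sq_miles:,.0f} square miles")
print(f"Implied total Amazon forest loss: {implied_total_loss_km2:,.0f} km^2")
```

The converted value is consistent with the "roughly 4500 square miles" quoted above.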
In August 2017, the then President Michel Temer issued two decrees in a one-week time span. The first Presidential Decree, issued on 23 August 2017, abolished a huge national reserve in the Amazon, known as the National Copper and Associated Reserve (RENCA). The RENCA was established in 1984 in order to provide mineral wealth nationally; its abolition opened up about 4.6 million hectares (17,800 square miles) to mining. The Temer government's initial declared motivation was to attract foreign investment and improve exports in order to stimulate Brazil's economy [24]. The Presidential Decree nevertheless met with protests from diverse groups such as lawmakers, scientists, environmentalists, artists, actors and singers. For example, Randolfe Rodrigues, a Senator for the state of Amapá, criticized Temer's move to abolish the RENCA, labeling it as "the biggest attack on the Amazon of the last 50 years" [22,23].
As a result, the President was forced to issue a second decree. While Temer's second decree, issued on 28 August 2017, did not reverse the abolition of the RENCA, it did introduce some restrictions. It prohibited the search for mineral deposits in indigenous areas, in nature preserves and in border regions, and created a review committee to ensure the proper implementation of these protections. A private lawsuit argued that a decree cannot abolish the RENCA reserve without corresponding legislation. Responding to the lawsuit, a federal judge ordered that both decrees issued by Temer be annulled [24]. Although this was a small victory, big data was consequential in obliging Temer's administration to issue the second decree.

Case 2: The 2019 Wildfires in the Brazilian Amazon Rainforest
In 2019, more than 80,000 fires burned across Brazil, which led to the hospitalization of more than 2000 people due to respiratory illnesses caused by exposure to the resulting air pollution [25].
The web-based portal Monitoring of the Andean Amazon Project (MAAP), an initiative led by the U.S.-based Amazon Conservation and Peru's Conservación Amazónica (ACCA), published a report disclosing that at least 125,000 hectares of the Brazilian Amazon were cleared, mainly in the first half of 2019, and then burned in August of that year; these actions were arguably intended to convert the land for agricultural use [26]. The fires in the Amazon led to a global political crisis in which millions of social media users demanded action from Brazilian officials. More visibly, street protests took place in many parts of the world, compelling Bolsonaro's government to deploy troops in the Amazon to combat the fires. The Brazilian government also issued a decree which temporarily banned the use of fire to clear lands [27].
The MAAP provided compelling evidence that both deforestation and fire are critical challenges and that "the fires are often a lagging indicator of recent agricultural deforestation". The MAAP came to this conclusion through an extensive analysis of a rich satellite imagery archive covering the Rondônia, Amazonas, Mato Grosso, Acre and Pará areas of Brazil from 2018 through mid-September 2019. In order to do so, it employed the satellite company Planet's online portal, which holds a rich archive of Planet, RapidEye, Sentinel-2 and Landsat imagery. It then used the portal's area measurement tool to estimate the size of the affected areas, identifying areas that were deforested prior to July 2019 and were later burned between July and September of that year. The MAAP forcefully argued: "This key finding flips the widely reported assumption that the fires are burning intact rainforests for crops and cattle. Instead, we find it's the other way around, the forests were cut and then burned, presumably to enrich the soils. It is 'slash and burn' agriculture, not 'burn and slash'" [28].
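The temporal logic behind this conclusion can be sketched in a few lines of Python. This is a minimal illustration, not the MAAP's actual workflow: the Patch records, their dates and areas are hypothetical, and the end of the burn window (30 September 2019) is an assumption, since the text only states "between July and September".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patch:
    """Hypothetical record for one land patch traced in archival imagery."""
    patch_id: str
    area_ha: float
    cleared_on: Optional[date]  # first image date showing the patch deforested
    burned_on: Optional[date]   # first image date showing the patch burned


# Burn window taken from the text; the exact end date is an assumption.
BURN_WINDOW_START = date(2019, 7, 1)
BURN_WINDOW_END = date(2019, 9, 30)


def is_cleared_then_burned(p: Patch) -> bool:
    """True if the patch was deforested before July 2019 and burned afterwards."""
    if p.cleared_on is None or p.burned_on is None:
        return False
    cleared_before_window = p.cleared_on < BURN_WINDOW_START
    burned_in_window = BURN_WINDOW_START <= p.burned_on <= BURN_WINDOW_END
    return cleared_before_window and burned_in_window


# Hypothetical sample data, for illustration only.
patches = [
    Patch("A", 1200.0, date(2019, 3, 15), date(2019, 8, 10)),  # cleared, then burned
    Patch("B", 800.0, None, date(2019, 8, 20)),                # burned, never cleared
    Patch("C", 500.0, date(2019, 5, 2), None),                 # cleared, not burned
    Patch("D", 300.0, date(2018, 11, 20), date(2019, 9, 5)),   # cleared, then burned
]

matched = [p for p in patches if is_cleared_then_burned(p)]
slash_and_burn_ha = sum(p.area_ha for p in matched)

print([p.patch_id for p in matched], slash_and_burn_ha)
```

In this toy data set, patches A and D follow the "cleared first, burned later" sequence, whereas patch B corresponds to a fire in forest that had not been previously cleared.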

Case 3: Deforestation of Rainforests in the Peruvian Amazon
It was reported that the Cayman Islands-based and London Stock Exchange-listed company, United Cacao, which promises to produce ethical and sustainable chocolate, had deforested about 7000 hectares (17,300 acres) of mostly primary, closed-canopy rainforest in the Peruvian Amazon [29]. In 2013, work on the cacao plantation started near the town of Tamshiyacu. The company tried to defend itself by claiming that the land had been previously cleared. In an audio interview with Directors Talk Interviews, United Cacao's CEO, Dennis Melka, said: "By the time the plantation companies actually get to the land, that land has been logged and clear-cut of all tropical hardwoods. It's simply not rainforest . . . " [30].
Satellite images, however, revealed otherwise. An analysis of data from Landsat satellite imagery showed that the land had not been previously cleared and that primary rainforest was indeed cleared after the land was acquired by the company [3]. The international campaigning organization, the Environmental Investigation Agency (EIA), using three-dimensional forest mapping data and satellite imaging data from Global Forest Watch and other sources to create Peru's map of carbon density, argued that the areas deforested by the Melka Projects were mostly primary forest before the launch of the project. Before the deforestation by the Melka Projects, Tamshiyacu's average carbon stock value was 122 metric tons of carbon per hectare. An EIA researcher argued that such high carbon stock values are only found in primary tropical forest in that part of the Amazon basin [31].
The Peruvian government claimed that lands in the country are classified based on a technical definition, known as best land use capacity (BLUC) which only includes soil and climatic characteristics [32]. This definition ignores the presence of standing trees in the evaluation of requests for land use change. Thanks to this loophole, approximately 20 million hectares of Peru's 74 million hectares of Amazon rainforest have not actually been classified as forest. This means that a significant part of the Amazon rainforest is open to being reclassified and labeled as agricultural land [31]. There is, nevertheless, another law which recognizes standing trees as part of the national forest patrimony. This means that the use of forests is prohibited for agriculture or other activities if such activities negatively affect the vegetation cover or the sustainable use and conservation of forest resources (Ley No. 27308: Forestry and Wildlife Law (2011) El Peruano) [32].
In August 2014, Peru's Ministry of Environment initiated legal actions to suspend the Melka Group's operations in Tamshiyacu and Nueva Requena. In December 2014, Peru's Ministry of Agriculture ordered United Cacao to stop the work on the plantation. The Ministry allowed 90 working days to produce a soil study that confirmed the site's ability to handle a cacao plantation [29]. However, it was reported in mid-2015 that no action had been successful in preventing the companies' further operations [31].

Case 4: The 2015 Indonesian Fires
The 2015 Indonesian fires were arguably the "worst manmade environmental disaster since the BP gulf oil spill" [33]. Thousands of fires were deliberately started to clear land for palm oil and paper products. During January-October of 2015, over 117,000 forest fires were detected via satellite in Indonesia, most of which were suspected of being started deliberately to clear land for farming [34]. Guido van der Werf of VU University Amsterdam's Faculty of Earth and Life Science estimated the quantity of fire emissions based on satellite imagery of the fires and of vegetation [35]. Based on this, he estimated that the 2015 Indonesian fires had emitted roughly 1713 million metric tonnes of carbon dioxide equivalent as of November 9. These estimates come from data collected by satellites that can sense fires. Some examples include Germany's TET-1 and NASA's MODIS, the latter of which provides global coverage and transmits data around the clock. Notably, active fires emit radiation that can be picked up by the satellites on bands that are dedicated to this purpose. From this information, scientists can calculate not only the size, temperature and number of fires, but also the amount of emissions produced [36].
Indonesian [37] and foreign media [3] used findings from the World Resources Institute (WRI), the Global Fire Emissions Database (GFED) and other resources to pressure the Indonesian government into implementing more responsive strategies to the tragedy, acting on the issue and adopting better peatland management practices. The environmental activist group Greenpeace collected and presented evidence in a clear and thorough manner, releasing video footage produced by drone surveillance. Greenpeace Indonesia's drones found that land burned in the autumn of 2015 on the island of Borneo was turned into palm oil plantations after a few weeks [38]. Greenpeace researchers examined about 112,000 fire hotspots recorded from August 1 to October 26, 2015, which showed that about 40% of the fires had taken place inside so-called mapped concessions, that is, land granted by the government to companies for logging or plantation development [39]. Greenpeace researchers further discovered that Asia Pulp & Paper, the largest concession holder in Indonesia, was the company associated with a majority of the fires.
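Attributing fire hotspots to concessions is, at its core, a point-in-polygon problem. The Python sketch below illustrates one generic way to do this with the standard ray-casting test. It is not a description of Greenpeace's method: the concession outlines and hotspot coordinates are hypothetical, and a real analysis would use georeferenced concession maps and a geospatial library rather than plain coordinate tuples.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. longitude and latitude


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical concession outlines (simple rectangles) and fire hotspots.
concessions = {
    "concession_1": [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
    "concession_2": [(6.0, 1.0), (9.0, 1.0), (9.0, 4.0), (6.0, 4.0)],
}
hotspots = [(1.0, 1.0), (7.0, 2.0), (5.0, 2.0), (10.0, 5.0), (4.5, 3.5)]

inside_count = sum(
    any(point_in_polygon(h, outline) for outline in concessions.values())
    for h in hotspots
)
share_inside = inside_count / len(hotspots)

print(f"{inside_count} of {len(hotspots)} hotspots inside concessions ({share_inside:.0%})")
```

Applied to the figures reported above, a share of about 40% of roughly 112,000 hotspots corresponds to approximately 44,800 hotspots inside mapped concessions.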
Big data-based evidence is likely to be more convincing and thus more likely to prompt policy changes, and the resulting pressure has produced some desirable outcomes. Indonesia's Environment and Forestry Minister, Siti Nurbaya, emphasized the importance of revising the country's environmental laws, noting that the 2009 Law on Environmental Protection and Management allowed farmers to burn up to two hectares of land to clear space for farming. The law thus contributed to the nationwide forest fires [40]. In October 2015, Indonesian President Joko Widodo instructed the environment and forestry minister to stop issuing new permits on peatlands and to immediately begin revitalization [41]. Similarly, Singapore issued legal notices to Asia Pulp & Paper and four other Indonesian companies whose concessions contained fires causing air pollution across the region. According to Singapore's Transboundary Haze Pollution Act of 2014, foreign companies can be held responsible for polluting air and can be fined up to 2 million Singapore dollars (US$1.4 million) [42].

Findings, Discussion and Implications
Prior research has suggested that big data can be of tremendous service to developing countries [7]. A key contribution of this paper is to provide insights into this phenomenon by revealing, through an examination of a number of case studies, how big data can be used as an effective tool to monitor and deter environmental offenders in the Global South. In Table 2, we present our findings related to the mechanisms involved and outcomes from the use of big data-based evidence in the selected cases. In all four cases, the government agencies involved were forced to take actions due to the convincing nature of big data-based evidence. In many cases, non-big data techniques are likely to underestimate the true extent of an environmental problem.
Big data-based techniques, however, are in a better position to document the gravity and severity of the problem. This point is well illustrated by the analysis of Amazon deforestation caused by mining in Brazil (Case 1). Earth observation satellites, which every day gather huge amounts of data about the Earth's physical, chemical and biological characteristics (also known as the Earth system) [44], are of critical importance. The data gathered by satellites are used for civil, military and commercial activities.
The roles of Earth observation satellites in fighting against environmental offenses need to be analyzed in the context of the current digital divide, which is associated with, and facilitated by, the pattern of diffusion and use of big data [7]. Hilbert [45] has observed that the current inequality of technological capacity between economies in the Global South and their Northern counterparts represents a more mature, and more persistent, stage of the digital divide. It is argued that limited access to big data has created new forms of digital divides and that the lack of financial support to afford data is among many factors that may contribute to the digital divide [11]. Earth observation satellites, by contrast, provide the same level of richness of data for remote areas of least developed countries as for urban areas of developed countries. Archival satellite data can help establish a cause and effect linkage between events, which can help determine the causes of environmental degradation and problems, and identify environmental offenders. In the above examples, activist groups such as Greenpeace, the Environmental Investigation Agency (EIA) and Conservación Amazónica used archival satellite data to monitor changes in the land surface and reconstruct sequences of events such as deforestation, land-clearing wildfire and land development. In Case 3, for example, these data were used to argue that the areas deforested by the Melka Projects were mostly primary forest before the launch of the project.
The MAAP performed an extensive analysis of a rich satellite imagery archive of areas affected by fires in Brazil from 2018 through mid-September 2019 and concluded that fires take place after agricultural deforestation. In this way, data generated by satellites and other sources can be effectively used as evidence to pressure policy makers to address environmental issues they would most likely otherwise ignore, for political or other reasons. The above discussion suggests that the biggest impact is likely to occur when big data is utilized to promote transparency and reduce corruption. Environmentalists and others have put forward convincing arguments that are based on hard data. Actors engaged in illegitimate conduct find it difficult to defend themselves against arguments that are firmly supported by data and analysis. For instance, activists' actions were crucial in pressuring the Peruvian government to impose sanctions against United Cacao, which allegedly deforested about 7000 hectares of rainforest in the Peruvian Amazon.
Boyd and Crawford (2012:663) [11] argued that big data is more about "a capacity to search, aggregate, and cross-reference large data sets" than about the volume of data. In this regard, the data utilized by environmental activists are true forms of big data because these activists have searched for and aggregated data from various sources and used them as a tool to monitor and deter environmental offenders. For instance, in order to establish the severity of the 2015 Indonesian fires (Case 4), data was obtained from multiple satellites that can sense fires, such as Germany's TET-1 and NASA's MODIS. More importantly, a rich satellite imagery archive is suitable for analyzing chronological events in order to establish a cause-effect chain. For instance, Planet's online archival data helped the MAAP to show that the 2019 wildfires in the Brazilian Amazon rainforest were a form of "slash and burn" agriculture, not "burn and slash" (Case 2) [43].
For many decades, citizen science, in which members of the general public collect and analyze data related to the environment, often in collaboration with scientists, has been used in environmental monitoring [46]. However, such data often face problems of authority, reliability, validity and credibility and thus have been unable to lead to significant political actions [47]. Some environmental activists may make mistakes in their data, which can create legitimacy challenges when using such data to build the case for environmental actions [9]. Such challenges are less likely when data from more widely recognized sources such as satellites are used. The use of big data to monitor and deter environmental offenders is also important because of the strong association between the environment and healthcare as well as agriculture. For instance, during the 2015 Indonesian fires, air quality indexes in many provinces such as Central Kalimantan, South Sumatra, Jambi and Riau reached hazardous levels [1]. It was estimated that fire and haze affected the health of over 120,000 people [1]. Following the death of fifteen-month-old Latifa Ramadani from a respiratory infection, a senior politician, Fahira Idris, asked local health authorities in the provinces of Sumatra and Kalimantan to perform additional checks on all babies and toddlers [1].
Likewise, agriculture is an environmentally sensitive industry. Agricultural productivity is tightly linked to various characteristics of the environment. Returning to the example of the 2015 Indonesian fires, Indonesian farmers were expecting a poor harvest because plants received insufficient sunlight required for normal photosynthesis. As of November 2015, the haze had killed at least 10 people and about 504,000 had become sick in Borneo and Sumatra. Analysts believed that the actual number of deaths was much higher than this figure [48].

Future Research Implications
Before concluding, we suggest several potentially fruitful avenues for future research. In this research, we concentrated on the use of big data as a tool to monitor and deter environmental offenders in the Global South. Big data in the cases selected in this study were mainly from satellites. Prior research has noted that higher precision can be obtained by combining machine learning with satellite data [4]. In the future, research scholars also need to consider the combination of other tools such as artificial intelligence, machine learning and blockchain with big data tools for such purposes.
In this paper, we looked at the use of big data, mainly from satellites, in promoting environmental sustainability. Such data can also help deal with other environmental, social and governance (ESG) issues. For instance, some technology start-ups are planning to use aerial imagery to determine whether a mining company has employed children in its operations [4]. A second area of future research might be to analyze how big data from sources such as satellites and aerial imagery can be used to promote social sustainability, such as by indicating the impacts on rural communities and promoting the reduction of child labor and human rights violations.
In this paper, we focused on man-made environmental disasters in the Global South. However, there have also been numerous instances of man-made disasters that have damaged the environment in Global North economies. An example of this is the 2010 Deepwater Horizon oil spill, which resulted from the BP oil well blowout in the northern Gulf of Mexico and damaged the shorelines of Louisiana, Alabama, Mississippi and Florida in the U.S. About 500,000 tons of gaseous hydrocarbons were released, which severely polluted the ocean water in nearby areas [49]. In the cases that we analyzed, we found that such disasters had impacts on regulatory policy. Future researchers should explore whether such regulatory impacts also arise in Global North economies.

Concluding Comments
The data problem facing environmental monitoring stations worldwide stems primarily from the dramatic advances in sensor technology that are under way. Advances in design and production have greatly increased the quality of data produced by these sensors, making them more accurate than in the past. With big data and big data analytics, researchers can obtain the best possible view of the challenges they face and see patterns they may have overlooked with smaller data sets or less sophisticated computational techniques. In the context of the developing Global South, this technological advancement can open up significantly more sustainable development paths for the future.
In this article, we presented a variety of examples that illustrate how big data can be used to produce positive results and promote environmental sustainability. Environmentalists are presenting big data-based evidence in order to pressure politicians and corporations responsible for environmental damage, degradation and harm. Such actions can promote transparency and accountability in environmental management. The evidence presented in the above cases of the 2015 Indonesian fires, deforestation of rainforests in the Peruvian Amazon and the 2019 wildfires in the Brazilian Amazon rainforest can trigger important policy changes that can significantly promote environmental sustainability.
The above analysis has revealed two important mechanisms by which big data can help monitor and deter environmental offenders in the Global South. First, the temporal aspect of big data associated with rich archival data from satellite companies can establish a cause and effect linkage between events and identify the environmental offenders and their motivations. Second, these satellites capture high-megapixel images, which are key forms of unstructured data that can help achieve the goal of deterring illegal and immoral forms of environmental conduct.