Earth Observation and Machine Learning to Meet Sustainable Development Goal 8.7: Mapping Sites Associated with Slavery from Space

Abstract: A large proportion of the workforce in the brick kilns of the Brick Belt of Asia are modern-day slaves. Work to liberate slaves and contribute to UN Sustainable Development Goal 8.7 would benefit from maps showing the location of brick kilns. Previous work has shown that brick kilns can be accurately identified and located visually from fine spatial resolution remote-sensing images. Furthermore, via crowdsourcing, it would be possible to map very large areas. However, concerns over the ability to maintain a motivated crowd to allow accurate mapping over time together with the development of advanced machine learning methods suggest considerable potential for rapid, accurate and repeatable automated mapping of brick kilns. This potential is explored here using fine spatial resolution images of a region of Rajasthan, India. A contemporary deep-learning classifier founded on region-based convolutional neural networks (R-CNN), the Faster R-CNN, was trained to classify brick kilns. This approach mapped all of the brick kilns within the study area correctly, with a producer’s accuracy of 100%, but at the cost of substantial over-estimation of kiln numbers. Applying a second classifier to the outputs substantially reduced the over-estimation. This second classifier could be visual classification, which, as it focused on a relatively small number of sites, should be feasible to acquire, or an additional automated classifier. The result of applying a CNN classifier to the outputs of the original classification was a map with an overall accuracy of 95.08% and a producer’s accuracy for kilns of 94.94%, with both low omission and commission error, that should help direct anti-slavery activity on the ground. These results indicate that contemporary Earth observation resources and machine learning methods may be successfully applied to help address slavery from space.


Introduction
It is estimated that over 40 million people in the world can be classed as modern slaves, unable to leave or refuse exploitative activity [1]. Ending slavery has been a goal of numerous agencies for centuries. Anti-slavery activity was further supported in 2016 with the launch of the United Nations Sustainable Development Goals (UN SDGs), specifically goal 8.7, which seeks to promote full productive employment and decent work for all and end modern slavery by 2030 [2,3]. These laudable from systems such as Digital Globe's WorldView systems, which provide multispectral imagery with sub-meter resolution.
The second key technological development that supports this study is the rise of citizen sensing as a source of ground reference data [15][16][17]. The full potential of remote sensing as a source of information often hinges on the availability of high-quality reference data to help train image analyses and to evaluate the quality of predictions. Reference data are, however, challenging to acquire. The rise of citizen sensing, which takes many forms and may be described by various terms [18], can now provide ground reference data to support analyses of remotely sensed images. Although citizen science has a long history, it is only since the proliferation of location-aware mobile devices and the development of web 2.0 technology that its use for ground reference data collection in support of remote sensing studies has become practical. This radical change has facilitated the development of the broad area of volunteered geographic information (VGI) that is impacting greatly on geography and geospatial science as a whole [19]. VGI for ground reference data can be collected in various ways. The data collection could be an active process steered by researchers or passive, making use, for example, of information freely available on social media sites. This study followed an active strategy, with citizen sensors or volunteers provided with images from selected locations via an internet platform. The platform used here was the Zooniverse platform [20] launched in 2009. From earlier work, it is known that volunteers can provide accurate annotations of brick kilns [12]. Because it is based on VGI, the process is, however, fraught with challenges ranging from concerns over data quality through to the ability to sustain an active set of high-quality contributors over time [21].
The third and final technological development of relevance to this study is the recent development of advanced image classifiers. The classification methods used in remote sensing have increasingly moved away from conventional statistical classifiers to machine learning methods, with numerous studies indicating the latter can yield more accurate classifications [22][23][24]. In addition, there have been considerable developments in object- rather than pixel-based image analysis tools [25]. Methods for automated image analysis have progressed greatly in the recent past. For example, the ImageNet challenge 2012 was won with a convolutional neural network (CNN) [26]. A CNN is a type of deep feedforward neural network, which has been found to be effective for the analysis of imagery. The basic CNN approach was further developed into the region-based CNN (R-CNN), an enhanced classifier but one that is slow [27]. Work to enhance the speed of analysis resulted in the Fast R-CNN and then in 2015 the Faster R-CNN [28]. The Faster R-CNN takes an image as input and generates from it a set of rectangular object proposals, sub-sets of the image that are predicted to contain an object of interest at a stated degree of membership (e.g., the probability of the proposal containing a brick kiln). The potential of these CNN-based classifiers has been recognised in remote sensing, and a range of classifiers have been applied to remote sensing images for the detection of various object types (e.g., [29][30][31][32][33][34]). The output of a region-based CNN classification is typically a set of proposals that potentially contain a target of interest set within a rectangular bounding box that fits closely around the target together with an estimated measure of the strength of membership to the target class (e.g., a probability of belonging to the target class).
The three main technological resources used in this work are all highly contemporary, and just a decade ago this work would have been impossible. Here, it is hoped they can be used to shape a method for routine and inexpensive monitoring to aid addressing UN SDGs. The aim of this paper is to explore the potential offered by these technological developments to map the locations of sites such as brick kilns, in order to focus anti-slavery resources and activity effectively.

Materials and Methods
The research focused on Bull's Trench brick kilns as these are known to be associated with slave labour [6]. These kilns have a characteristic shape, typically oval although some are circular and often with a tall chimney in the middle, see Figure 1. The size of the kilns varies but, as a guide, the radius of circular kilns within the region is often in the order of 33 m. Both of these kiln types exist across the "Brick-Belt" and so any approach to their mapping has broader applicability. Of note is that the size of the brick kilns would be close to the pixel size of popular environmental remote sensing systems such as those carried on Landsat satellites, making them hard to observe, but they are considerably larger than the pixel size of fine spatial resolution satellite sensors and hence visually readily identifiable in such imagery, see Figure 1. Moreover, with resources such as Google Earth, free fine spatial resolution satellite sensor data are readily accessible. The following subsections summarise the data and methods used to explore the potential for contemporary methods and data sets to be applied to map brick kilns and contribute to the achievement of UN SDG 8.7.

Study Area and Imagery
Attention focused on an approximately 120 km² region in Rajasthan, India, see Figure 2. This region was selected as it is known to have a higher than average density of brick kilns [12] and so should furnish sufficient cases to both train and evaluate the classification methods. The machine learning methods were applied to the remotely sensed imagery displayed in Google Earth. The imagery, in RGB format, were downloaded for the analyses. The classifications were trained using imagery in the period 2003-2016 for the small region highlighted in the black box in Figure 2. The quality of the brick kiln classifications produced was evaluated using data from the broader study area, highlighted by the white box in Figure 2. For the accuracy assessment, attention focused on imagery of this region acquired in 2017 by the WorldView-2 system with a spatial resolution of approximately 0.5 m available in Google Earth. The region used for accuracy assessment contained 178 brick kilns, all identified by visual classification.

Analyses
The brick kilns were mapped from the imagery contained in Google Earth using contemporary classifiers. Given that the brick kilns are distinct objects of characteristic appearance, the conventional classifiers commonly used in remote sensing research that are based mainly on spectral information were not used. Instead, attention focused on contemporary machine learning methods that have the potential to identify objects in images using, especially, shape information. The basic nature of the methods is that they learn to identify the object of interest after presentation of a set of training examples. Key features of brick kilns to be learned relate to their size and shape as well as the presence of a central chimney that often casts a long shadow, see Figure 1. Kilns also need to be distinguished from other objects that may show similarity (e.g., road roundabouts), including abandoned kilns, see Figure 1b.
To meet the aims of the study, attention focused on three sets of analyses. First, the imagery were initially classified using the Faster R-CNN. Second, a reclassification of the outputs obtained from the Faster R-CNN was undertaken with a CNN. Third, the accuracy of the brick kiln classifications obtained from the two previous analyses was assessed using human visual interpretations obtained by crowdsourcing as ground reference data.

Classification with Faster R-CNN
The Faster R-CNN was used to detect the brick kilns in the imagery. The Faster R-CNN is a two-stage object detector, comprising a Region Proposal Network (RPN) module and a Fast R-CNN module. The RPN generates candidate object regions, and the Fast R-CNN module classifies these candidate object regions and refines their locations. Both modules share the same convolutional features, and the VGG16 net, which contains 13 convolutional layers, was used in this study [28].
Training data comprised brick kilns identified within a small region, highlighted by the black box in Figure 2, which was studied in detail. These training data were extracted from Google Earth, using imagery of the study area acquired within the period 2003-2016. All images were divided into 800 × 800-pixel sub-images. Those sub-images that contained a brick kiln, identified visually, were used to form the training data set for the Faster R-CNN. Since the machine learning methods used typically require large training sets, replicates were produced by rotating the selected images by 90°, 180° and 270° to further increase the number of training samples available. In total, 572 training images were acquired, and they provided 1084 representations of brick kilns to inform the learning process. Examples of the training images used are shown in Figure 3. The Faster R-CNN was trained end-to-end by the Stochastic Gradient Descent (SGD) approach, using the training imagery along with annotated bounding boxes around each brick kiln. During the training of the classifier, a proposed region that has an Intersection-over-Union (IoU) overlap higher than 0.7 with any ground reference box of brick kilns was considered as a sample of a brick kiln, while that which had an IoU ratio lower than 0.3 was considered as a sample of background (i.e., a member of the non-kiln class). With the trained Faster R-CNN, brick kilns in the whole study area can then be detected. To overcome the graphics processing unit memory bottleneck, the detection over the whole study area was performed using smaller sliding windows in the large image. The original image was scanned with the window at the size of 800 × 800 pixels, and an overlap region, based on typical kiln size, of 200 × 200 pixels was used. The non-maximum suppression method was finally applied to eliminate the repeated detection in the overlap regions. In total, the training time was approximately 22 h.
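The IoU rule used to label proposed regions during training can be illustrated with a short sketch. This is not the authors' implementation: the function names, the corner-coordinate box format and the treatment of proposals with an intermediate overlap (between 0.3 and 0.7) as unused are assumptions made here for illustration only.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposal(proposal: Box, kiln_boxes: Sequence[Box],
                   pos_thr: float = 0.7, neg_thr: float = 0.3) -> str:
    """Label one proposed region from its best overlap with any reference kiln box."""
    best = max((iou(proposal, k) for k in kiln_boxes), default=0.0)
    if best > pos_thr:
        return "kiln"        # IoU higher than 0.7: sample of a brick kiln
    if best < neg_thr:
        return "background"  # IoU lower than 0.3: sample of the non-kiln class
    return "unused"          # assumption: intermediate overlaps are left out
```

For example, a proposal shifted by one pixel relative to a 10 × 10-pixel reference box has an IoU of about 0.82 and would be treated as a kiln sample, whereas a proposal that does not touch any reference box would be treated as background.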
The focus in this work is on the areas identified as potentially containing a brick kiln by the Faster R-CNN with a probability >0.5, although other probability thresholds were explored. The end product from the application of the Faster R-CNN to the imagery was an output in which the potential location of proposals, sites that could potentially contain a brick kiln, was highlighted.
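The detection procedure described above (overlapping 800 × 800-pixel windows, a probability cut-off of 0.5 and non-maximum suppression) might be sketched as follows. The helper names are hypothetical, and two details are assumptions not given in the text: the final window along each axis is shifted to sit flush with the image edge, and detections are suppressed when their IoU with a higher-scoring detection exceeds 0.5.

```python
from typing import List, Tuple

# Assumed detection format: (x_min, y_min, x_max, y_max, score) in pixels.
Detection = Tuple[float, float, float, float, float]


def window_origins(length: int, window: int = 800, overlap: int = 200) -> List[int]:
    """Start offsets of overlapping windows covering [0, length) along one axis."""
    if length <= window:
        return [0]
    stride = window - overlap
    origins = list(range(0, length - window, stride))
    origins.append(length - window)  # assumption: last window flush with the edge
    return origins


def to_global(det: Detection, x0: int, y0: int) -> Detection:
    """Shift a window-local detection into whole-image coordinates."""
    x_min, y_min, x_max, y_max, score = det
    return (x_min + x0, y_min + y0, x_max + x0, y_max + y0, score)


def _iou(a: Detection, b: Detection) -> float:
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(dets: List[Detection], score_thr: float = 0.5,
                        iou_thr: float = 0.5) -> List[Detection]:
    """Keep detections with score > score_thr, dropping overlapping repeats."""
    kept: List[Detection] = []
    candidates = sorted((d for d in dets if d[4] > score_thr),
                        key=lambda d: d[4], reverse=True)
    for det in candidates:
        # Retain a detection only if it does not overlap a stronger one already kept.
        if all(_iou(det, k) <= iou_thr for k in kept):
            kept.append(det)
    return kept
```

With the window and overlap sizes stated in the text, the stride between windows is 600 pixels, so a kiln cut by the edge of one window should appear whole in a neighbouring window; the suppression step then removes the duplicate detection.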

Re-Classification with a CNN
The map produced from the Faster R-CNN analysis could be enhanced in a variety of ways. One is to refine the predictions with a further classification. Thus, a second analysis was undertaken based on the application of an additional classification achieved here by applying a CNN to the outputs that had been generated from the Faster R-CNN analysis.
The GoogLeNet was used as the CNN classifier [35]. The input to the CNN classifier was an image with the size of 221 × 221 pixels, and the output was the class that this image belonged to. Here, each input image was classified as either brick kiln or not. The same 572 images used to train the Faster R-CNN model were used to fine-tune the pre-trained GoogLeNet-based CNN classifier. Training images were prepared for each class. For the class of kiln, the 1084 tagged representations of kilns were used. A 221 × 221-pixel image centred on the brick kiln was extracted for each tagged kiln, and all 1084 extracted images constituted the training samples for the class of kiln. The 572 small training images were again fed into the Faster R-CNN model, and all proposal regions for kilns with a probability >0.5 were extracted. Among those proposal regions, there were many background regions that had been wrongly classified as brick kilns. These incorrectly classified regions, often called negative samples, were included as training samples of the non-kiln class. Moreover, some background regions that do not include kilns were randomly selected from these training images and also used as training samples of the non-kiln class. Examples of training samples for the kiln and non-kiln class are shown in Figure 4. In total, the CNN training time was approximately 6 h. Once trained, the CNN classification was applied to refine the class labelling for the sites identified as potentially containing kilns by the Faster R-CNN model. For each proposal region detected as a brick kiln by the Faster R-CNN, a 221 × 221-pixel image centred on the centre of the proposal region was extracted. This image was then re-classified by the CNN. Images are classified as either kiln or non-kiln, and this process acts to reduce the over-estimation of kilns by the initial Faster R-CNN classification.
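The extraction of a 221 × 221-pixel image centred on each proposal region can be illustrated with a minimal NumPy sketch. The function name and box format are hypothetical, and the handling of proposals near the image border (zero-padding so that the patch keeps its full size) is an assumption, as the text does not describe it.

```python
import numpy as np

PATCH = 221  # CNN input size in pixels, as stated in the text


def centred_patch(image: np.ndarray, box, size: int = PATCH) -> np.ndarray:
    """Cut a size x size patch whose centre is the centre of `box`.

    `image` is an (H, W, 3) RGB array; `box` is (x_min, y_min, x_max, y_max).
    Assumption: the image is zero-padded so patches near the border keep
    the requested size.
    """
    half = size // 2
    cx = int(round((box[0] + box[2]) / 2))  # column of the proposal centre
    cy = int(round((box[1] + box[3]) / 2))  # row of the proposal centre
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="constant")
    # After padding, original pixel (cy, cx) sits at (cy + half, cx + half),
    # so rows cy .. cy + size - 1 of the padded array are centred on it.
    return padded[cy:cy + size, cx:cx + size]
```

Each patch produced in this way could then be passed to the second-stage classifier, with only the proposals labelled as kiln retained in the final map.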

Accuracy Assessment
The quality of the classifications obtained from the Faster R-CNN and that refined by the application of the CNN were assessed. This analysis was based on visual interpretation of the proposal regions highlighted in the outputs of the two machine learning methods. The image interpreters were those who had their labelling of brick kilns validated via a crowdsourcing analysis and hence were regarded as accurate labellers [12]. Similar to [12], each proposal region was labelled as kiln or non-kiln by human interpretation.
The quality of each of the brick kiln mappings obtained was estimated using standard measures of accuracy. A key focus was on the producer's accuracy or sensitivity of brick kiln mapping (the conditional probability that an area actually containing a brick kiln is labelled as a brick kiln) and the errors of commission (cases that are non-kiln but mapped as kiln) and omission (cases of a brick kiln that were not mapped as such). Critical to the broader project in which this work is based is that errors of omission are regarded as being particularly severe; a map that over-predicts brick kilns but rarely if at all omits kilns is of greater value to anti-slavery groups working on the ground.
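The accuracy measures referred to above can be written compactly in terms of four counts. The following is a minimal sketch; the class and field names are hypothetical, the user's accuracy is included only as the usual complement of the commission error, and the example counts are invented for illustration and are not results from this study.

```python
from dataclasses import dataclass


@dataclass
class KilnCounts:
    true_kiln: int      # kilns correctly mapped as kiln
    commission: int     # non-kiln cases mapped as kiln
    omission: int       # kilns not mapped as kiln
    true_non_kiln: int  # non-kiln cases correctly labelled non-kiln

    def producers_accuracy(self) -> float:
        """Share of actual kilns that were mapped as kiln (sensitivity)."""
        return self.true_kiln / (self.true_kiln + self.omission)

    def users_accuracy(self) -> float:
        """Share of cases mapped as kiln that really are kilns."""
        return self.true_kiln / (self.true_kiln + self.commission)

    def overall_accuracy(self) -> float:
        """Share of all assessed cases that were labelled correctly."""
        total = (self.true_kiln + self.commission
                 + self.omission + self.true_non_kiln)
        return (self.true_kiln + self.true_non_kiln) / total


# Invented counts, for illustration only.
example = KilnCounts(true_kiln=8, commission=4, omission=2, true_non_kiln=6)
print(f"producer's accuracy: {example.producers_accuracy():.2f}")
print(f"user's accuracy:     {example.users_accuracy():.2f}")
print(f"overall accuracy:    {example.overall_accuracy():.2f}")
```

A map with high producer's accuracy and low user's accuracy corresponds to the situation regarded here as the more tolerable one: few kilns are missed, at the cost of some wasted visits to sites that are not kilns.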

Results and Discussion
The output of the Faster R-CNN is, essentially, an image with a set of proposals for brick kilns highlighted by rectangular bounding boxes, see Figure 5. Each proposal is predicted to contain a brick kiln with an estimated probability greater than a predefined threshold. For the test site used to evaluate classifier performance, see Figure 2, a series of output thresholds were explored. In trial analyses, when a high probability such as 0.8 was used, proposals were highly accurate but it was evident that many kilns were being omitted. With the 0.5 probability threshold, 366 brick kilns were predicted for the region used in evaluating classification outputs. This was a substantial over-estimate but, critically, was associated with zero omission error, see Table 1. With all of the 178 kilns located in the area used to evaluate classification accuracy included in the map output, the producer's accuracy for kiln mapping was 100%. Although the test site is small, this result does suggest a potential for machine learning as a tool for mapping brick kilns over the entire Brick Belt. The one drawback in the results was the large error of commission, with 188 cases that were not actually brick kilns classified as brick kilns. However, these latter errors could easily be reduced or even removed by adding a second classifier. For example, all 366 proposals could be presented to human interpreters and labelled. Although calling on human input, this demand is highly focussed, with the human interpreters looking at only a very small proportion of the area. The outputs of the Faster R-CNN analysis could also be fed into a second automated classifier. This was explored using a CNN as the second stage classifier. It was evident that the addition of a second automated classification to the original Faster R-CNN output was very effective at removing most of the commission errors that had been committed by the Faster R-CNN. 
Specifically, of the commission errors arising from the Faster R-CNN classification, all but nine were removed and labelled as non-kiln, see Table 2. A price paid for this, however, was an increase of omission errors to nine. The producer's accuracy for brick kilns was, therefore, 94.94%, but the classification had both low omission and commission errors. Again, visual classification could be used to further enhance the results if required. Critically, however, the accuracy of the classification arising from the CNN applied to the output of the Faster R-CNN analysis is such that users on the ground would waste little time at mislabelled sites. In this study, some tolerance to error in the automated classification is provided, notably by the ability to usefully combine automated classification with highly targeted visual classification. A key feature in the analysis is the omission errors, sites of potential slavery activity that are missed and hence would not attract the attention of anti-slavery work on the ground. Of particular note is that none of the brick kilns were omitted in the output of the Faster R-CNN. The large commission error, however, could lead to wasteful and inefficient anti-slavery activity on the ground. The use of a second classification, whether by manual interpretation or automated, could greatly reduce the commission errors while causing only small omission errors. Critically, the results indicate that contemporary machine learning methods may be used to accurately identify sites known to be strongly associated with slavery from remote sensing images and used to help achieve UN SDGs. The next stage of our work is to refine the approaches further but also to develop a wall-to-wall map of the entire Brick Belt in a quick and efficient way to aid anti-slavery actions on the ground. 
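The accuracy values quoted for the two stages can be checked from the counts given in the text. The arithmetic below is a sketch under one assumption: that the 366 proposals output by the Faster R-CNN form the set of cases over which the accuracy of the CNN stage is assessed (178 kilns and 188 non-kiln proposals, with nine errors of each type remaining after the CNN).

```latex
\begin{align*}
\text{Faster R-CNN:}\quad
  & \text{producer's accuracy} = \frac{178}{178} = 100\%, \\
  & \text{commission errors} = 366 - 178 = 188, \\
  & \text{proportion of proposals that are kilns} = \frac{178}{366} \approx 48.63\%. \\[6pt]
\text{Faster R-CNN + CNN:}\quad
  & \text{kilns retained} = 178 - 9 = 169, \\
  & \text{producer's accuracy} = \frac{169}{178} \approx 94.94\%, \\
  & \text{non-kilns correctly rejected} = 188 - 9 = 179, \\
  & \text{overall accuracy} = \frac{169 + 179}{366} = \frac{348}{366} \approx 95.08\%.
\end{align*}
```

Under this assumption, the second classifier removes 179 of the 188 commission errors at the cost of nine omissions, which is the trade-off discussed above.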
Potential avenues of research focus on refinements of the network structure used in the Faster R-CNN and the additional CNN classifier, as many novel network structures have been proposed recently. Other CNN-based object detection frameworks, such as the single shot multibox detector (SSD) [36] and you only look once (YOLO) [27], could also be used for comparison.
The parameterisation of these models should also be further explored for optimal classification. In addition, expansion of the training samples, perhaps including brick kilns with different shape and appearance, together with the use of more background regions that have a similarity with brick kilns could help enhance classification accuracy and generalizability. A clear potential for accurate mapping of brick kilns over large regions exists. Moreover, with the aid of the open Landsat data archive, we aim to explore the spatio-temporal pattern in brick kiln distribution to aid understanding of the brick manufacturing industry in the region. It must, however, be stressed that these approaches focus on a proxy variable for slavery by identifying kilns and not slaves directly, and hence additional work is also required to ensure that interventions on the ground recognise that some kilns mapped may use no slave labour at all. Finally, it should be stressed that the technologies used also have a real potential to be used in the detection of other activities that are known to use slavery (forced labour), for example, mining and illegal logging [37]. Further, the technologies also provide a means to study the associated environmental improvements that should arise from addressing slavery, enabling the freedom dividend to be characterised and quantified as well as contributing to other UN SDGs (for example UN SDGs 12 and 15).

Conclusions
Information on brick kilns in the Brick Belt is required to support anti-slavery activities. Given the large geographical extent of the Brick Belt, remote sensing has considerable potential as a source of data to inform efforts to detect and map brick kilns. Previous work has shown that visual interpretations obtained via crowdsourcing allow brick kilns to be accurately mapped from fine spatial resolution satellite sensor images such as those available in Google Earth. The ability to maintain a motivated crowd for accurate mapping of large areas is, however, a challenge and the time taken to produce the maps is a potential limitation. Here, the potential to use recent advances in key technologies for mapping brick kilns was explored.
Approaches for brick kiln detection that are fully or largely automated were shown to be able to produce accurate maps of brick kilns. Specifically, a fully automated approach using the Faster R-CNN yielded a map with a producer's accuracy for kilns of 100%. The high commission error associated with this classification could be a limitation, but this could be reduced by the application of a second classifier. Given that outputs of the initial classification allow activity to be focussed on a very small proportion of the region of study, visual classification and crowdsourcing should be feasible and able to manually adjust the outputs of the original classification. Alternatively, for a fully automated mapping approach, a further digital classifier may be applied. Here, it was shown that the application of a CNN classifier to the original outputs of the Faster R-CNN yielded a highly accurate map of brick kilns, with an overall accuracy of 95.08% and a producer's accuracy for brick kilns of 94.94% that had both low omission and commission errors. These results suggest considerable potential for mapping large areas quickly and accurately, which should help support anti-slavery activity on the ground and thereby aid the achievement of the relevant UN Sustainable Development Goals.