Classification of Oil Slicks and Look-Alike Slicks: A Linear Discriminant Analysis of Microwave, Infrared, and Optical Satellite Measurements

We classify low-backscatter regions observed in Synthetic Aperture Radar (SAR) measurements of the surface of the ocean as either oil slicks or look-alike slicks (radar false targets). Our proposed classification algorithm is based on Linear Discriminant Analyses (LDAs) of RADARSAT-1 measurements (402 scenes off the southeast coast of Brazil from July 2001 to June 2003) and Meteorological-Oceanographic (MetOc) data from other earth observation sensors: Advanced Very High Resolution Radiometer (AVHRR), Sea-Viewing Wide Field-of-View Sensor (SeaWiFS), Moderate Resolution Imaging Spectroradiometer (MODIS), and Quick Scatterometer (QuikSCAT). Oil slicks are sea-surface expressions of exploration and production oil, ship- and orphan-spills. False targets are associated with environmental phenomena, such as biogenic films, algal blooms, upwelling, low wind, or rain cells. Both categories have been interpreted by domain-experts: mineral oil (n = 350; 45.5%) and petroleum free (n = 419; 54.5%). We explore nine size variables (area, perimeter, etc.) and three types of MetOc information (sea surface temperature, chlorophyll-a, and wind speed) that describe the 769 samples analyzed. Seven attribute–domain combinations are tested with three non-linear transformations (none, cube root, log10), with and without MetOc, adding to 39 attribute subdivisions. Classification accuracies are independent of data transformation and improve when selected size attributes are combined with MetOc, leading to overall accuracies of ~80% and sound levels of sensitivity (~90%), specificity (~80%), positive (~80%) and negative (~90%) predictive values. The effectiveness of this data-driven attempt supports further commercial or academic implementation of our LDA algorithm.


Introduction
The presence and development of oil and gas exploration and production in open oceanic waters of Brazil have led to many environmental oil-related incidents over time, and two major episodes have occurred since the eve of the current millennium. In 2001, the world's largest floating offshore [...] The first process proposes polygons with oil slicks or petroleum-free candidates (e.g., [19]), and the other two build on that. While some scientific effort has been put into investigating non-linear techniques for discriminating between polygons that contain oil and those that do not (e.g., [20]), only recently have Linear Discriminant Analyses (LDAs) been employed to automatically distinguish seeps from spills (e.g., [21][22][23][24]).
Based on the seep-spill discrimination findings of [18], in this paper we extend the methodological recommendations of [21][22][23][24] with the objective of classifying regions where sea-surface backscatter in SAR measurements is low as either mineral oil slicks or other environmental petroleum-free false targets (i.e., oil vs. look-alikes). For this, we use an algorithm that exploits LDAs of a set of satellite measurements (microwave, infrared, and optical) off the southeast coast of Brazil (Figure 1). Within the scientific setting of our study, we use an existing database to seek answers to six questions:

1. Is a simple, linear, multivariate data analysis technique able to discriminate between oil slicks and petroleum-free slicks?
2. Is it feasible to reach classification accuracy levels to support operational implementations (commercial or academic) of our proposed algorithm?
3. Does the application of non-linear data transformations affect the oil and look-alike discrimination?
4. Can the sole use of Meteorological-Oceanographic (MetOc) satellite information distinguish oil from false targets?
5. Is there any specific combination of attributes that leads to a superior discrimination between oil slicks and slick-alikes?
6. Is our LDA-developed algorithm applicable to other regions?

Human-Dependent Operational Guidelines
The ability to discriminate between seeps and spills using the synoptic view of satellites has long been an objective at the Laboratory of Radar Remote Sensing Applied to the Petroleum Industry (LabSAR) of the Federal University of Rio de Janeiro (UFRJ, Brazil). For about two decades, LabSAR has provided a valuable tool to oil and gas operators: the most probable location of offshore petroleum systems based on satellite imagery analyses (e.g., [25]). However, these were operational projects that relied on manual approaches, i.e., they depended on human intervention. The contrast between the widely used manual seep-spill image inspection processes and newly developed automatic methods has been the focus of recent academic studies (e.g., [18]). Within this scope, a fresh take on an old, well-established problem has emerged, as described below.

Initial Automated Procedure: Carvalho
In this section we summarize the past research and results of [18], who developed an automated procedure to classify sea-surface expressions of mineral oil slicks into naturally seeped oil or operational oil spills with a linear multivariate analysis technique applied to SAR measurements, i.e., LDAs applied to RADARSAT-2 measurements from the Gulf of Mexico (Campeche Bay, Mexico). While [26,27] described the Mexican dataset used in [18], the bases of the exploratory analysis of [18] are discussed in depth in [21]. A single non-linear transformation was tested and applied to the data: log10. Two distinct methods were used to select the most relevant variables: Correlation-Based Feature Selection (CFS; [28]) and Unweighted Pair Group Method with Arithmetic Mean dendrograms (UPGMA; [29]). The latter uses two user-defined thresholds: Pearson's r correlation coefficients of 0.5 and ~0.9 [18,21]. The best overall seep-spill discrimination accuracy was about 70% with sensitivity (~80%), specificity (~75%), positive (~65%) and negative (~75%) predictive values. However, a linear transformation (Principal Component Analysis; PCA) was used to reduce the dimensionality of their selected variables, and as such, the "scores" of the relevant axes (i.e., principal components; PCs) were input into their LDAs. Additionally, by exploiting the entire attribute set, including particular contextual site-specific variables (e.g., latitude and longitude), they reached an almost faultless differentiation of 99.98%. Conversely, it was not possible to discriminate seeps from spills when the SAR-signature attributes were calculated with uncalibrated Digital Number (DN) values.
In this paper, we refer to the work of [18] and [21] simply as "Carvalho". To summarize, Carvalho has demonstrated two particularly relevant issues:
1. The feasibility of automatically separating oil (seeps) from oil (spills) using a simple, classical, linear classification method, i.e., LDA; and
2. The possibility of achieving an effective seep-spill discrimination by exploiting two straightforwardly calculated basic morphological characteristics of oil slicks (area and perimeter; after using a PCA), calculated from satellite measurements.
In a subsequent investigation, [22,23] refined Carvalho's research in a more controlled manner. They applied eight non-linear transformations to the data: none ($x$), reciprocal ($1/x$), logarithm base 10 ($\log_{10}(x)$), Napierian logarithm ($\ln(x)$), square root ($x^{1/2}$), square power ($x^{2}$), cube root ($x^{1/3}$), and cube power ($x^{3}$). Four methods were tested for selecting uncorrelated attributes based on the UPGMA, which was preferred over the automated CFS because of its user-defined capabilities: (1) no UPGMA without PCA (i.e., original correlated data); (2) no UPGMA with PCA; (3) UPGMA without PCA; and (4) UPGMA with PCA (as in Carvalho). The UPGMA in these cases used a stricter threshold (0.3 > r > −0.3), deeming variables to be uncorrelated at this level based on the number of samples [30]. The best discrimination accuracies occurred with attribute selection method #1 (but this is not valid as it uses correlated variables), then #2 (PCA directly from the original data), closely followed by #3 (UPGMA alone), with #4 (UPGMA+PCA) being the least accurate. These results showed that the sole use of dendrograms (with the strict threshold, thus eliminating the application of PCAs, as proposed by Carvalho) is sufficient to effectively discriminate seeps from spills. The best data transformations to discriminate the oil slick category are log10 and cube root, both producing classification accuracies similar to Carvalho.
Follow-up research by [24] also investigated ways to improve the LDA seep-spill classification. Variables were selected with the strict UPGMA threshold used in [22,23]. The two best non-linear transformations were compared with the original data. They showed that with no transformation applied, no effective discrimination was achieved. On the other hand, when the data were non-linearly transformed, the ability to discriminate was comparable to Carvalho, with log10 being somewhat superior to cube root.
Together, the work reported by [22,23] and [24] is hereafter referred to as "Carvalho et al.". Their major contributions are as follows:
1. The superiority of non-linear data transformations: log10 and cube root;
2. [...]

Comparing Gulf of Mexico and Campos Basin Studies
The foremost characteristics of the LDA usage that can be highlighted between these previous works and our current paper are: [...]
To further address the issues revealed in the automated LDA seep-spill discrimination, in this paper we focus on investigating the application of such a classical, linear, multivariate data analysis technique to tell apart oil slicks and look-alikes. The evolution of the concepts considered here is given below.

Study Area
The slicks investigated here (oil and look-alikes) are from a region off the southeast coast of Brazil: the Campos Basin (Figure 1). A large number of oil and gas exploration and production facilities are located in this basin, making it a province of significant politico-economic and socio-ecological relevance [31]. Since the mid-2000s, with the discovery of supergiant reservoirs of light hydrocarbons beneath the salt layers, the major petroleum-related infrastructure of the Campos Basin has been improved, and its worldwide economic relevance has also increased; currently, 38 operational oilfields are responsible for providing 41.5% of Brazil's oil and natural gas production: 1,373,068 barrels of oil equivalent per day [32].
The Campos Basin has a very dynamic environment that is subject to highly variable weather conditions. The South Atlantic Subtropical Anticyclone governs the large-scale atmospheric circulation pattern that keeps a sustained northeast-quadrant wind in the southeastern Brazilian coast area. This dominant wind direction, together with the abrupt change in shoreline orientation and the occurrence of the South Atlantic Central Water, triggers strong upwelling events around the Cabo Frio and Cabo de São Tomé region northeast of Guanabara Bay (Rio de Janeiro), thus increasing the local primary biological productivity [33]. Conversely, during boreal winters, upon the incidence of intense southwest-quadrant winds associated with cold fronts, downwelling can be induced, and less biologically productive seas may also be accompanied by rough waves up to 10 m high. Year-round mesoscale phenomena influencing this region are the frequently observed oceanic cyclonic vortices and meanderings of the Brazilian Current [34].

Database
A comprehensive tabular dataset generated by [17] is used; it has also been exploited by [35]. Figure 2 shows the sampling distribution of the available mineral oil and petroleum-free slicks (n = 769), and illustrates the extensive range of classes of the SAR-derived low-backscatter regions. The fossil fuel pollution records (n = 350; 45.5%) correspond to the sea-surface expression of a variety of petroleum-slick sources: mineral oil from known exploration and production installations, ship- and orphan-spills (the latter refers to confirmed oil slick cases from unidentified sources). The radar false target instances (n = 419; 54.5%) are associated with an assortment of environmental petroleum-free phenomena: biogenic films, algal blooms, upwelling, low wind speeds, or rain cells. This class diversity is a relevant aspect, especially because of the highly dynamic MetOc characteristics of the Campos Basin [33,34]. All records of both categories (oil and look-alikes) are the decisions of trained personnel who are specialists in interpreting satellite imagery in this area. Auxiliary MetOc data have been used to help corroborate the domain experts' interpretations.

RADARSAT-1
This database comprises 402 RADARSAT-1 scenes recorded at 8-bit resolution (transmitted and received at horizontal polarization; HH) that were collected over two years, from July 2001 to June 2003. These are path-oriented images from three beam modes: ScanSAR Narrow A (SCNA), ScanSAR Narrow B (SCNB), and Extended Low 1 (EXTL1) [36]. The ground resolution of the available imagery has been re-sampled to 100 m to improve the segmentation process [17].

Stages to Detect Oil and Look-Alikes in Satellite Imagery
This satellite database was built in three stages [17]. In the first stage, the remote sensing images containing potential oil and look-alike candidates were selected. RADARSAT-1 imagery was analyzed in conjunction with contextual conditions, i.e., concurrent meteo-oceanographic ancillary data (see Section 2.2.2.2). Radar images were pre-processed for spatial and radiometric corrections.

Figure 2. Sampling characteristics of the database that contains information from regions with low Synthetic Aperture Radar (SAR) backscatter observed on the surface of the ocean [17]. The available SAR-derived targets are divided into two categories: mineral oil slicks and other environmental phenomena (non-petroleum signals); the latter are frequently referred to as radar false targets or "slick-alikes". The respective classes of each category are also shown.
The second database construction stage consisted of an image segmentation procedure performed using a multiple resolution segmentation approach [37,38] to identify the borders of the polygons containing low-backscatter radar signals.
The third stage defined and computed the attributes describing the individualized targets that came out of the segmentation. Several representative attributes of different types were calculated for each identified polygon. Firstly, these types were divided into SAR-signature, textural, geolocation, and SAR-scene. The four SAR-signature attributes (e.g., coefficient of variation: ratio between standard deviation and mean) and two textural variables (i.e., contrast and entropy) were calculated from uncalibrated measures, i.e., DNs, which express the backscatter count of the pixels of each scene: 0 to 255 for 8-bit images [39]. There were twelve site-specific location attributes (e.g., bathymetry, target distance from the coast and from platforms, etc.) and three SAR scene-related attributes (e.g., number of identified targets per scene). Secondly, two other attribute types were also considered: those related to the morphological characteristics of the segmented polygons and those representing the observed contextual conditions. Both are explained in the sections that follow.

Geometry, Shape, and Dimension Variables
A set of basic morphological attributes describing the SAR-derived polygons (oil and look-alikes) included area, perimeter (Per), shape index ($\mathrm{SHP} = (\mathrm{Per}/4)\cdot(\mathrm{Area}^{1/2})$), compact index ($\mathrm{CMP} = (4\pi\,\mathrm{Area})/\mathrm{Per}^{2}$), asymmetry ($\mathrm{ASY} = 1 - (W/L)$), length-to-width ratio ($\mathrm{LtoW} = L/W$), density ($\mathrm{DEN} = n^{1/2}/(1 + (\mathrm{var}(x) + \mathrm{var}(y))^{1/2})$), curvature (CUR), and number of parts of each target (NUM); in which $W$ and $L$ are the width and length of the polygons, $n$ is the number of pixels in the identified target, and $\mathrm{var}(x)$ and $\mathrm{var}(y)$ are the variances in x and y (longitude and latitude, respectively), both calculated with the covariance matrix of the number of pixels. CUR is the sum of the variations of a principal imaginary line direction equidistant to the longest side of the analyzed polygon, expressed in degrees [17]. Further details on these attributes are found in [40]. Hereafter, the geometry, shape, and dimension features are referred to as size information.
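To make some of the definitions above concrete, the following minimal Python sketch computes CMP, ASY, LtoW, and DEN for a single polygon. The function names, the input layout (lists of pixel coordinates), and the use of the population variance are assumptions made for illustration; they are not taken from [17] or [40].

```python
import math

def variance(values):
    """Population variance (assumed here; the source does not specify it)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def size_attributes(area, perimeter, width, length, xs, ys):
    """Compute CMP, ASY, LtoW, and DEN for one polygon.

    xs and ys are hypothetical lists of the pixel coordinates of the target.
    """
    n = len(xs)  # number of pixels in the target
    return {
        "CMP": (4.0 * math.pi * area) / perimeter ** 2,
        "ASY": 1.0 - width / length,
        "LtoW": length / width,
        "DEN": math.sqrt(n) / (1.0 + math.sqrt(variance(xs) + variance(ys))),
    }

# Illustrative input: a circle of radius 1 sampled by four pixels.
circle = size_attributes(
    area=math.pi, perimeter=2.0 * math.pi,
    width=2.0, length=2.0,
    xs=[0, 1, 0, -1], ys=[1, 0, -1, 0],
)
print(circle)
```

For a circle the compact index equals 1 and the asymmetry equals 0, which gives a quick sanity check of the formulas.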

Meteorological and Oceanographic (MetOc) Information
The database includes five MetOc variables: sea surface temperature (SST), concentration of chlorophyll-a (CHL), wind (speed and direction), and clouds (presence or absence). The SST magnitude was retrieved from AVHRR onboard the National Oceanic and Atmospheric Administration (NOAA) series satellites (12, 14, 15, and 16) and calculated with the Non-Linear SST (NLSST) algorithm [41]. The CHL magnitude was retrieved from either SeaWiFS (onboard the OrbView-2 satellite) or MODIS (onboard the Terra satellite), both calculated with the global Ocean Color 4 (OC4) algorithm [42]. The magnitude of the wind field was obtained from the SeaWinds scatterometer flying on the Quick Scatterometer (QuikSCAT) satellite with a demonstrated accuracy of <2 m/s and 20° [43]; whenever available, these were cross-validated with in situ wind measurements from local offshore facilities. The occurrence of clouds over the polygons was obtained from the SST maps. While the nominal spatial resolution of SST and CHL values is ~1 km at the centre of the swath, the wind data have a ~25 km footprint.
The MetOc information was used in two stages of the target identification process (see Section 2.2.2): in the first stage to assist in the image selection (as environmental contextual charts) and in the third stage as contextual attributes expressing the observed targets' characteristics. In the latter, SST, CHL, and wind speed (WND) were catalogued in three forms: a more intuitive form, i.e., the average value within the polygons' limits, and two other forms calculated using the inside and outside (20 km buffer zone) averaged values: the difference and the ratio between in and out. The presence (1) or absence (0) of clouds was registered as discrete records.
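The three catalogued forms can be sketched as follows. This is an illustration only: the function name, the sample values, and the direction of the difference and ratio (inside relative to outside) are assumptions, since the text does not state them.

```python
def metoc_forms(inside_values, outside_values):
    """Return the three forms of one MetOc variable for one polygon.

    inside_values: samples within the polygon limits.
    outside_values: samples within the 20 km buffer zone.
    Assumption: difference = in - out and ratio = in / out.
    """
    mean_in = sum(inside_values) / len(inside_values)
    mean_out = sum(outside_values) / len(outside_values)
    return {
        "mean_in": mean_in,
        "difference": mean_in - mean_out,
        "ratio": mean_in / mean_out,
    }

# Made-up SST samples in degrees Celsius.
sst = metoc_forms([24.0, 26.0], [20.0, 20.0])
print(sst)
```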

Research Strategy
A pictorial view of the research strategy explored to develop and evaluate our LDA algorithms is shown in Figure 3: quality control (QC), attribute-domain subdivisions, data transformations, feature selection, LDAs, and accuracy assessment. An open-access software package was used in our data mining exercises: PAST (PAleontological STatistics; [44,45]).

Phase 1: Quality Control (QC)
At the start, to certify that the database met the conditions needed to accomplish the most accurate possible discrimination, we performed what we refer to as QC-standards:
1. Verification of the reliability of the database records with respect to data inconsistencies, i.e., removal of any sort of errors (for example, instances with a missing value for any given attribute, obvious outliers, noisy data, etc.);
2. Evaluation of the attribute types for their suitability for our purposes; and
3. Inspection of correlation matrices to avoid inter-correlation, as LDAs require the smallest correlation among the candidate variables [46].
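The first and third QC-standards can be illustrated with a short Python sketch. The record layout, the attribute values, and the 0.3 flagging level are assumptions for illustration; the analyses in this paper were carried out in PAST, not with this code.

```python
import math

def pearson_r(x, y):
    """Pearson's r correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_incomplete(records):
    """Keep only records with no missing (None) attribute value."""
    return [r for r in records if all(v is not None for v in r.values())]

def correlated_pairs(records, threshold=0.3):
    """List attribute pairs whose |r| reaches the (assumed) threshold."""
    names = sorted(records[0])
    columns = {k: [r[k] for r in records] for k in names}
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs

# Made-up records; one has a missing perimeter and is removed.
records = [
    {"area": 1.0, "perimeter": 2.1, "CMP": 0.9},
    {"area": 2.0, "perimeter": 3.9, "CMP": 0.2},
    {"area": 3.0, "perimeter": 6.2, "CMP": 0.7},
    {"area": 4.0, "perimeter": None, "CMP": 0.5},
    {"area": 5.0, "perimeter": 9.8, "CMP": 0.4},
]
clean = drop_incomplete(records)
flagged = correlated_pairs(clean)
print(len(clean), flagged)
```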

Phase 2: Attribute-Domain Subdivisions
As in the seep-spill LDA differentiation discussed in Section 1.1, we followed the same pathways to investigate whether there were combinations of variables that better discriminated oil from look-alikes. As such, after performing the QCs, we divided the attribute set into various small, specific subdivision domains based on the previous experiences of Carvalho [18,21] (Section 1.1.2), Carvalho et al. [22][23][24] (Section 1.1.3), and [17] (Section 2.2.2.1). Likewise, to inspect the influence of the MetOc information in this process, we performed separate analyses with and without the MetOc data.
Figure 3. Research strategy for the evaluation of linear multivariate analysis algorithms aimed at classifying information from a dataset of SAR-derived, low-backscatter regions into mineral oil slicks or other environmental look-alike targets (non-petroleum signals). The six phases are described in the text, Sections 2.3.1-2.3.6. "Carvalho" refers to [18,21], see Section 1.1.2. "Carvalho et al." corresponds to [22][23][24], see Section 1.1.3. "Bentz" is associated with [17], see Section 2.2.2.1.

Phase 3: Data Transformations
Carvalho et al. demonstrated that the LDA ability to discriminate oil (seeps) from oil (spills) is positively influenced by the application of non-linear transformations, i.e., cube root and log10. Here, we compared the ability to distinguish oil slicks from slick-alikes using the original Campos Basin data with and without applying the two best data transformations they reported. This was done in all subdivisions defined in Phase 2.
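A minimal Python sketch of the three data treatments compared here (none, cube root, log10) is given below. The names are hypothetical. The text does not say how non-positive values (for example, negative in/out differences) were handled under log10, so this sketch leaves that to the caller.

```python
import math

def cube_root(x):
    """Real cube root that also works for negative inputs."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

TRANSFORMS = {
    "none": lambda x: x,
    "cube_root": cube_root,
    "log10": math.log10,  # defined only for x > 0
}

def transform_column(values, name):
    """Apply one named transformation to every value of an attribute."""
    f = TRANSFORMS[name]
    return [f(v) for v in values]

# Made-up polygon areas.
areas = [1.0, 8.0, 1000.0]
areas_cbrt = transform_column(areas, "cube_root")
areas_log = transform_column(areas, "log10")
print(areas_cbrt, areas_log)
```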

Phase 4: Feature Selection
Commonly referred to as "feature engineering", in which relevant attributes are selected to be applied in the classification system, this process also reduces the attribute dimensionality [47]. Our feature selection consisted of analyses of UPGMA dendrograms, carried out separately on each attribute-domain combination (Phase 2) for all data transformations (Phase 3). The interpretation of dendrograms is simple, although the level at which uncorrelated variables are selected is subjectively defined by the user. Visual analyses are a common practice, but generally, horizontal lines drawn across the dendrograms are used to form groups of correlated variables from which only one is selected to represent each group, ensuring there is no correlation among the selected variables; such lines are called phenon lines and are user-defined similarity cut-offs [48]. Here, to use as few correlated variables as possible in the LDA [46], we applied Pearson's r correlation coefficients to define the level from which uncorrelated variables were selected: 0.3 > r > −0.3 (see Section 1.1.3 [22][23][24]).
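The grouping logic behind a phenon line can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PAST procedure: similarity is taken as |r|, clusters are merged by average linkage (UPGMA) until no pair of clusters has a mean |r| of 0.3 or more, and the first member of each group is kept as its representative. The correlation values are made up.

```python
def upgma_groups(names, sim, cutoff=0.3):
    """Group variables by average linkage on |r| down to the cutoff."""
    clusters = [[n] for n in names]

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean of all pairwise similarities.
        return sum(sim[a][b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        candidates = [
            (avg_sim(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        s, i, j = max(candidates)
        if s < cutoff:  # the phenon line has been reached
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical pairwise correlations between four attributes.
names = ["area", "perimeter", "CMP", "WND"]
pairs = {
    ("area", "perimeter"): 0.95, ("area", "CMP"): -0.10,
    ("area", "WND"): 0.05, ("perimeter", "CMP"): -0.20,
    ("perimeter", "WND"): 0.10, ("CMP", "WND"): 0.15,
}
sim = {a: {} for a in names}
for (a, b), r in pairs.items():
    sim[a][b] = sim[b][a] = abs(r)

groups = upgma_groups(names, sim)
selected = [g[0] for g in groups]  # one representative per group
print(groups, selected)
```

Which member represents each group is a user choice; taking the first one is only a convenience in this sketch.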

Phase 5: Linear Discriminant Analyses (LDAs)
Because of the promising use of a linear, parametric, multivariate analysis method to automatically discriminate seeps from spills, as discussed above in Section 1.1, we also used LDAs to design an algorithm to identify two distinct categories: oil slicks vs. slick-alikes. LDAs have two main prerequisites:
• The candidate variables must have the least possible inter-correlation [46]; this has been addressed above (Phases 1 and 4); and
• The data must contain dichotomous information (in our case, oil and look-alikes) that is used to reach (and corroborate) the models' classification accuracy; this is dealt with below (Phase 6), and indeed, these mutually exclusive, a priori known labels are used to fine-tune our supervised learning application [49].
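For readers unfamiliar with the technique, a two-class linear discriminant can be sketched in plain Python as below. It is a textbook Fisher discriminant with a pooled within-class scatter matrix and a decision threshold at the midpoint between the class means (equal priors assumed); it is not the PAST implementation used in this study, and the training points are made up.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]

def pooled_scatter(groups, means):
    """Within-class scatter matrix summed over both classes."""
    p = len(means[0])
    s = [[0.0] * p for _ in range(p)]
    for rows, m in zip(groups, means):
        for r in rows:
            d = [r[k] - m[k] for k in range(p)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_lda(oil, lookalike):
    """Return the discriminant weights w and the midpoint threshold c."""
    m_oil, m_la = mean_vector(oil), mean_vector(lookalike)
    s_w = pooled_scatter([oil, lookalike], [m_oil, m_la])
    w = solve(s_w, [a - b for a, b in zip(m_oil, m_la)])
    midpoint = [(a + b) / 2.0 for a, b in zip(m_oil, m_la)]
    c = sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, c

def classify(x, w, c):
    score = sum(wi * xi for wi, xi in zip(w, x))
    return "oil" if score > c else "look-alike"

# Made-up, well-separated training samples with two attributes each.
oil = [[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]]
lookalike = [[6.0, 7.0], [7.0, 6.0], [7.0, 8.0], [8.0, 7.0]]
w, c = fit_lda(oil, lookalike)
print(classify([2.0, 2.5], w, c), classify([7.5, 7.0], w, c))
```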

Phase 6: Accuracy Assessment
The LDAs performed in Phase 5 were individually evaluated with all 769 targets in the database of oil and look-alike slicks (Figure 2). By not withholding samples for a separate test set, we gave the models the best possible conditions for reaching the lowest out-of-sample errors. Yet, by using all samples to train the classification model, we run the risk of obtaining high training errors (i.e., our classification misidentifies too many targets), which would render our algorithms null and void. On the other hand, if the overall errors are low (i.e., our classification identifies most samples of both categories correctly), our model is successful.
The accuracy assessment of classification algorithms in data science investigations is generally quantified using confusion matrices, i.e., two-by-two tables [50]. In our matrices, the reference data are in the horizontal and the classified data in the vertical: in Table 1, rows are the a priori known classification and columns are the model outcome. A common metric to assess the correct classification of both categories is the overall accuracy, expressed as a percentage. It is calculated by adding the diagonal elements of Table 1, i.e., correctly classified oil slicks (A) and correctly classified look-alikes (D), and then dividing the sum by the total number of samples (769 in our case).
Nevertheless, the use of this metric alone may give the wrong impression about the true reliability of the algorithm [51][52][53]. This can be avoided by invoking supplementary statistical measures, which are calculated from "horizontal" (Table 2) and "vertical" (Table 3) analyses of the confusion matrix (Table 1). The information given by these associated metrics is important to estimate how appropriate our discrimination models are. We chose to split the information into separate schemas to facilitate the comprehension of such metrics (see Tables 1-3). From Table 2 we obtain sensitivity and specificity, as well as their counterparts: false negatives and false positives. These inform how well the a priori known samples are classified (producer's accuracy) and how badly the a priori known samples are misclassified (omission error or Type I error). Table 3 shows the positive and negative predictive values and their complements: inverse of the positive and negative predictive values. These report how well the models classify the actual samples (user's accuracy) and how badly the algorithms misinterpret them (commission error or Type II error).
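The metrics described above follow directly from the four cells of the confusion matrix, as the short Python sketch below shows. A and D are the diagonal cells named in the text; labelling the off-diagonal cells B (known oil classified as look-alike) and C (known look-alike classified as oil) is an assumption, and the counts are made up rather than taken from our results.

```python
def accuracy_metrics(a, b, c, d):
    """Percent metrics from a two-by-two confusion matrix.

    a: known oil classified as oil        b: known oil classified as look-alike
    c: known look-alike classified as oil d: known look-alike classified as look-alike
    """
    total = a + b + c + d
    return {
        "overall_accuracy": 100.0 * (a + d) / total,
        "sensitivity": 100.0 * a / (a + b),                # horizontal, oil row
        "specificity": 100.0 * d / (c + d),                # horizontal, look-alike row
        "positive_predictive_value": 100.0 * a / (a + c),  # vertical, oil column
        "negative_predictive_value": 100.0 * d / (b + d),  # vertical, look-alike column
    }

# Illustrative counts only.
m = accuracy_metrics(a=40, b=10, c=20, d=30)
print(m)
```

The false negative and false positive rates are the complements of sensitivity and specificity (100% minus each), and the inverses of the predictive values are obtained in the same way.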
Because we are exploring several attribute-domain combinations (Phase 2), we represent our accuracy assessment in a "condensed" two-by-two cross-tabulation form (Table 4). This discloses in a single table the main metrics shown in Table 2 (sensitivity and specificity) and in Table 3 (positive and negative predictive values), along with the overall accuracy. Table 4 also provides a simplified, comparable presentation of the across-subdivision accuracy results of the classification algorithms.

Table 2. "Horizontal" analysis of the confusion matrix shown in Table 1 with some of the supplementary measures used to evaluate our Linear Discriminant Analyses (LDAs).

                    | Classified as oil slicks | Classified as look-alikes | Total
Known oil slicks    | Sensitivity              | False negative            | 100%
Known look-alikes   | False positive           | Specificity               | 100%

Table 3. "Vertical" analysis of the confusion matrix shown in Table 1 with some of the associated metrics used to evaluate our Linear Discriminant Analyses (LDAs).

                    | Classified as oil slicks       | Classified as look-alikes
Known oil slicks    | Positive predictive value      | Inverse of the neg. pred. val.
Known look-alikes   | Inverse of the pos. pred. val. | Negative predictive value
All LDA targets     | 100%                           | 100%

Table 4. "Condensed" form of the confusion matrix shown in Table 1 used to assess the classification accuracy of our Linear Discriminant Analyses (LDAs). See also Tables 2 and 3.

QC-Standards
In the first QC-standard, we identified ten data records having some inconsistency, most likely from typos: eight oil slicks and two slick-alike targets. These instances were removed from subsequent analysis. Consequently, after completing this first QC, the database has 769 targets: 350 oil slicks (45.5%) and 419 look-alike slicks (54.5%) (Figure 2).
The second QC-standard considered the utility of the attribute types describing the identified targets. Accordingly, because the values of the SAR-signature and textural information were calculated and registered in uncalibrated DNs, these attributes are not explored further here. The use of DNs for an analysis of measurement time series may mask important relationships, which may become more apparent by using calibrated measurements [18]. The attributes of location are also not employed in this investigation, as we intend to develop an algorithm that can be applied anywhere, and such site-specific variables cannot be transferred from one region to another. In addition, scene-related attributes are not included. Furthermore, due to the binary character of the cloud data (1 or 0), this MetOc descriptor is not considered here. After the application of this second QC, several irrelevant attribute types have been discarded, leaving only two attribute types to be carried forward: size information (Section 2.2.2.1) and contextual MetOc conditions (Section 2.2.2.2).
The inspection of the correlation matrices, the third QC-standard, revealed that some size variables are inter-correlated: SHP (shape index) with CMP (compact index), and ASY (asymmetry) with LtoW (length-to-width ratio). The authors of [22,23] also observed in the seep-spill dataset that SHP and CMP had an equal but inverted frequency distribution. From these four attributes, only two, CMP and LtoW, are used due to their simplicity. Additionally, based on earlier results [24], we have included two other size variables: PtoA and FRA. Therefore, based on the available variables within the database (Section 2.2.2.1; [17]) and on the LDA legacy left by [18,21-24] on their seep-spill discrimination, a specific set of nine size variables is used as follows:
NUM: number of parts of each target.
The correlation matrices also confirmed inter-correlation among the three MetOc forms, i.e., the average values inside the polygons are correlated with the difference and ratio between the inside and outside of the polygons. As a result, only the more intuitive averaged values from inside the targets were retained: SST (sea surface temperature), CHL (chlorophyll-a), and WND (wind speed).
As such, the application of this third QC led to the initial data analyses using twelve descriptors: nine size attributes and three MetOc variables.

Attribute-Domain Subdivisions
The nine size variables determined by the QCs were initially analyzed together; these are named "All size information". They were then divided into subdivisions based on the earlier results of Carvalho [18,21] and Carvalho et al. [22][23][24] (Sections 1.1.2 and 1.1.3, respectively), as well as on the variables previously given in [17]; the latter is simply referred to as "Bentz" (Section 2. Additionally, all subdivisions were separately analyzed with and without the MetOc variables. As combinations in the attribute domain are analyzed with and without MetOc, as well as with the application of the three data transformations, there are 39 attribute subdivisions.

MetOc-Only
None      | @ | @ | @ | 3 out of 3
Cube root | @ | @ | @ | 3 out of 3
log10     | @ | @ | @ | 3 out of 3

A noteworthy characteristic of Figure 4 is that some variables are correlated (r > 0.3): CMP (compact index) with DEN (density), and PtoA (perimeter-to-area ratio) with CUR (curvature). From these four variables, two were selected based on their simplicity: CMP and PtoA. These relationships similarly occur in the other subdivisions. Additionally, as in Carvalho's seep-spill exercise, Area and perimeter (Per) are correlated here too, and of the two, we chose to retain Area. It is worth mentioning that in Carvalho, this pair of correlated morphological features had undergone a PCA before the values were input into their LDAs, i.e., PC scores were used instead of actual values.

Figure 4 (top panel: original data; middle panel: cube root) indicates that of twelve attributes, nine are deemed uncorrelated (+); therefore, these were selected for input to the LDA for this subdivision: Area, PtoA, CMP, FRA, LtoW, NUM, SST, CHL, and WND; see also Table 5. The three eliminated variables are marked with a dot: Per, DEN, and CUR. These three correlated variables are redundant for the purposes of using LDAs as they do not bring independent information.

A remarkable aspect of the log10 transformation (Figure 4: bottom panel) is that when it is applied, only ten variables are included in this subdivision, from which eight are selected: + or @. This is because FRA and CUR may have negative values and, thus, cannot be handled by this transformation; some subdivisions do not consider these two variables: Carvalho and MetOc-Only (Table 5).

Table 5 presents the variables selected with the UPGMA dendrograms for the 39 attribute subdivision domains. Four main aspects are apparent in this table:
• There is a considerable reduction in the attribute dimensionality in all combinations of attributes;

Dendrogram Visual Inspection
Notwithstanding the use of phenon lines, the visual analyses of our UPGMA dendrograms usually reveal that specific groups of variables are formed independent of data transformation; see Figure 4 (these are color-coded: purple, brown, and yellow). Nevertheless, these visually-combined variables should not be confused with those selected with the similarity lines: 0.3 > r > −0.3 (Table 5). In fact, such visual grouping of attributes is not critical to this analysis, but it comes to prominence because these color-groups show some unusual relationships among the attributes. The groups are:

Minor variations are observed in these groupings across the other attribute-domain combinations. These visually-identified groups of variables are linked to each other at levels close to zero similarity (r~0), meaning that there is almost no inter-group correlation (Figure 4).

Accuracy Assessment
Table 6 presents the classification accuracies of the 35 different LDA-based algorithms; these are ordered by the results of the associated statistical metrics shown in Table 4, i.e., overall accuracy (diagonal analysis of Table 1), sensitivity and specificity (horizontal analysis of Table 2, producer's accuracy), and positive and negative predictive values (vertical analysis of Table 3, user's accuracy). Because we have 769 targets, the discretization interval of our analyses is 0.13%, i.e., 1/769.

The best discrimination uses the Bentz (LtoW, DEN, and NUM) with Carvalho (Area) with MetOc (SST, CHL, and WND) with log10 attribute subdivision (Table 6). An overall discrimination accuracy of 83.7% is observed when these seven descriptors are analyzed together: 644 samples are correctly identified (316 oil slicks and 328 slick-alikes), with a sensitivity of 90.3%, a specificity of 78.3%, and good levels of positive (77.6%) and negative (90.6%) predictive values. On the other hand, the least accurate attribute subdivision is Bentz (DEN and NUM) without MetOc with log10 transformation (Table 6). The overall accuracy achieved when only these two attributes are used is as low as 67.8% (521 samples correctly identified: 248 oil slicks and 273 look-alikes), with a sensitivity of 70.9%, a specificity of 65.2%, and positive and negative predictive values of 62.9% and 72.8%, respectively.
Another notable characteristic observed in Table 6 is that four main hierarchy blocks are formed with similar attribute-domain combinations as a function of attribute types (i.e., size information with or without MetOc variables, as well as MetOc by itself):
• The top seventeen ranks from the subdivisions with MetOc;
• Eight ranks from the subdivisions without MetOc;
• The three MetOc-Only subdivisions, and another Carvalho subdivision (Area) with the three MetOc variables and no transformation (hierarchy #28 of Table 6); and
• The remaining six subdivisions without MetOc.
These results show the synergy that occurs whenever size variables are analyzed together with the MetOc information (1st hierarchy block of Table 6). Notably, some subdivisions that only account for the size variables without MetOc (2nd hierarchy block) outperform the sole use of the MetOc variables (3rd hierarchy block, i.e., MetOc-Only). Table 7 (top) presents the typical values of the hierarchy blocks: mean, maximum, minimum, and standard deviation. Again, the synergy of using size and MetOc simultaneously is observed in all given metrics. The averaged overall accuracies are 81.4%, 78.5%, 76.9%, and 71.2%, respectively, for the four blocks. Likewise, the other associated statistical measures also follow this top-down sequence.

Table 7 (bottom) reveals the absence of a direct benefit of applying non-linear transformations. In the top two blocks, all transformations are similarly represented (~30% each), and in the lower two blocks the original data account for 50% of each. Furthermore, Table 6 reveals no clear pattern in the ability of the LDA to discriminate between oil slicks and slick-alikes as a function of data transformation: both the best (83.7%) and worst (67.8%) overall accuracies are achieved with the same log10 transformation.
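The best and worst accuracies quoted above can be reproduced from the reported counts of correctly identified samples. The following is a minimal sketch, not the authors' software: the function names are hypothetical, oil slicks are assumed to be the positive class, and the misclassified counts are inferred by subtracting the correct counts from the class totals (350 oil slicks and 419 look-alikes).

```python
KNOWN_OIL = 350        # known oil slicks in the database
KNOWN_LOOKALIKE = 419  # known look-alike slicks in the database


def confusion_metrics(a, b, c, d):
    """Percent metrics for a 2x2 confusion matrix.

    a: known oil slicks classified as oil slicks
    b: known oil slicks classified as look-alikes (label assumed here)
    c: known look-alikes classified as oil slicks (label assumed here)
    d: known look-alikes classified as look-alikes
    """
    total = a + b + c + d
    return {
        "overall": 100.0 * (a + d) / total,
        "sensitivity": 100.0 * a / (a + b),
        "specificity": 100.0 * d / (c + d),
        "ppv": 100.0 * a / (a + c),
        "npv": 100.0 * d / (b + d),
    }


def from_correct_counts(correct_oil, correct_lookalike):
    """Build the metrics from the counts of correctly classified samples."""
    return confusion_metrics(
        correct_oil,
        KNOWN_OIL - correct_oil,
        KNOWN_LOOKALIKE - correct_lookalike,
        correct_lookalike,
    )


best = from_correct_counts(316, 328)   # most accurate subdivision
worst = from_correct_counts(248, 273)  # least accurate subdivision

for name, metrics in (("best", best), ("worst", worst)):
    print(name, {key: round(value, 1) for key, value in metrics.items()})

# One sample out of 769 corresponds to the discretization interval.
print("step (%):", round(100.0 / (KNOWN_OIL + KNOWN_LOOKALIKE), 2))
```

Rounded to one decimal place, both sets of values agree with the percentages reported in the text.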

Discussion
The knowledge gained from Carvalho [18,21] (Section 1.1.2) and Carvalho et al. [22][23][24] (Section 1.1.3) on the use of LDAs led us to apply such linear techniques in this study (Figure 3). A three-fold correspondence (similarities vs. differences) can be drawn between the earlier investigations and this study:
1. Distinct categories of targets can be analyzed: the earlier studies were directed at the classification of mineral oil slick products (oil seeps vs. oil spills), but here the focus is on differentiating two types of low radar backscatter signals (oil slicks vs. slick-alikes);
2. Different SAR co-polarizations can be exploited: their SAR-derived smooth texture polygons were digitally classified with VV-polarized, 16-bit scenes (RADARSAT-2), but the database in this study was derived from HH-polarized, 8-bit imagery (RADARSAT-1); and
3. Samples can come from different geographic places: the effective seep-spill discrimination was accomplished with oil slicks observed in the Gulf of Mexico, whereas here we analyzed targets from the offshore southeastern Brazilian coast (Figure 1).
Despite the success of linear discriminant multivariate analyses in these two domains, i.e., to separate oil from oil (e.g., [18]) and oil from look-alikes, one should bear in mind complementary non-linear machine learning models [54].
Additionally, there are three relevant aspects of the database used here:
• It includes interpretations by experts that have been supported by ancillary MetOc data [17]. The accuracy assessment of the LDA algorithms is compared to these human interpretations;
• This study used RADARSAT-1 data simply because a tabular database was available. The use of ship-based multi-band radars (e.g., X-/C-/S-band [55]) or a finer-resolution C-band SAR sensor (e.g., Sentinel-1s [56]) may result in more detailed analyses of small marine slicks; and
• The 402 scenes were sampled at about four images per week (between July 2001 and June 2003), thus registering the extremely high MetOc variability of the Campos Basin, and providing a large and quite well-balanced class distribution (Figure 2) of 350 petroleum pollution records (exploration and production oil, ship- and orphan-spills) versus 419 non-petroleum targets (biogenic films, algal blooms, upwelling, low wind, or rain cells). This sampling rate ensured that a wide range of conditions influencing the detection of oil slicks in SAR imagery (e.g., sea conditions, SAR noise floor, incidence angle, etc.; such aspects were not directly measured) was well represented.
As a result, this data representativeness ensures the database used is appropriate to train algorithms, thus supporting the investigation of a worldwide, economically relevant offshore region with major oil and gas resources, the Campos Basin, with known oil slick occurrence.
The QC standards guaranteed effective criteria to promote the discrimination between oil slicks and non-petroleum signals. Some attribute types (e.g., SAR-signature and textural information) were eliminated from this study because they were provided in uncalibrated DNs, which were not converted to backscatter coefficients (gamma-, beta-, or sigma-naught) given in amplitude or decibels [57]. Notwithstanding that Carvalho and Carvalho et al. showed the sole use of size information is sufficient to discriminate seeps from spills, their results were slightly improved when size and SAR descriptors were combined. Thus, the inclusion of SAR-signature and textural information given in terms of backscatter coefficients could imply further developments to our LDA discrimination process.
When Carvalho included site-specific attributes (latitude, longitude, and others), the discrimination was considerably improved to almost 100% accuracy. Here, location was not used as a parameter in the analysis so that a set of attributes and related algorithms could be derived suitable for application to signals in any area. However, in the development of an algorithm intended for a given region, the inclusion of location descriptors may be beneficial.
From the best 17 combinations of attributes (1st block in Table 6: size with MetOc variables), there is a difference in accuracy of 4.6% from the 1st to the 17th rank (644 − 608 = 36 correctly identified targets; Tables 6 and 7). This means that the analyses of fixed specific subdivision domains could possibly be further developed with a one-to-one attribute substitution, i.e., having as many subdivisions as the number of possible combinations of variables, thus measuring the individual relevance per attribute. With such a procedure, a finer sense of which attribute combination best discriminates oil slicks from petroleum-free targets could be derived.
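To give a sense of the scale of such an exhaustive search, the sketch below enumerates every non-empty subset of the twelve descriptors used in this study. It only counts candidate subdivisions; fitting and assessing an LDA for each subset is not shown, and the function name is hypothetical.

```python
from itertools import combinations

# The twelve descriptors carried forward after the QC-standards.
SIZE = ["Area", "Per", "PtoA", "CMP", "FRA", "LtoW", "DEN", "CUR", "NUM"]
METOC = ["SST", "CHL", "WND"]
ATTRIBUTES = SIZE + METOC


def attribute_subsets(attributes, min_size=1):
    """Yield every subset of the attributes with at least min_size members."""
    for k in range(min_size, len(attributes) + 1):
        yield from combinations(attributes, k)


# Each subset would be one candidate subdivision per data transformation.
n_subsets = sum(1 for _ in attribute_subsets(ATTRIBUTES))
print(n_subsets)  # 2**12 - 1 non-empty subsets
```

With twelve attributes this gives 4095 candidate subsets, compared with the fixed subdivisions analyzed here.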
Our classification results are independent of the data transformation, i.e., original data, cube root, and log10 (Tables 6 and 7). Nevertheless, other non-linear transformations may result in improvements in the LDA oil and look-alike discrimination with, for instance, reciprocal, square root, square power, or cube power. Carvalho et al. tested these transformations, along with cube root and log10, to find that the latter two achieved improved seep-spill discrimination.
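The three transformations compared in this study can be sketched as follows. This is an illustration under stated assumptions: the function names and the sample values are hypothetical, and only the attribute names come from the database. The log10 case returns no value for non-positive inputs, which is why attributes that may take negative values (FRA and CUR here) cannot be used with that transformation.

```python
import math


def cube_root(x):
    """Real cube root, defined for negative, zero, and positive values."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def log10_or_none(x):
    """Base-10 logarithm, or None where it is undefined (x <= 0)."""
    return math.log10(x) if x > 0 else None


TRANSFORMS = {
    "none": lambda x: x,
    "cube root": cube_root,
    "log10": log10_or_none,
}


def transform_column(values, name):
    """Transform one attribute; return None if any value cannot be transformed."""
    out = [TRANSFORMS[name](v) for v in values]
    return None if any(v is None for v in out) else out


# Hypothetical attribute values, for illustration only.
columns = {"Area": [0.5, 12.0, 140.0], "FRA": [-0.2, 0.1, 0.4]}

# Attributes that remain usable under each transformation.
usable = {
    name: [col for col, vals in columns.items()
           if transform_column(vals, name) is not None]
    for name in TRANSFORMS
}
print(usable)
```

In this toy case the attribute with a negative value is kept under "none" and "cube root" but dropped under "log10".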
To fulfill the LDA prerequisite of having the least correlation [46], our feature selection processes used UPGMA dendrograms with the similarity cut-off of 0.3 > r > −0.3 (Section 2.3.4: Phase 4). Nonetheless, visual inspections of dendrograms could be used instead (Section 3.3.1). In Figure 4 (any panel) three main groups of variables are formed with almost no inter-group correlation. These visually combined, uncorrelated groups of variables could be used to select one attribute from each group instead of using a fixed phenon line-for instance, in Figure 4, one could choose CHL from the purple group, CMP from the brown group, and SST from the yellow group. This would further trim the dimensionality, as instead of using nine variables out of the initial twelve (Table 5), only three attributes would be input into the LDA.
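The cut-off-based selection described above can be sketched in plain Python as follows. Several choices here are assumptions made for illustration and are not taken from the source: the similarity between two variables is the absolute Pearson correlation |r|, clusters are merged by average linkage (UPGMA) until the highest between-cluster similarity is no longer above 0.3, and the first member of each cluster is kept as its representative (the study instead chose on the basis of simplicity). The sample values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upgma_groups(columns, cutoff=0.3):
    """Group variables by average linkage on |r|, stopping at the cut-off."""
    names = list(columns)
    sim = {frozenset(pair): abs(pearson(columns[pair[0]], columns[pair[1]]))
           for pair in combinations(names, 2)}
    clusters = [[name] for name in names]

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean of all pairwise similarities between clusters.
        return (sum(sim[frozenset((a, b))] for a in c1 for b in c2)
                / (len(c1) * len(c2)))

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]))
        if avg_sim(clusters[i], clusters[j]) <= cutoff:
            break  # remaining clusters are deemed uncorrelated
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical values: Area and Per rise together, SST does not follow them.
columns = {
    "Area": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "Per": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2],
    "SST": [24.0, 22.0, 23.0, 23.0, 22.0, 24.0],
}
groups = upgma_groups(columns)
selected = [group[0] for group in groups]  # assumed rule: keep first member
print(groups, selected)
```

In this toy case Area and Per fall into one group and SST stays alone, so one size attribute and one MetOc attribute would be passed on to the LDA.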
It is noteworthy that the three least accurate combinations (Bentz without MetOc in all transformations) are those using the four most complex of the nine size variables, i.e., LtoW, DEN, CUR, and NUM (Table 6). While this last variable can be obtained simply by counting the number of parts of each low-backscatter SAR target, the other three require more complicated calculations than the other five size variables explored, i.e., Area and Per (Carvalho), along with PtoA, CMP, and FRA (Carvalho et al.). The latter three attributes are straightforward to derive from the first two, i.e., the most basic morphological characteristics of the polygons. This demonstrates that simple descriptors can result in successful oil and look-alike discrimination, as was also found by Carvalho and Carvalho et al. while discriminating seeps from spills.
The interplay between size and MetOc variables observed on the accuracy assessment results in four hierarchy blocks (Table 6). Table 7 shows that, on average, even the attribute-domain combinations of the least accurate hierarchy block upheld practical accuracies of about 70% in all of the metrics, meaning that they can still be considered useful algorithms.

Conclusions
The discrimination of two categories of low-backscatter regions derived from Synthetic Aperture Radar (SAR) measurements (i.e., mineral oil slicks and other environmental petroleum-free false targets: oil vs. look-alikes) has been demonstrated. These two low-backscatter categories have been distinguished with simple, parametric Linear Discriminant Analyses (LDAs) applied to a set of satellite measurements (microwave, infrared, and optical) from RADARSAT-1, AVHRR/NOAA, SeaWiFS/Orbiview-2, MODIS/Terra, and SeaWinds/QuikSCAT. The study region, the Campos Basin (Figure 1), is located off the southeast coast of Brazil, and our database consists of 769 samples of oil slicks (n = 350; 45.5%) and slick-alikes (n = 419; 54.5%) derived from 402 RADARSAT-1 scenes from July 2001 to June 2003 (Figure 2). The LDA algorithms were evaluated with a three-fold statistical metric: overall, producer's and user's accuracies (Tables 1-4). The investigation plan (Figure 3) involved the evaluation of 39 attribute subdivisions based on the knowledge gained from the earlier seep-spill discrimination findings of "Carvalho" [18,21] (Section 1.1.2), "Carvalho et al." [22][23][24] (Section 1.1.3),