1. Introduction
1.1. Background
Oil and gas (O&G), as cardinal pillars of the global energy system, play a crucial role in maintaining economic stability and fostering social progress. According to the International Energy Agency (IEA), global oil demand will still reach 106 million barrels per day by the end of this decade, despite the increasing use of renewable energy [
1]. Against this backdrop, rational development of O&G resources and efficient management of O&G facilities (
Figure 1) are essential for attaining sustainable development and tackling energy security challenges. Therefore, the precise and effective identification of O&G facilities becomes a critical imperative for the energy industry. Traditional methods for detecting such facilities rely heavily on field surveys and manual interpretation. Although these methods can achieve substantial accuracy, they suffer from drawbacks such as restricted coverage, high cost, and low efficiency, failing to fulfill the demands of large-scale O&G resource management.
Remote sensing satellite technology and artificial intelligence algorithms have steadily advanced in recent years. With their versatility, these cutting-edge technologies transcend industry barriers. They not only revolutionize detection methods in the O&G sector but also pave the way for new applications in related domains such as ship and building detection, demonstrating particularly outstanding potential. Building on this trend, Xiong et al. [
2] developed an object detection algorithm based on subtask attention. They split the object detection task into multiple subtasks by constructing subtask-specific attention modules, thereby improving feature extraction for small targets and detection performance in remote sensing images. Ramachandran et al. [
3] created a deep learning framework to identify O&G well pads and storage tanks in satellite imagery, employing RetinaNet with Residual Network 50 (ResNet-50) and EfficientNet-B3 in a two-stage process for precise well pad detection, and Faster R-CNN with Res2Net for robust storage tank identification. Validated across multiple basins, this approach demonstrated excellent performance and strong generalization capabilities. On a related front, Zhang et al. [
4] proposed the PARE-YOLO algorithm, redesigning the neck network of YOLOv8 and incorporating a lightweight detection head with multi-scale attention fusion, which significantly improved the robustness of detecting ground objects in complex aerial scenarios. Sun et al. [
5] put forward DKETFormer, an innovative approach that leveraged a Transformer backbone to encode global dependencies. Simultaneously, it employed the Cross-spatial Knowledge Extraction Module (CKEM) and Inter-layer Feature Transfer Module (IFTM) for discriminative feature extraction and transfer, far outperforming traditional CNN-based methods.
Although the studies above collectively demonstrate substantial advances in remote sensing object detection, they also reveal a persistent gap between general-purpose detection frameworks and the specialized requirements of O&G facility recognition. The integration of remote sensing imagery and deep learning showcases remarkable potential in O&G facility detection, yielding globally prominent application achievements. The extensive utilization of high-resolution satellite imagery, coupled with continuous advancements in deep learning algorithms, substantially bolsters detection accuracy and environmental adaptability [
6], enabling the effective identification of complex ground targets under diverse geomorphological and climatic conditions, thereby providing critical support for O&G infrastructure monitoring. Current research on O&G facility detection predominantly uses horizontal bounding-box object detectors, yet the inherent angular characteristics of facilities such as well sites make it difficult for such approaches to precisely capture their geometric configurations, often compromising localization accuracy. Furthermore, a pronounced imbalance exists in the quantities of various O&G facilities globally [
7,
8], skewing model training toward prevalent categories, undermining detection performance for less abundant or smaller targets. Hampered by these constraints, detection frameworks struggle to maintain robustness, and their functionality in comprehensive surveillance and environmental assessments is curtailed. In this context, recent transformer-based oriented detectors demonstrate strong capability in modeling global dependencies, but their reliance on data-hungry self-attention and high computational complexity limits their effectiveness under the challenging conditions of O&G facility detection. To address the aforementioned deficiencies, this paper proposes OGF Oriented R-CNN, a novel oriented detection model specifically tailored for O&G facilities in high-resolution remote sensing images.
The main contributions of this article are as follows:
Development of OGF Oriented R-CNN, a reliable detection model designed to accommodate the inherent class imbalance, scale disparity, and rotational variance in O&G facility detection, yielding accurate results in high-resolution remote sensing images.
Creation of a rotated-bounding-box annotated dataset for O&G facilities, serving as a benchmark for oriented object detection and enabling future research.
1.2. Related Work
1.2.1. Object Detection Algorithms Based on Deep Learning
Object detection in remote sensing images, by leveraging the powerful feature extraction and pattern recognition aptitudes of deep learning, markedly boosts the exactitude and efficiency of image analysis, particularly in handling high-dimensional and complex datasets. Research often distinguishes between horizontal bounding box detection, which is centered around identifying regular targets, and oriented bounding box detection, which is designed to handle the geometric complexities of oriented objects. Both propel the progress of remote sensing object detection, thus meeting a wide range of requirements.
Early investigations in this field predominantly utilized horizontal bounding box methods. Pei et al. [
9] introduced SGD-YOLOv5, a refined YOLOv5 model with a depth-to-space convolution module, a global attention mechanism, and a decoupled head to improve performance on public datasets. Proposed by Li et al. [
10], the two-stage Coarse-to-Fine Decoupling (CFD) R-CNN, characterized by its coarse-to-fine decoupling strategy, feature map upsampling, and high-resolution cropping, optimized small object detection efficiency while maintaining low computational overhead. Cao et al. [
11] proposed Remote Sensing Detection Transformer (RS-DETR) augmented by Global Attention Mechanism (GAM) and Scale-invariant Intersection over Union (SIoU) loss, which melds a Swin Transformer with a dual-branch design, refining small object precision and dense cluster localization in high-resolution remote sensing.
Unlike horizontal bounding box methods, which may falter with oriented targets, oriented object detection enhances localization by explicitly accounting for object orientation. Han et al. [
12] presented Rotation-equivariant Detector (ReDet) with a rotation-equivariant network and a Rotation-invariant RoI Align (RiRoI Align) module, achieving robust detection of arbitrarily oriented objects by extracting rotation-invariant features from equivariant representations in both spatial and orientation dimensions, particularly effective for ship detection in datasets like HRSC2016. Given CNNs’ limitations in handling orientation variations, Zheng et al. [
13] developed the instance-aware Spatial-Frequency Feature Fusion detector (SFFD) for oriented object detection in remote sensing, which innovatively combined a Layer-wise Frequency-domain Analysis (L-FDA) module with CNNs, followed by an instance-aware Cross-Feature Fusion (CFF) module, validated to substantially bolster detection performance. Zhao et al. [
14] developed OrientedFormer, an end-to-end transformer detector that incorporated specialized positional encoding and attention mechanisms to manage multi-directional objects in aerial imagery.
Drawing on recent anchor-free approaches to oriented object localization, Li et al. [
15] proposed Feature Augmentation and Alignment for Anchor-free Oriented Object Detection (FAA-Det), which employs a Feature Augmentation Module (FAM) and an Oriented Feature Alignment (OFA) module to enrich target representations and harmonize classification and regression tasks, adeptly managing dense and multi-scale remote sensing scenarios with excellent results.
1.2.2. Multi-Source High-Resolution Satellite Imagery for Object Detection
Multi-source high-resolution satellite imagery, distinguished by sub-meter resolution and a rich array of imaging modalities, constitutes the bedrock of advanced object detection in intricate geographic environments. The BeiJing-2 satellite capitalizes on temporal acuity to monitor evolving targets, the BeiJing-3 satellite deploys stereoscopic prowess to fine-tune the resolution of subtle features in challenging topographies, and the GaoFen-2 satellite sustains exceptional imaging fidelity for precise target delineation. This combination of multi-source data sharpens detection accuracy across diverse settings, thus impacting recent methodological advances. Song et al. [
16] amalgamated multi-temporal high-resolution imagery from Worldview-3 and GaoFen-2, complemented by Sentinel-2 data, to improve urban water body extraction and bolster resilience across seasonal and contextual shifts. Fang et al. [
17] synthesized a dataset including high-resolution optical imagery from Worldview-2, Worldview-3, and GaoFen-2 to develop Swin-HSTPS, which excelled in discerning traffic port stations through multi-scale feature integration.
1.2.3. Oil and Gas Facility Detection Using Remote Sensing
Underpinning both energy resource management and environmental oversight, the detection of O&G facilities necessitates a sophisticated integration of advanced remote sensing technologies. Remote sensing, by providing high-resolution, multi-temporal imagery across extensive and heterogeneous landscapes, enables precise localization and continuous monitoring of O&G facilities. By virtue of spectral and textural feature extraction from remote sensing imagery, machine learning methods attained remarkable success in early O&G facility detection. Aljameel et al. [
18] compared five machine learning algorithms for pipeline anomaly detection, with support vector machine achieving an accuracy of 97.43%, though limited by predefined features in complex remote sensing contexts. Although these machine learning methods laid the groundwork, their over-reliance on feature engineering and inability to handle high-dimensional data have driven the adoption of deep learning.
Deep learning harnesses powerful computational algorithms to address long-standing challenges, including small-target identification and environmental variability, satisfying the escalating need for precision and flexibility in O&G facility detection. He et al. [
19] refined the Mask R-CNN framework by using D-LinkNet as a backbone and implementing a semantic segmentation branch to unravel road-well links in multi-sensor imagery, thereby fortifying the reliability of oil well identification. In a related vein, Zhang et al. [
20] revamped YOLOv5 through the incorporation of instance segmentation, a context augmentation module, and normalized weighted distance, facilitating oil well detection in occluded high-resolution remote sensing images and attaining significant accuracy gains. Additionally, Guisiano et al. [
21] demonstrated the effectiveness of fine-tuning YOLOv8, Faster R-CNN, and DETR on high-resolution Permian Basin satellite imagery to map O&G infrastructure, leveraging pre-trained models to enhance detection across this infrastructure. Extending this focus to synthetic aperture radar (SAR), which excels in all-weather monitoring, Ma et al. [
22] devised an end-to-end model based on a Transformer for 3D oil tank detection from single SAR images. This model incorporated incidence-angle priors and a feature-description operator to improve precision in the presence of dense scattering centers. In a parallel effort, Wu et al. [
23] developed YOLOX-TR, fusing a Transformer encoder and reparameterized visual geometry group-like blocks into YOLOX, to tackle dense oil tank detection and classification in large-scale SAR images, effectively mitigating overlaps and geometric distortions.
2. Methods
2.1. Oriented R-CNN
Oriented R-CNN, a two-stage deep learning framework specifically designed for oriented object detection, is built upon the well-known Faster R-CNN framework [
24]. In the first stage, a rotated Region Proposal Network (RPN) deploys a lightweight, fully convolutional network to generate high-quality oriented proposals at almost no cost. In the second stage, the oriented R-CNN head leverages rotated Region of Interest (RoI) alignment to extract features from each oriented proposal, followed by classification and regression to determine object categories and optimize bounding box coordinates. This methodology, which is supported by a ResNet backbone, ensures the effective detection of arbitrarily positioned objects across a wide range of scales and orientations.
In comparison with existing oriented object detection algorithms, Oriented R-CNN demonstrates superior performance in key aspects by addressing critical limitations. Single-stage detectors, such as Rotated RetinaNet, often struggle with degraded angle regression accuracy, constrained by their reliance on dense anchor-based predictions with rigid configurations [
25]. Conversely, Oriented R-CNN’s two-stage architecture enables precise proposal refinement, effectively overcoming such constraints. Additionally, when benchmarked against Rotated Faster R-CNN variants, Oriented R-CNN substantially reduces misalignment errors stemming from axis-aligned feature pooling [
26], thereby achieving enhanced precision in localization and orientation estimation for oriented objects. These distinctive strengths position it as an optimal solution for tasks demanding accurate orientation estimation.
The adoption of Oriented R-CNN as the baseline model for this study is motivated by its exceptional compatibility with the unique challenges posed by our O&G facility dataset. Beyond the inherent orientation variability of the targets, the dataset introduces complexities through the sparse distribution of facilities and visually ambiguous backgrounds, such as sandy expanses interspersed with rocky formations. Oriented R-CNN effectively mitigates these issues by efficiently generating high-quality oriented proposals, which minimizes false positives in sparse regions, while its rotated RoI alignment mechanism ensures precise feature extraction despite background interference. Consequently, this framework excels in delivering robust detection performance, particularly tailored to the precise identification of O&G facilities.
This study recalibrates Oriented R-CNN for the demands of O&G facility detection, where accuracy and efficiency hinge on overcoming class imbalance, scale disparity, and rotational variance, as shown in
Figure 2. This work transcends conventional limitations by interweaving a bespoke loss function (O&G Loss Function), a discerning hard example mining mechanism (CAHEM), and a feature pyramid architecture augmented with attention-driven refinement (FPNFEA) (
Figure 3).
2.2. O&G Loss Function
The construction of efficacious loss functions constitutes a formidable challenge in the domain of oriented object detection, driven by the intricate complexities arising from sparse target distribution, disparate scale, and varied orientation, as shown in
Figure 2. Conventional methodologies, exemplified by standard IoU-based metrics, often exhibit limitations in simultaneously accommodating these multifaceted issues, thereby yielding suboptimal performance in contexts such as O&G facility detection across diverse environmental settings. To address these inadequacies, the O&G Loss Function is proposed (
Figure 4), meticulously engineered to augment detection efficacy by addressing the three above-mentioned problems. The O&G Loss Function (L_OG) integrates a class-weighted GIoU loss (L_GIoU), a scale-aware regression loss (L_scale), and an orientation-adjusted angular loss (L_angle), formulated as Equation (1):

L_OG = λ1 · L_GIoU + λ2 · L_scale + λ3 · L_angle    (1)

where λ1, λ2, and λ3 are weighting coefficients tuned to balance the contributions of class distribution, scale variability, and orientation alignment, respectively.
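As a minimal sketch, Equation (1) reduces to a weighted sum of the three component losses. The default lambda values below are illustrative placeholders, not the coefficients tuned in this study:

```python
def og_loss(l_giou, l_scale, l_angle, lam1=1.0, lam2=1.0, lam3=1.0):
    """Total O&G loss as the weighted sum of Equation (1).

    lam1/lam2/lam3 balance class distribution, scale variability, and
    orientation alignment; the values here are placeholders.
    """
    return lam1 * l_giou + lam2 * l_scale + lam3 * l_angle
```

In training, the three component losses are computed per batch and combined this way before backpropagation.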
2.2.1. Class-Weighted GIoU Loss Function
A striking long-tail distribution becomes apparent when examining the global deployment of O&G facilities. Rare entities like drillings are vastly outnumbered by prevalent targets such as well sites, engendering an intrinsic class imbalance that undermines detection performance [
7,
8]. Traditional IoU losses, by indiscriminately weighting all samples, overlook the imperative to elevate the priority of these underrepresented classes, thereby compromising recall for rare facilities. Moreover, they fail to furnish optimization gradients in cases where the predicted and target boxes do not overlap. To circumvent these obstacles, GIoU is enlisted, leveraging a convex hull penalty to enable gradient-driven refinement even absent overlap [
27]. This study introduces a class-weighted variant of the GIoU loss function, formally defined as follows:

L_GIoU = w_c · (1 − GIoU)    (2)

where GIoU is expressed as follows:

GIoU = IoU − |C \ (B_p ∪ B_g)| / |C|    (3)

where IoU and the convex hull penalty are calculated as Equations (4) and (5):

IoU = |B_p ∩ B_g| / |B_p ∪ B_g|    (4)

|C \ (B_p ∪ B_g)| / |C| = (|C| − |B_p ∪ B_g|) / |C|    (5)

Here, w_c denotes the class-specific weighting factor. IoU measures the overlap between the predicted (B_p) and the ground truth (B_g) rotated bounding box. The term |C \ (B_p ∪ B_g)| / |C| represents the convex hull penalty that facilitates gradient computation in non-overlapping cases. C indicates the smallest convex region enclosing both boxes, and |·| signifies the area.
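The penalty structure can be illustrated with a simplified sketch. Note that the paper applies GIoU to rotated boxes, which requires polygon intersection and convex hulls; the axis-aligned version below captures the same enclosing-region penalty with much simpler geometry:

```python
def giou_axis_aligned(box_p, box_g):
    """GIoU for axis-aligned boxes given as (x1, y1, x2, y2).

    A simplified illustration: the rotated-box variant used in the paper
    replaces these min/max operations with polygon clipping and a true
    convex hull, but the IoU-minus-penalty structure is identical.
    """
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    union = area_p + area_g - inter
    iou = inter / union
    # Smallest enclosing box C; its excess area over the union is the penalty.
    cx1, cy1 = min(box_p[0], box_g[0]), min(box_p[1], box_g[1])
    cx2, cy2 = max(box_p[2], box_g[2]), max(box_p[3], box_g[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)
    return iou - (area_c - union) / area_c

def class_weighted_giou_loss(box_p, box_g, class_weight):
    # Rare classes receive a larger class_weight, amplifying their gradient.
    return class_weight * (1.0 - giou_axis_aligned(box_p, box_g))
```

For disjoint boxes the penalty term stays non-zero, so the loss still yields a gradient that pulls the prediction toward the target.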
2.2.2. Scale-Aware Regression Loss Function
The heterogeneous morphological profiles of O&G facilities give rise to scale disparities that compromise the efficacy of standard regression approaches [
28]. Traditional loss functions based on absolute coordinate deviations are insufficient for addressing the diverse variations in scale, thereby limiting their capacity for optimized calibration. This constraint motivates the introduction of a scale-aware weighting factor, as shown in Equation (6), which is specifically designed to enhance regression stability through dynamic calibration in accordance with object area:

L_scale = w_c · s_i · L_reg    (6)

where w_c is the class weight, and s_i is the scale-aware weight, computed as follows:

s_i = Ā / (A_i + ε)    (7)

where A_i = w_i · h_i represents the ground truth bounding box area (with width w_i and height h_i), Ā denotes the mean area across all ground truth bounding boxes, and ε is a small constant to prevent division by zero. The regression term L_reg = Σ_{j=1}^{n} |p_j − g_j| is defined as the sum of absolute coordinate deviations, where n represents the number of parameters defining the bounding box, and p_j and g_j denote the predicted and ground truth values of the j-th parameter, respectively.
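A compact sketch of the scale-aware weighting, assuming the weight is the batch-mean area divided by each box's own area, so that small targets (area below the mean) are up-weighted:

```python
import numpy as np

def scale_aware_weights(areas, eps=1e-6):
    """Scale-aware weight per box: mean area over own area (plus eps).

    Boxes smaller than the batch mean get weights above 1, boosting
    their contribution to the regression loss.
    """
    areas = np.asarray(areas, dtype=float)
    return areas.mean() / (areas + eps)

def scale_aware_regression_loss(pred, gt, class_weight, scale_weight):
    # L1-style regression over the box parameters (e.g. cx, cy, w, h, angle),
    # modulated by the class and scale factors as in Equation (6).
    l_reg = np.abs(np.asarray(pred, float) - np.asarray(gt, float)).sum()
    return class_weight * scale_weight * l_reg
```

A box with half the mean area thus contributes roughly twice the regression penalty of an average-sized one.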
2.2.3. Orientation-Adjusted Angular Loss Function
O&G facilities, unlike objects with fixed orientations, adopt random angular configurations dictated by diverse terrains like deserts, posing a serious obstacle to orientation alignment precision in oriented object detection [
29]. Standard loss functions, primarily intended for horizontal bounding boxes, struggle to capture the angular deviations of oriented bounding boxes, leading to suboptimal localization performance in complicated situations. To counter such a deficiency, this study proposes an orientation-adjusted weighting factor o_i to adjust the regression penalty based on the angular difference between predicted and ground truth orientations. The orientation-adjusted regression loss is formulated as follows:

L_angle = w_c · o_i · L_reg,  o_i = 1 + γ · Δθ_i    (8)

where w_c is the class weight, consistent with the scale-aware component, and L_reg is the regression term, both of which are defined in Section 2.2.2. θ̂_i and θ_i represent the predicted and ground truth rotation angles of the i-th bounding box, respectively. Δθ_i = |θ̂_i − θ_i| measures the angular deviation. γ is a hyperparameter controlling the influence of orientation discrepancy.
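A sketch of the orientation-adjusted weighting, under the assumption that the weight grows linearly with the angular deviation, so poorly oriented predictions are penalized more aggressively:

```python
import math

def orientation_adjusted_loss(l_reg, theta_p, theta_g, class_weight, gamma=1.0):
    """Regression loss scaled by an orientation-dependent weight.

    The linear form 1 + gamma * deviation is an illustrative assumption;
    angles are assumed to follow the le90 convention, so the deviation
    is wrapped into [0, pi/2].
    """
    delta = abs(theta_p - theta_g)
    delta = min(delta, math.pi - delta)  # wrap around the angle period
    o_i = 1.0 + gamma * delta
    return class_weight * o_i * l_reg
```

With gamma = 0 the term collapses to the plain class-weighted regression loss, which makes gamma a convenient knob for ablating the orientation component.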
2.3. CAHEM
CAHEM is introduced in this study to proficiently address the issues of categorical imbalance and orientational variability in oriented object detection for O&G facilities (
Figure 4). This section elaborates on this novel methodology, which employs a class-differentiated weighting architecture and an intricate difficulty-scoring framework to rectify the imbalance intrinsic to O&G datasets [
30]. Through this integration, CAHEM establishes a discerning strategy that prioritizes the enhancement of underrepresented classes while adeptly tackling orientation-divergent targets.
CAHEM quantifies the complexity of each positive sample through a constructed difficulty index, which synthesizes IoU, classification fidelity, and angular divergence. The index (D_i) is formulated as follows:

D_i = (1 − IoU_i) + α · (1 − p_i) + β · Δθ_i    (9)

where IoU_i gauges the congruence between the predicted and ground-truth rotated bounding boxes. p_i represents the peak softmax probability for the assigned category. Δθ_i captures the angular deviation. α and β are calibration parameters. The inclusion of Δθ_i ensures acute sensitivity to rotational discrepancies, a pivotal element for achieving robust detection in O&G contexts.
The curation of arduous samples is organized by a weighted top-k selection regimen, which ranks samples based on their difficulty scores adjusted by class-specific weights and selects the top k samples, where k is a fraction of total positive samples. This process is reinforced by weights derived from dataset frequency distributions, assigning lower values to more frequent classes and higher values to less frequent classes to emphasize underrepresented categories. The refined difficulty score (D̃_i) is subsequently articulated as follows:

D̃_i = w_c · D_i    (10)

where w_c denotes a class-specific weight derived from dataset statistics.
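The difficulty scoring and weighted top-k selection described above can be sketched as follows; the alpha, beta, and k-fraction defaults are illustrative, not the calibrated values:

```python
def difficulty_scores(ious, probs, angle_devs, alpha=0.5, beta=0.5):
    """Per-sample difficulty index combining localization quality (IoU),
    classification fidelity (peak softmax probability) and angular
    deviation. alpha and beta are placeholder calibration parameters."""
    return [(1 - iou) + alpha * (1 - p) + beta * d
            for iou, p, d in zip(ious, probs, angle_devs)]

def cahem_select(ious, probs, angle_devs, class_ids, class_weights, k_frac=0.7):
    """Refine each difficulty score with its class weight and keep the
    hardest top-k fraction of positive samples (indices returned)."""
    base = difficulty_scores(ious, probs, angle_devs)
    refined = [class_weights[c] * d for c, d in zip(class_ids, base)]
    k = max(1, int(k_frac * len(refined)))
    order = sorted(range(len(refined)), key=lambda i: refined[i], reverse=True)
    return order[:k]
```

Because rare classes carry larger class weights, their samples are ranked harder and survive the top-k cut more often, which is exactly the rebalancing effect CAHEM targets.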
CAHEM provides a structured approach to strengthen O&G detection by strategically amplifying the influence of challenging samples. Its class-aware weighting and selection mechanism not only bridge the gap between imbalanced classes but also refine the model’s capacity to navigate the multifaceted demands of O&G terrains.
2.4. FPNFEA
The rugged terrain and intricate layout of O&G facilities engender considerable obstacles for oriented object detection, as targets like well sites exhibit wide-ranging scales and arbitrary orientations. Such variability often overwhelms conventional detection models, which struggle to balance multi-scale feature extraction with orientation awareness. This section presents FPNFEA, a method crafted to enhance multi-scale feature representations while amplifying responsiveness to rotational subtleties.
The FPNFEA framework extends the standard FPN by incorporating the FEA module [
31]. It retains the core FPN topology, generating a hierarchical feature pyramid from backbone outputs {C2, C3, C4, C5} through lateral convolutions (kernel size 3 × 3 to standardize channels to 256) and top-down fusion to produce feature maps {P2, P3, P4, P5}. To better accommodate O&G targets of varied scales, two targeted modifications are introduced. First, the lateral convolution kernel size is increased to 5 × 5, expanding the receptive field. Second, a lightweight FEA module is incorporated at each pyramid level, as shown in
Figure 5.
Central to the FPNFEA architecture is the FEA module, which employs channel-wise attention to selectively accentuate pivotal features. For each feature map P_l, global average pooling condenses the spatial dimensions to 1 × 1 × C, producing a channel-wise descriptor. This descriptor is processed through two convolutional layers with a channel reduction factor of 2, followed by a ReLU activation and a sigmoid function, to compute attention weights. The process can be formalized as follows:

A = σ(W_2 · ReLU(W_1 · GAP(P_l)))    (11)

where A denotes the attention weights. GAP represents global average pooling. W_1 and W_2 are convolutional layers with a channel reduction factor of 2. ReLU is the activation function, and σ is the sigmoid function. The refined feature map is then computed using the attention weights as follows:

P̃_l = P_l ⊙ Ã    (12)

where P̃_l is the enhanced feature map. P_l is the original feature map. Ã represents the attention weights broadcast across spatial dimensions. ⊙ denotes element-wise multiplication. Drawing inspiration from seminal attention mechanisms [
32], this approach focuses on features critical for discerning rotated objects within the cluttered O&G milieu. Through the synergistic integration of multi-scale feature extraction and attention-driven refinement, the FPNFEA framework adeptly localizes and classifies a spectrum of O&G targets.
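A minimal numpy sketch of the FEA channel-attention path (pool, squeeze, excite, re-weight); the weight matrices stand in for the two learned 1 × 1 convolutions and are supplied by the caller here rather than trained:

```python
import numpy as np

def fea_module(feat, w1, b1, w2, b2):
    """Channel-wise attention refinement of one (C, H, W) feature map.

    w1 has shape (C//2, C) and w2 shape (C, C//2), emulating the two 1x1
    convolutions with a channel reduction factor of 2; in the network
    these weights are learned.
    """
    # Global average pooling: (C, H, W) -> channel descriptor (C,)
    desc = feat.mean(axis=(1, 2))
    hidden = np.maximum(0.0, w1 @ desc + b1)            # squeeze + ReLU
    attn = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))    # excite + sigmoid
    # Broadcast the per-channel weights over the spatial dimensions.
    return feat * attn[:, None, None]
```

Because the attention vector is computed from a single pooled descriptor per level, the module adds negligible compute relative to the pyramid convolutions themselves.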
3. Experimental Results
3.1. Dataset
This study designated the Tarim Basin in the Xinjiang Uygur Autonomous Region, China, as the research area, a region esteemed as one of China’s principal O&G resource reservoirs (
Figure 6). Its arid desert terrain, marked by sparse vegetation and negligible cloud cover, creates ideal conditions for acquiring high-resolution satellite imagery. This region’s complex topography, featuring aeolian dunes, rocky outcrops, and diffusely distributed related facilities, demands exceptional detection precision, rendering the basin an exemplary venue for identification tailored to O&G infrastructure.
To support this research, we employed high-resolution satellite images from multiple sources, including the BeiJing-2, BeiJing-3, and GaoFen-2 satellites (
Figure 7). The dataset comprises 3039 images, each containing 1 to 7 O&G facilities. The images in the dataset are 1024 × 1024 pixels with 0.8 m spatial resolution per pixel. Domain specialists annotated images methodically using the open-source program roLabelImg. For each labeled image, an XML file in the PASCAL VOC format was generated, containing the image size and bounding box coordinates (center, width, height, angle), adhering to the le90 angle convention. The dataset encompassed three categories: well sites (3006 instances), industrial and mining lands (692 instances), and drillings (244 instances).
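Reading such annotations can be sketched with the standard library alone. The `<robndbox>` element layout (cx, cy, w, h, angle) assumed below is roLabelImg's default rotated-box format; field names may differ for other exports:

```python
import xml.etree.ElementTree as ET

def parse_rolabelimg(xml_string):
    """Parse roLabelImg-style PASCAL VOC annotations with rotated boxes.

    Returns a list of dicts with the label and the rotated-box parameters
    (center, width, height, angle) for each annotated object.
    """
    root = ET.fromstring(xml_string)
    boxes = []
    for obj in root.iter("object"):
        rb = obj.find("robndbox")
        boxes.append({
            "label": obj.findtext("name"),
            "cx": float(rb.findtext("cx")), "cy": float(rb.findtext("cy")),
            "w": float(rb.findtext("w")), "h": float(rb.findtext("h")),
            "angle": float(rb.findtext("angle")),
        })
    return boxes
```

In practice the parsed angles would then be normalized to the le90 convention expected by the detector.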
The dataset was split into training, validation, and test sets in a 70:15:15 ratio using a stratified approach at the image level. To preserve the original class distribution as closely as possible, we first assigned each image to a group based on its dominant class. Within each group, images were then randomly divided using stratified shuffling, maintaining the overall proportion of dominant classes across the three splits. This strategy ensures that the severe class imbalance observed in the full dataset is consistently reflected in the training, validation, and test sets. Consequently, the training and validation sets together contained all 2553 well sites, 568 industrial and mining lands, and 204 drillings, while the test set included 453 well sites, 124 industrial and mining lands, and 40 drillings. This distribution preserves the challenging long-tail characteristics across all splits, providing a rigorous evaluation of model performance, particularly on minority classes.
3.2. Evaluation Metrics
Six metrics, namely precision, recall, F1, AP50, mAP, and number of parameters, were selected as key metrics in this study to evaluate the performance of the deep learning models. Precision measures the proportion of correctly predicted O&G facilities among all instances predicted as positive, while recall indicates the proportion of actual O&G facilities that are correctly detected by the model. F1, as the harmonic mean of precision and recall, provides a balanced evaluation of the model's ability to reduce both false positives and false negatives. AP is calculated as the area under the precision-recall curve. AP50 denotes the average precision when the IoU threshold is fixed at 0.50, offering a practical indicator of localization accuracy under moderate overlap requirements. mAP is the mean AP across all classes and serves as the primary performance criterion. The number of parameters (Params) quantifies model complexity in millions, providing insight into computational efficiency and deployment feasibility. These metrics are calculated by the following equations:

Precision = TP / (TP + FP)    (13)

Recall = TP / (TP + FN)    (14)

F1 = 2 · Precision · Recall / (Precision + Recall)    (15)
The computation of the above metrics is calculated based on the confusion matrix, where true positives (TP) are correctly detected O&G facilities, false positives (FP) are background regions incorrectly classified as facilities, false negatives (FN) are missed O&G facilities, and true negatives (TN) are correctly identified background regions.
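These formulas translate directly into code. The sketch below covers the counting-based metrics; AP and mAP additionally require ranking detections by confidence and integrating the precision-recall curve:

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts.

    Guards against empty denominators so the function is safe for
    classes with no predictions or no ground-truth instances.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Note that TN plays no role in these detection metrics, since background regions vastly outnumber facilities and would make any TN-based score uninformative.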
3.3. Implementation Details
All experiments were conducted on a system with an NVIDIA GeForce RTX 4080 SUPER GPU (16 GB) and CUDA 11.8. The experimental setup adopted an image patch size of 1024 × 1024 pixels and a batch size of 2. All models were trained for 50 epochs with a learning rate of 0.005.
Hyperparameters specific to the proposed modules were selected via grid search on the validation set to maximize mAP. For the O&G Loss Function, the weighting coefficients λ1, λ2, and λ3 were searched in the range [0.1, 2.0] with step 0.1. In CAHEM, the difficulty scoring coefficients α and β were tuned over the same range, while the top-k fraction was selected as 0.7 from [0.5, 0.9]. For FPNFEA, the channel reduction factor in the attention bottleneck was fixed at 2, following standard squeeze-and-excitation designs. The settings that yielded the highest validation mAP were adopted for final evaluation.
3.4. Results
This section evaluates the performance of OGF Oriented R-CNN for detecting O&G facilities, employing precision, recall, F1, AP50, mAP, and number of parameters as principal metrics. The research elucidates the contributions of the O&G Loss Function, CAHEM, and FPNFEA, particularly in resolving the distinctive challenges posed by O&G facility detection through systematic comparisons with state-of-the-art (SOTA) models and a comprehensive ablation study.
3.4.1. Comparative Study
OGF Oriented R-CNN is benchmarked against seven SOTA models: Faster R-CNN [
33], Gliding Vertex [
34], H2RBox-v2 [
35], Oriented R-CNN, OrientedFormer [
14], RoI Transformer [
26], and S2A-Net [
36]. These models span leading axis-aligned and rotation-aware designs, offering a stringent comparison amid extreme class imbalance and geometric complexity.
As detailed in
Table 1, OGF Oriented R-CNN attains the highest mAP of 82.9%, outperforming the baseline model (Oriented R-CNN) by 10.5 pp and other SOTA models by up to 27.6 pp, establishing unequivocal superiority in overall detection performance. It achieves the leading AP50 for well sites as well as industrial and mining lands, and records the highest precision and F1 in industrial and mining land and drilling. Faster R-CNN leads in well site precision and drilling AP50, yet exhibits lower industrial and mining land precision and an overall mAP of 79.7%. OrientedFormer exhibits the highest recall across all classes, reflecting its tendency to generate a large number of proposals that capture nearly all instances. However, this comes at the expense of the lowest precision among all evaluated models, resulting in a significantly reduced overall mAP. H2RBox-v2 has the lowest number of parameters, indicating high computational efficiency, yet this lightweight design is accompanied by markedly lower precision across all classes, leading to an mAP of 69.8%. The remaining rotation-aware models, including Gliding Vertex, RoI Transformer, and S2A-Net, deliver mAP ranging from 55.3% to 76.1%, with consistently inferior AP50 in underrepresented classes. By contrast, OGF Oriented R-CNN sustains near-saturation recall in the dominant well site category while achieving substantial and consistent gains in precision, F1, and AP50 across industrial and mining land and drilling.
3.4.2. Ablation Study
An ablation study is conducted to clarify the contributions of the proposed components: (1) O&G Loss Function, (2) CAHEM, (3) FPNFEA. The performance of the baseline model (Oriented R-CNN) is reported in
Table 1. As evidenced in
Table 2, the O&G Loss Function markedly enhances precision and F1 across all categories while maintaining stable recall. The subsequent addition of CAHEM brings further improvement, particularly in the drilling class. OGF Oriented R-CNN augmented with FPNFEA achieves the best mAP, with the most significant advancement observed in the industrial and mining land class. When added individually, the O&G Loss Function yields +4.1 pp, CAHEM +3.8 pp, and FPNFEA +0.6 pp, with the full combination (+10.5 pp) exceeding the sum of the individual gains and thereby demonstrating synergistic benefits.
Although partial models exhibit isolated superior metrics, these are offset by lower overall mAP and suboptimal results in other categories, rendering them inferior to OGF Oriented R-CNN. For instance, Oriented R-CNN with CAHEM achieves a better well site AP50, yet it falls short in overall mAP. Similarly, Oriented R-CNN with CAHEM and FPNFEA demonstrates superior well-site recall, although it is limited by an mAP of 77.3% and diminished AP50 in industrial and mining land and drilling. Furthermore, precision and F1 for the drilling class are slightly lower in OGF Oriented R-CNN compared to the version without FPNFEA, but these gaps are minimal. OGF Oriented R-CNN, however, delivers a much higher AP50 for industrial and mining land and an overall mAP gain of +2.8 pp, which confirms its superior localization accuracy and consistent improvement across different O&G facility types.
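The imbalance the O&G Loss Function compensates for comes directly from the dataset statistics (3006 well sites, 692 industrial and mining lands, 244 drillings; Figure 2a). As an illustrative sketch only, with a hypothetical function name and not the paper's exact formulation, inverse-frequency class weights of the kind commonly used to restore minority-class gradient contribution could be computed as:

```python
def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count), so that
    rarer classes receive proportionally larger loss contributions."""
    total = sum(counts.values())
    k = len(counts)
    return {cls: total / (k * n) for cls, n in counts.items()}

# Instance counts from the dataset statistics (Figure 2a)
counts = {"well_site": 3006, "IM_land": 692, "drilling": 244}
weights = inverse_frequency_weights(counts)
# drilling, the rarest class, receives the largest weight
```

Under this assumed scheme, the drilling class would be up-weighted roughly 12× relative to well sites, in line with the observed recovery of minority-class precision.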
3.4.3. Qualitative Evaluation
Beyond quantitative metrics, a qualitative examination reveals how OGF Oriented R-CNN performs in practical scenarios. Here, we evaluate detection performance by comparing OGF Oriented R-CNN against different methods on representative O&G facility scenes and assessing its capability to generate precise localizations, handle rotational and scale variations, and mitigate redundancy, especially for minority classes.
Faster R-CNN generates multiple overlapping rotated bounding boxes for the same target, as illustrated in
Figure 8b, culminating in over-detection of the minority class and, in some instances, complete failure to localize the drilling. These shortcomings directly explain its lower metrics on the minority class in
Table 1. OGF Oriented R-CNN, on the other hand, detects the drilling clearly and represents each industrial and mining land with a single compact rotated bounding box (
Figure 8c). Although a small degree of redundancy remains for well sites, the predictions are overall far more orderly and faithful to object structure.
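The duplicate suppression discussed above hinges on measuring overlap between rotated boxes. The following is a generic pure-Python sketch of rotated IoU (via polygon clipping) and greedy NMS, not the paper's implementation; production detectors such as Oriented R-CNN use optimized GPU operators. Boxes are (cx, cy, w, h, θ) with θ in radians:

```python
import math

def corners(cx, cy, w, h, theta):
    # Counter-clockwise corner points of a rotated rectangle.
    c, s = math.cos(theta), math.sin(theta)
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c)
            for dx, dy in ((-w/2, -h/2), (w/2, -h/2), (w/2, h/2), (-w/2, h/2))]

def poly_area(pts):
    # Shoelace formula.
    n = len(pts)
    return 0.5 * abs(sum(pts[i][0] * pts[(i + 1) % n][1] -
                         pts[(i + 1) % n][0] * pts[i][1] for i in range(n)))

def clip(poly, a, b):
    # Sutherland-Hodgman: keep the part of poly left of directed edge a->b.
    def inside(p):
        return (b[0]-a[0]) * (p[1]-a[1]) - (b[1]-a[1]) * (p[0]-a[0]) >= 0
    def cross_pt(p, q):
        d1x, d1y = a[0]-b[0], a[1]-b[1]
        d2x, d2y = p[0]-q[0], p[1]-q[1]
        n1 = a[0]*b[1] - a[1]*b[0]
        n2 = p[0]*q[1] - p[1]*q[0]
        d = d1x*d2y - d1y*d2x
        return ((n1*d2x - n2*d1x) / d, (n1*d2y - n2*d1y) / d)
    out = []
    for i in range(len(poly)):
        cur, prev = poly[i], poly[i - 1]
        if inside(cur):
            if not inside(prev):
                out.append(cross_pt(prev, cur))
            out.append(cur)
        elif inside(prev):
            out.append(cross_pt(prev, cur))
    return out

def rotated_iou(b1, b2):
    # Intersection area by clipping b1's rectangle against b2's edges.
    inter, q = corners(*b1), corners(*b2)
    for i in range(4):
        if not inter:
            break
        inter = clip(inter, q[i], q[(i + 1) % 4])
    ia = poly_area(inter) if len(inter) >= 3 else 0.0
    return ia / (b1[2] * b1[3] + b2[2] * b2[3] - ia)

def rotated_nms(boxes, scores, iou_thr=0.5):
    # Greedily keep the highest-scoring box, drop overlapping rivals.
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if rotated_iou(boxes[i], boxes[j]) < iou_thr]
    return keep
```

A rotated representation tightens the suppression criterion: two boxes over the same facility overlap heavily regardless of their angles, so duplicates like those in Figure 8b are removed rather than retained.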
Figure 9 traces the evolution of detection quality, from the baseline model to OGF Oriented R-CNN, across three hallmark O&G facility types. The baseline model detects most targets but generates numerous redundant and overlapping boxes (
Figure 9b). With O&G Loss, redundancy drops sharply, yet localization remains imprecise in the well site and drilling examples, and the industrial and mining land in the second column vanishes entirely (
Figure 9c). Adding CAHEM sharpens angular alignment and further culls duplicates, though a minor false positive appears in the industrial and mining land scene, indicating incomplete discrimination from background clutter (
Figure 9d). OGF Oriented R-CNN largely resolves these issues, producing cleaner detection results free of substantial redundancy, omissions, or false positives (
Figure 9e). Even so, slight angular inaccuracies are still observable in a small number of drilling instances under extreme rotation, which suggests that complete regression convergence remains a work in progress. Taken together, the improvements align with the gains in
Table 2, affirming that OGF Oriented R-CNN represents a notable leap forward in managing class imbalance, scale variation, and rotational complexity, thereby improving robustness for O&G facility monitoring.
The detection results in
Figure 10 display how OGF Oriented R-CNN performs in scenarios representative of O&G facilities’ key challenges (class imbalance, scale disparity, and rotational variance), maintaining stable prediction quality across varying conditions. Two well sites appear with pronounced orientation differences and partial obstruction from the surrounding terrain in
Figure 10d. OGF Oriented R-CNN captures each facility using rotated bounding boxes that closely match its true directional layout, indicating reliable handling of substantial angular variation.
Figure 10e includes a single drilling instance, the least represented category in the dataset. Despite its scarcity and the low-contrast desert background, OGF Oriented R-CNN produces a clear, correctly oriented detection, showing its ability to recognize minority-class objects even in visually homogeneous environments. The larger well sites are outlined with well-shaped rotated bounding boxes, whereas the smaller industrial and mining lands are marked with equally coherent boundaries.
Figure 10f serves as an example of the model’s performance when targets with substantial scale differences are present within a single scene.
4. Discussion
The analysis revealed that dataset features fundamentally governed detector performance in oriented object detection. Surprisingly, the axis-aligned Faster R-CNN achieved a higher mAP than most rotation-aware models, suggesting that rotational sophistication alone is insufficient to overcome dataset bias when well-site instances dominate. Similarly, RoI Transformer and S2A-Net exhibited lower performance on minority classes, indicating that additional rotational flexibility can exacerbate false positives when class imbalance is not explicitly addressed.
The outstanding performance of OGF Oriented R-CNN against seven SOTA models provided clear evidence of its efficacy in this challenging domain. Certain baselines exhibited localized advantages in specific metrics. Faster R-CNN leveraged its robust axis-aligned localization to achieve high precision for the dominant well-site class. However, it encountered difficulties with industrial and mining lands, resulting in reduced accuracy for minority classes. OrientedFormer employed an aggressive proposal-generation strategy that captured nearly all instances, yielding the highest recall; this heightened sensitivity, however, produced the lowest precision among all models, underscoring the challenge of preserving specificity in severely imbalanced scenarios. H2RBox-v2 adopted a lightweight architecture that minimizes parameters and thus offers superior computational efficiency, but this compactness compromised detection accuracy across all classes. The other rotation-aware models, including Gliding Vertex, RoI Transformer, and S2A-Net, focused primarily on rotational modeling while inadequately mitigating class imbalance, yielding inconsistent results on underrepresented categories. In contrast, OGF Oriented R-CNN attained the highest overall mAP while delivering the most uniform and substantial advancements in precision, F1, and AP50 across the underrepresented classes. This outcome demonstrated that effective detection in such scenarios requires coordinated interventions across loss design, training strategy, and feature extraction, rather than isolated enhancements in rotational modeling or efficiency.
What set the proposed OGF Oriented R-CNN apart was its incorporation of the O&G Loss Function, CAHEM, and FPNFEA, which collectively served to overcome class imbalance, scale disparity, and rotational variance. Adding the O&G Loss Function accounted for the largest single-step mAP gain (+4.1 pp) by restoring gradient contribution from minority classes. Subsequent incorporation of CAHEM further enhanced performance (+3.6 pp relative to the model with the O&G Loss Function), with particularly notable gains in the drilling class, where small and arbitrarily oriented targets benefited from focused regression. Applying FPNFEA contributed an additional +2.8 pp mAP, primarily by strengthening the multi-scale context for industrial and mining land without compromising prior gains. Minor reductions in drilling precision and F1 in OGF Oriented R-CNN reflected the expected trade-off in multi-scale fusion, where richer features may introduce subtle boundary noise, yet the net mAP and AP50 gains indicated that generalization improved overall. As these findings attest, targeted measures to rectify class imbalance, scale disparities, and rotational variances position OGF Oriented R-CNN as a highly accurate solution for O&G facility detection.
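CAHEM's exact selection rule is defined in the methods section and is not restated here. As a hedged illustration only, with hypothetical function and variable names, the general idea of class-aware hard-example mining in the spirit of OHEM can be sketched as scaling each ROI's loss by its class weight before top-k selection:

```python
def class_aware_hard_mining(losses, labels, class_weights, k):
    """OHEM-style selection: scale each ROI's loss by its class
    weight, then keep the k hardest, favoring rare-class ROIs."""
    ranked = sorted(range(len(losses)),
                    key=lambda i: losses[i] * class_weights[labels[i]],
                    reverse=True)
    return sorted(ranked[:k])  # indices of the selected ROIs

# Hypothetical per-ROI losses and labels
losses = [1.0, 0.5, 0.2, 0.9]
labels = ["well_site", "drilling", "well_site", "IM_land"]
class_weights = {"well_site": 1.0, "IM_land": 2.0, "drilling": 5.0}
hard = class_aware_hard_mining(losses, labels, class_weights, k=2)
# the drilling ROI outranks the higher-raw-loss well site ROI
```

Without class weights, plain OHEM would keep the two highest raw losses and the rare drilling example would be discarded; the weighting flips that outcome, which is consistent with the drilling-class gains reported above.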
Although the evaluation was conducted exclusively on the Tarim Basin dataset, robustness considerations guided the inclusion of high-resolution imagery from multiple sensors (BeiJing-2, BeiJing-3, and GaoFen-2), introducing diversity in spectral response and acquisition conditions. This multi-source design enhances the model’s resilience to sensor-specific variations and provides a more representative testbed than single-sensor datasets. As additional publicly available oriented datasets for O&G facilities in other areas become accessible, future work will assess cross-region transferability and extend the framework to broader infrastructure categories.
5. Conclusions
Reliable identification of O&G facilities from high-resolution remote sensing imagery remains a fundamental yet unresolved challenge, fueled by the coexistence of severe inter-class imbalance, pronounced scale heterogeneity, and rotational ambiguity. When confronted with these difficulties, conventional detectors, whether axis-aligned or rotation-aware, frequently suffer from degraded precision.
This paper proposed OGF Oriented R-CNN, a novel detection model that systematically integrates three complementary modules: the O&G Loss Function for adaptive class reweighting, scale-aware regression, and orientation-sensitive penalties; CAHEM for class-aware hard-example mining; and FPNFEA for attention-guided multi-scale fusion. Experimental results demonstrated that OGF Oriented R-CNN attained an mAP of 82.9%, representing a +10.5 pp improvement over the baseline model (Oriented R-CNN). The proposed model delivered substantial gains in precision and F1 for minority classes while maintaining high recall in the dominant class, outperforming seven SOTA models by up to 27.6 pp. Insights from comparative models, such as the computational efficiency of the lightweight design in H2RBox-v2 or the recall strength of the transformer-based OrientedFormer, may inspire hybrid methods that combine these advantages to address the challenges of O&G facility detection. The model's efficacy was further corroborated through qualitative assessments, showing reduced duplicate detections and enhanced angular accuracy in diverse operational scenarios. OGF Oriented R-CNN can thus offer a reliable foundation for operational deployment in energy infrastructure surveillance and environmental monitoring. Despite this strong performance, modest limitations endure under extreme rotational conditions, pointing to future research avenues such as the development of more robust geometric representations and the effective integration of auxiliary information.
Author Contributions
Conceptualization, Y.Q. and Y.C.; methodology, Y.Q. and Y.C.; software, Y.Q. and Z.C.; validation, S.L., N.Z. and Z.C.; formal analysis, Y.Q.; investigation, Y.Q. and M.L.; resources, S.L., N.Z. and Y.C.; data curation, Z.C. and M.L.; writing—original draft preparation, Y.Q.; writing—review and editing, Y.Q., Y.C., S.L., N.Z., Z.C. and M.L.; visualization, Y.Q.; supervision, Y.C.; project administration, Y.C.; funding acquisition, S.L., N.Z. and Y.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Graduate Innovation Program of China University of Mining and Technology (Grant No. 2025WLJCRCZL004).
Data Availability Statement
The data presented in this study are available on request from the corresponding author due to commercial privacy restrictions.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- International Energy Agency. Available online: https://www.iea.org/reports/oil-2024/executive-summary (accessed on 14 March 2025).
- Xiong, S.; Tan, Y.; Li, Y.; Wen, C.; Yan, P. Subtask attention based object detection in remote sensing images. Remote Sens. 2021, 13, 1925. [Google Scholar] [CrossRef]
- Ramachandran, N.; Irvin, J.; Omara, M.; Gautam, R.; Meisenhelder, K.; Rostami, E.; Sheng, H.; Ng, A.Y.; Jackson, R.B. Deep learning for detecting and characterizing oil and gas well pads in satellite imagery. Nat. Commun. 2024, 15, 7036. [Google Scholar] [CrossRef] [PubMed]
- Zhang, H.; Xiao, P.; Yao, F.; Zhang, Q.; Gong, Y. Fusion of multi-scale attention for aerial images small-target detection model based on PARE-YOLO. Sci. Rep. 2025, 15, 4753. [Google Scholar] [CrossRef] [PubMed]
- Sun, Y.; Zhao, H.; Zhou, J. DKETFormer: Salient object detection in optical remote sensing images based on discriminative knowledge extraction and transfer. Neurocomputing 2025, 625, 129558. [Google Scholar] [CrossRef]
- Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep learning in remote sensing: A comprehensive review and list of resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef]
- S&P Global. Available online: https://www.spglobal.com/commodity-insights/en/research-analytics/drilled-but-uncompleted-wells (accessed on 22 March 2025).
- Global Energy Monitor. Available online: https://globalenergymonitor.org/projects/global-oil-gas-extraction-tracker/ (accessed on 22 March 2025).
- Pei, J.; Wu, X.; Liu, X.; Gao, L.; Yu, S.; Zheng, X. SGD-YOLOv5: A Small Object Detection Model for Complex Industrial Environments. In Proceedings of the 2024 International Joint Conference on Neural Networks (IJCNN), Yokohama, Japan, 30 June–5 July 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–10. [Google Scholar]
- Li, S.; Zhu, Z.; Sun, H.; Ning, X.; Dai, G.; Hu, Y.; Yang, H.; Wang, Y. Towards high-accuracy and real-time two-stage small object detection on FPGA. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 8053–8066. [Google Scholar] [CrossRef]
- Cao, F.; Wang, R.; Li, D.; Hu, Z. RS-DETR: An Improved DETR for High-Resolution Remote Sensing Image Object Detection. In Proceedings of the 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Kuching, Malaysia, 6–10 October 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1377–1382. [Google Scholar]
- Han, J.; Ding, J.; Xue, N.; Xia, G.S. Redet: A rotation-equivariant detector for aerial object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2786–2795. [Google Scholar]
- Zheng, S.; Wu, Z.; Xu, Y.; Wei, Z. Instance-aware spatial-frequency feature fusion detector for oriented object detection in remote-sensing images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5606513. [Google Scholar] [CrossRef]
- Zhao, J.; Ding, Z.; Zhou, Y.; Zhu, H.; Du, W.L.; Yao, R.; El Saddik, A. OrientedFormer: An end-to-end transformer-based oriented object detector in remote sensing images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5640816. [Google Scholar] [CrossRef]
- Li, Z.; Liu, W.; Xie, Z.; Kang, X.; Duan, P.; Li, S. FAA-Det: Feature Augmentation and Alignment for Anchor-Free Oriented Object Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5539411. [Google Scholar] [CrossRef]
- Song, S.; Liu, J.; Liu, Y.; Feng, G.; Han, H.; Yao, Y.; Du, M. Intelligent object recognition of urban water bodies based on deep learning for multi-source and multi-temporal high spatial resolution remote sensing imagery. Sensors 2020, 20, 397. [Google Scholar] [CrossRef] [PubMed]
- Fang, K.; Ouyang, J.; Hu, B. Swin-HSTPS: Research on target detection algorithms for multi-source high-resolution remote sensing images. Sensors 2021, 21, 8113. [Google Scholar] [CrossRef] [PubMed]
- Aljameel, S.S.; Alomari, D.M.; Alismail, S.; Khawaher, F.; Alkhudhair, A.A.; Aljubran, F.; Alzannan, R.M. An anomaly detection model for oil and gas pipelines using machine learning. Computation 2022, 10, 138. [Google Scholar] [CrossRef]
- He, H.; Xu, H.; Zhang, Y.; Gao, K.; Li, H.; Ma, L.; Li, J. Mask R-CNN based automated identification and extraction of oil well sites. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102875. [Google Scholar] [CrossRef]
- Zhang, Y.; Bai, L.; Wang, Z.; Fan, M.; Jurek-Loughrey, A.; Zhang, Y.; Zhang, Y.; Zhao, M.; Chen, L. Oil well detection under occlusion in remote sensing images using the improved YOLOv5 model. Remote Sens. 2023, 15, 5788. [Google Scholar] [CrossRef]
- Guisiano, J.E.; Moulines, É.; Lauvaux, T.; Sublime, J. Oil and gas automatic infrastructure mapping: Leveraging high-resolution satellite imagery through fine-tuning of object detection models. In Neural Information Processing, Proceedings of the International Conference on Neural Information Processing, Changsha, China, 20–23 November 2023; Springer: Singapore, 2023; pp. 442–458. [Google Scholar]
- Ma, C.; Zhang, Y.; Guo, J.; Hu, Y.; Geng, X.; Li, F.; Lei, B.; Ding, C. End-to-end method with transformer for 3-D detection of oil tank from single SAR image. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–19. [Google Scholar] [CrossRef]
- Wu, Q.; Zhang, B.; Xu, C.; Zhang, H.; Wang, C. Dense oil tank detection and classification via YOLOX-TR network in large-scale SAR images. Remote Sens. 2022, 14, 3246. [Google Scholar] [CrossRef]
- Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 3520–3529. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q. Learning RoI transformer for oriented object detection in aerial images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2849–2858. [Google Scholar]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666. [Google Scholar]
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12993–13000. [Google Scholar]
- Yang, X.; Yang, J.; Yan, J.; Zhang, Y.; Zhang, T.; Guo, Z.; Sun, X.; Fu, K. Scrdet: Towards more robust detection for small, cluttered and rotated objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27–28 October 2019; pp. 8232–8241. [Google Scholar]
- Shrivastava, A.; Gupta, A.; Girshick, R. Training region-based object detectors with online hard example mining. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 761–769. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
- Xu, Y.; Fu, M.; Wang, Q.; Wang, Y.; Chen, K.; Xia, G.S.; Bai, X. Gliding vertex on the horizontal bounding box for multi-oriented object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 1452–1459. [Google Scholar] [CrossRef] [PubMed]
- Yu, Y.; Yang, X.; Li, Q.; Zhou, Y.; Da, F.; Yan, J. H2RBox-v2: Incorporating symmetry for boosting horizontal box supervised oriented object detection. Adv. Neural Inf. Process. Syst. 2023, 36, 59137–59150. [Google Scholar]
- Han, J.; Ding, J.; Li, J.; Xia, G.S. Align deep features for oriented object detection. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5602511. [Google Scholar] [CrossRef]
Figure 1.
Field photographs of representative O&G facilities. (a) Densely distributed well sites, (b) close-up of drillings.
Figure 2.
Statistical characteristics of the O&G facility dataset. (a) Class distribution (well sites: 3006, industrial and mining lands: 692, drillings: 244) reveals extreme imbalance. (b) Object scale distribution evinces substantial heterogeneity. (c) Orientation distribution exhibits extensive rotational variances.
Figure 3.
Diagram of OGF Oriented R-CNN. (H, W) are the height and width of the input images. (x, y) are the center coordinates of the predicted proposal. (h, w) are the height and width of the external rectangular box of the predicted oriented proposal. θ is the rotation angle of the predicted proposal, determined according to the le90 annotation convention. The plus sign denotes element-wise addition for fusing the enhanced multi-scale features from FPNFEA with the backbone features.
Figure 4.
Diagram of CAHEM and O&G Loss Function framework. The initial difficulty score and the class-weighted difficulty score are shown, the latter computed using the shared class weight derived from dataset frequency. The O&G Loss module integrates the class-reweighting, scale-aware regression, and orientation-sensitive loss terms, with the class weight also influencing these losses. Dynamic feedback optimizes the class weight based on loss gradients.
Figure 5.
Architecture of FPNFEA. Backbone features C2–C5 undergo 2× upsampling and element-wise addition to produce P2–P6. The FEA module first generates channel attention weights for each Pi by processing it through parallel branches (Conv 1 × 1, 3 × 3, 5 × 5, and Max Pooling 2 × 2), followed by AvgPool 1 × 1, ReLU, Sigmoid, and Conv 1 × 1 (Reduce and Restore). The Enhanced Pi is then obtained by element-wise multiplication of Pi with these weights.
Figure 6.
Location of the study area.
Figure 7.
Orbital information of the BeiJing-2, BeiJing-3, and GaoFen-2 satellites. SSO represents Sun-Synchronous Orbit.
Figure 8.
Comparison of detection results between the two models. (a) Real images. (b) Predictions of Faster R-CNN. (c) Predictions of OGF Oriented R-CNN. Bounding boxes are colored red for well sites, yellow for industrial and mining lands, and green for drillings.
Figure 9.
Detection results with stepwise improvement for O&G facilities. (a) Real images. (b) Predictions of the baseline model (Oriented R-CNN). (c) Predictions of the baseline model with O&G Loss added. (d) Predictions of the baseline model with O&G Loss and CAHEM added. (e) Predictions of OGF Oriented R-CNN. From left to right, the three columns correspond to well sites, industrial and mining lands, and drillings, respectively. Bounding boxes are colored red for well sites, yellow for industrial and mining lands, and green for drillings.
Figure 10.
Detection results of the proposed OGF Oriented R-CNN on challenging scenes. (a–c) Real images. (d–f) Predictions of OGF Oriented R-CNN. Bounding boxes are colored red for well sites, yellow for industrial and mining lands, and green for drillings.
Table 1.
Evaluation metrics for different models.
| Model | Class | Precision (%) | Recall (%) | F1 (%) | AP50 (%) | mAP (%) | Params (M) |
|---|---|---|---|---|---|---|---|
| Faster R-CNN | well_site | 62.8 | 98.2 | 76.6 | 90.3 | 79.7 | 41.13 |
| | IM_land | 37.4 | 79.8 | 50.9 | 61.1 | | |
| | drilling | 56.1 | 92.5 | 69.8 | 87.8 | | |
| Gliding Vertex | well_site | 53.1 | 97.4 | 68.7 | 86.1 | 76.1 | 41.13 |
| | IM_land | 40.5 | 74.2 | 52.4 | 56.9 | | |
| | drilling | 57.8 | 92.5 | 71.2 | 85.2 | | |
| H2RBox-v2 | well_site | 38.0 | 87.2 | 52.9 | 73.3 | 69.8 | 31.90 |
| | IM_land | 2.9 | 78.2 | 5.6 | 50.6 | | |
| | drilling | 11.7 | 90.0 | 20.7 | 85.5 | | |
| Oriented R-CNN | well_site | 41.3 | 98.5 | 58.2 | 89.3 | 72.4 | 41.13 |
| | IM_land | 22.6 | 79.0 | 35.1 | 54.8 | | |
| | drilling | 29.7 | 87.5 | 44.3 | 73.1 | | |
| OrientedFormer | well_site | 1.3 | 99.8 | 2.6 | 88.7 | 65.5 | 44.52 |
| | IM_land | 0.1 | 96.8 | 0.2 | 47.1 | | |
| | drilling | 0.9 | 97.5 | 1.8 | 60.7 | | |
| RoI Transformer | well_site | 24.8 | 98.9 | 39.7 | 78.2 | 55.3 | 55.04 |
| | IM_land | 18.3 | 79.0 | 29.7 | 35.8 | | |
| | drilling | 22.1 | 90.0 | 35.5 | 51.9 | | |
| S2A-Net | well_site | 45.7 | 96.7 | 62.0 | 79.9 | 62.3 | 38.54 |
| | IM_land | 41.1 | 48.4 | 44.4 | 31.9 | | |
| | drilling | 38.0 | 95.0 | 54.3 | 75.1 | | |
| OGF Oriented R-CNN | well_site | 57.5 | 98.5 | 72.6 | 90.4 | 82.9 | 49.52 |
| | IM_land | 45.6 | 83.9 | 59.1 | 72.4 | | |
| | drilling | 63.2 | 92.3 | 75.0 | 85.8 | | |
Table 2.
Evaluation metrics (%) for models with different modifications.
| Model | Class | Precision | Recall | F1 | AP50 | mAP |
|---|---|---|---|---|---|---|
| +FPNFEA | well_site | 41.8 | 98.5 | 58.7 | 89.2 | 73.0 |
| | IM_land | 23.0 | 78.2 | 35.6 | 56.2 | |
| | drilling | 29.4 | 87.5 | 44.0 | 73.5 | |
| +CAHEM | well_site | 55.1 | 98.2 | 70.6 | 90.7 | 76.2 |
| | IM_land | 37.2 | 75.8 | 49.9 | 62.2 | |
| | drilling | 44.0 | 92.5 | 59.7 | 75.8 | |
| +O&G Loss Function | well_site | 55.5 | 98.2 | 70.9 | 90.3 | 76.5 |
| | IM_land | 38.9 | 77.4 | 51.8 | 61.2 | |
| | drilling | 45.0 | 90.0 | 60.0 | 78.1 | |
| +CAHEM +FPNFEA | well_site | 53.1 | 98.7 | 69.0 | 90.5 | 77.3 |
| | IM_land | 38.9 | 77.4 | 51.8 | 62.4 | |
| | drilling | 45.1 | 92.5 | 60.7 | 78.9 | |
| +O&G Loss Function +FPNFEA | well_site | 54.7 | 98.2 | 70.2 | 90.5 | 77.5 |
| | IM_land | 38.7 | 75.8 | 51.2 | 64.3 | |
| | drilling | 43.5 | 92.5 | 59.2 | 77.7 | |
| +O&G Loss Function +CAHEM | well_site | 57.6 | 97.8 | 72.5 | 89.8 | 80.1 |
| | IM_land | 40.6 | 73.4 | 52.3 | 64.9 | |
| | drilling | 64.4 | 95.0 | 76.8 | 85.7 | |
| +O&G Loss Function +CAHEM +FPNFEA (OGF Oriented R-CNN) | well_site | 57.5 | 98.5 | 72.6 | 90.4 | 82.9 |
| | IM_land | 45.6 | 83.9 | 59.1 | 72.4 | |
| | drilling | 63.2 | 92.3 | 75.0 | 85.8 | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.