Robust and Stealthy White-Box Watermarking for Intellectual Property Protection of Remote Sensing Object Detection Models

Zou, Lingjun; Xu, Xin; Chen, Weitong; Hong, Qingqing; Wu, Di

doi:10.3390/rs18070985

by Lingjun Zou ¹, Xin Xu ²,³, Weitong Chen ²,³,*, Qingqing Hong ²,³ and Di Wu ⁴

¹ Information Technology Center, Jinling Institute of Technology, Nanjing 211169, China

² College of Information and Artificial Intelligence (College of Industrial Software), Yangzhou University, Yangzhou 225127, China

³ Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, Yangzhou 225127, China

⁴ School of Computing, Engineering and Mathematical Science, La Trobe University, Melbourne, VIC 3086, Australia

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(7), 985; https://doi.org/10.3390/rs18070985

Submission received: 24 February 2026 / Revised: 16 March 2026 / Accepted: 23 March 2026 / Published: 25 March 2026

Highlights

What are the main findings?

A robust and stealthy white-box watermarking framework for RSOD models is proposed.
The method achieves a 100% watermark verification success rate with negligible impact on detection accuracy.

What are the implications of the main findings?

The proposed method provides reliable intellectual property protection for remote sensing object detection models.
It maintains robustness and stealthiness under practical attacks such as fine-tuning and quantization.

Abstract

Remote sensing object detection (RSOD) models play an increasingly important role in modern remote sensing systems. However, during model delivery, sharing, and deployment, RSOD models face increasing risks of unauthorized redistribution, illegal replication, and intellectual property infringement. To mitigate these threats, this paper proposes a white-box watermarking framework for RSOD models that enables reliable copyright verification while preserving the performance of the primary detection task. Specifically, a gradient-based sensitivity analysis of the detection loss is first performed to adaptively identify model parameters that minimally affect detection performance, which are then selected as watermark carriers. Subsequently, a parameter-ranking-based watermark encoding scheme is developed, where watermark bits are embedded by enforcing relative ordering constraints between parameter pairs. To further improve robustness under practical deployment conditions, an attack-simulation-driven training strategy is introduced, in which common perturbations and watermark removal attacks are simulated during the embedding process. In addition, a stealthiness enhancement strategy based on statistical distribution constraints is designed to maintain consistency between the distribution of watermarked parameters and those of the original model, thereby reducing the risk of watermark exposure and localization. Extensive experiments across multiple RSOD datasets and detection architectures demonstrate that the proposed method achieves a high copyright verification success rate with negligible impact on detection accuracy and exhibits strong robustness and stealthiness against a variety of watermark removal attacks.

Keywords:

remote sensing object detection; intellectual property protection; white-box watermarking; copyright verification; deep learning

1. Introduction

With the continuous improvement in the availability and quality of high-resolution remote sensing imagery [1,2], together with the rapid development of deep learning techniques, remote sensing object detection (RSOD) models have been increasingly deployed in a wide range of real-world remote sensing applications, such as urban governance [3], disaster assessment [4,5], and ecological monitoring [6,7]. In these operational and mission-critical scenarios, high-performance RSOD models not only are essential for intelligent interpretation but also represent valuable digital assets with strong commercial value and intellectual property (IP) attributes. Such models typically rely on large-scale annotated datasets, substantial computational resources, and long-term iterative optimization [8]. However, in practical application pipelines, particularly during model delivery, sharing, and cloud- and device-side deployment, RSOD models are vulnerable to piracy, unauthorized duplication, and illegal resale [9,10]. These threats can severely compromise the economic interests and IP rights of model owners and further impede the secure and sustainable development of operational remote sensing systems. Therefore, there is an urgent need to develop an effective and practical IP protection method specifically designed for RSOD models.

In recent years, to address various security threats to models [11], embedding watermark information [12,13,14] into deep neural networks (DNNs) has emerged as an important and effective solution. The core idea of model watermarking is to embed copyright information associated with the model owner’s identity, such as secret keys, specific trigger behaviors, or verifiable structural characteristics, into the model while preserving its original task performance as much as possible. In this way, even if the model is subsequently copied, distributed, or tampered with during deployment and usage, the copyright owner can still detect or recover the embedded watermark through pre-designed verification or extraction mechanisms. This enables reliable ownership claims and provides forensic evidence for identifying infringement.

Specifically, existing watermarking techniques for DNN models can generally be categorized into two main types: black-box watermarking and white-box watermarking. Black-box watermarking methods [15,16,17,18,19,20,21,22] typically construct a set of specially designed trigger input samples during the embedding stage so that the model produces predefined abnormal output patterns when queried with these inputs. During the ownership verification stage, the copyright owner inputs verification samples to a suspicious model or service and determines the presence of the watermark by checking whether the output behavior matches the predefined watermark patterns. However, such methods rely on interactive black-box queries, making the ownership verification process prone to disputes. For example, it may be questioned whether the verifier intentionally selected special input samples or whether the accused party applied additional post-processing to the outputs. As a result, the evidence chain provided by black-box watermarking may lack sufficient transparency and judicial credibility.

In contrast, white-box watermarking methods [8,23,24,25,26,27,28] embed watermark information directly into a model’s internal components and bind it to internal information, such as parameters or intermediate representations. When the model’s internal information is exposed or when third-party forensic authentication and arbitration platforms are required for evidence collection, a subset of internal information containing the watermark can be extracted from the suspicious model. Ownership can subsequently be verified using a fixed and reproducible watermark extraction and verification algorithm. Since the watermark evidence is derived directly from the model’s internal information, the verification process is independent of external interactions or query interfaces, thus demonstrating stronger interpretability, stability, and credibility than black-box watermarking. In summary, for high-value RSOD models that are vulnerable to illegal copying and misuse, investigating and developing white-box watermarking protection methods based on internal model information is of great practical significance and application value in both real-world deployment and judicial forensics scenarios.

Despite providing more interpretable and trustworthy ownership evidence by leveraging a model’s internal information, existing white-box watermarking methods vary widely in their embedding locations and implementation forms [23,26,27]. Among them, weight-based white-box watermarking is particularly practical for RSOD models since it avoids architecture modification, has low embedding overhead, and supports efficient extraction and verification. However, existing weight-based white-box watermarking methods still suffer from evident limitations in terms of robustness and stealthiness. In particular, under common perturbations encountered in the deployment of RSOD models, watermark information is prone to extraction failure or increased exposure risk, thereby limiting its effectiveness. Specifically, white-box watermarking methods based on parameter projection [27] or sign-based alignment [8,29] typically encode watermark bits as the sign of a weight projection along a specific direction or as the result of a threshold decision. However, they are highly sensitive to numerical perturbations, and such perturbations can easily cause the underlying sign relationships to drift, thereby reducing verification reliability. Methods based on statistical distribution constraints [25,30] introduce regularization terms to enforce certain weights to follow predefined distributional patterns, thereby improving watermark extractability. However, such constraints may introduce detectable anomalies that compromise stealthiness and can also conflict with the primary task objective, leading to performance degradation. In addition, some methods that rely on specific architectural designs [31] are limited in terms of cross-architecture transferability and adaptation to mainstream CNN-based RSOD models, which hinders their general applicability. 
Therefore, designing a white-box watermarking mechanism that simultaneously achieves robustness, stealthiness, and transferability while preserving detection performance remains a key challenge.

To address this challenge, we propose a white-box watermarking method to protect the IP of RSOD models, which enhances robustness to common model perturbations and attacks while maintaining watermark imperceptibility. Specifically, we first evaluate the sensitivity of model parameters by analyzing the gradients of the detection loss, and we adaptively select weights that have minimal impact on the primary task performance as watermark carriers. Then, we employ a margin-based parameter-ranking encoding mechanism to embed watermark information by enforcing the relative ordering relationships between paired parameters. In addition, to further improve robustness against attacks, we introduce an attack-simulation training strategy that incorporates common perturbations into the optimization process. Finally, we introduce a regularization term based on the statistical properties of the weights to align the parameter distribution of the watermark-bearing layers with that of the original model, thereby further enhancing the stealthiness of the watermark.

The main contributions are summarized as follows:

We propose a novel white-box watermarking method that achieves both robustness and stealthiness for protecting the IP of RSOD models.
We design a margin-based parameter-ranking watermark embedding scheme, which encodes watermark bits by enforcing stable relative ordering constraints between paired model parameters, thereby enabling a high watermark verification success rate.
We introduce an attack-simulation-driven training strategy to improve robustness against watermark removal attacks, along with a statistical-distribution-constrained stealthiness loss to minimize detectable parameter perturbations.
Extensive experiments on multiple RSOD detectors and benchmark datasets demonstrate that the proposed method achieves a watermark success rate of 100% under the evaluation settings, with negligible impact on detection accuracy, and exhibits strong robustness and stealthiness against various watermark removal attacks.

The remainder of this paper is organized as follows. Section 2 reviews related work on black-box and white-box model watermarking. Section 3 introduces the preliminaries and problem formulation. The proposed white-box watermarking method for RSOD models is presented in Section 4. Extensive experimental evaluations, including threshold determination, fidelity, effectiveness, robustness, stealthiness, and ablation studies, are reported in Section 5. Finally, Section 6 concludes this paper and outlines future research directions.

2. Related Work

To protect the IP of DNN models, numerous model watermarking techniques have been proposed to verify model ownership and provenance. Depending on the watermark verification setting, existing DNN watermarking methods can generally be categorized into two types: black-box watermarking and white-box watermarking. In the following, this section presents and analyzes representative works from both categories.

2.1. Black-Box Model Watermarking Methods

Adi et al. [15] were among the first to propose a backdoor-based black-box watermarking framework. Their method constructs a set of abstract trigger images that are visually distinct from the original training data and associates them with predefined target labels during training. Without noticeably affecting the model’s normal inference performance, the watermarked model consistently exhibits the desired prediction behavior when queried with trigger samples, enabling watermark verification in a black-box setting. Following this line of research, subsequent works, such as BlackMarks [16] and the method by Zhang et al. [17], adopt more refined sample construction strategies to design key samples, improving the stability and robustness of trigger responses. To further enhance stealthiness, Li et al. [18] propose blending specific signatures (e.g., a unique logo) with features of natural samples, making trigger inputs visually closer to the benign data distribution and thereby reducing the risk of manual inspection or automated detection. Meanwhile, black-box watermarking research has gradually expanded beyond conventional image classification to broader model types and application scenarios. DeepMarks [19] integrates watermarking with traitor tracing and has been further extended to automatic speech recognition systems. Qiao et al. [20] and Li et al. [21] embed watermarks into images generated by generative models to address copyright ownership in generative tasks. In addition, for GNN watermarking, Zhao et al. [22] generate a random Erdos–Renyi trigger graph and embed the watermark into node-level predictions during training, which allows reliable black-box verification and remains robust under fine-tuning and model compression.

Although the above black-box watermarking methods can protect the copyright of DNN models to some extent, their verification procedures typically rely on black-box queries and statistical decision rules. As a result, they often fail to provide a transparent and traceable evidence chain, which limits their credibility and admissibility in legal scenarios.

2.2. White-Box Model Watermarking Methods

Unlike black-box watermarking methods, white-box watermarking leverages internal model information during ownership verification, which generally provides higher verification reliability and greater evidential transparency in legal scenarios. Depending on where the watermark is embedded, existing white-box watermarking methods can be broadly categorized into three groups: layer-based, activation-based, and weight-based watermarking.

Layer-based watermarking methods introduce dedicated watermark layers or modules into the network, tightly binding the watermark to the model architecture and inference process, thereby strengthening the credibility of ownership claims. For example, DeepIPR proposed by Fan et al. [23] embeds passport layers into the network and performs dual ownership verification through both performance validation and parameter extraction. Later, Zhang et al. [24] embed watermark information into the parameters of normalization layers without significantly altering the overall model structure, improving stealthiness and stability. However, such methods typically require explicit architectural modifications and additional parameter constraints, which introduce extra training and inference overhead and may reduce deployment flexibility and cross-architecture compatibility.

To mitigate the dependency on model architecture, activation-based white-box watermarking methods have been proposed. Instead of directly encoding watermarks into weights, these methods constrain intermediate activation distributions under specific inputs, embedding the watermark into hidden representations. A representative work by Rouhani et al. [25] encodes binary signatures into activation map distributions across selected layers and extracts the watermark via activation statistics or trigger inputs. Lim et al. [26] embed watermarks into the hidden states of recurrent neural networks, enabling copyright protection for multimodal models such as image captioning. Although activation-based methods are less intrusive in terms of architectural modifications, their watermark stability often depends on specific trigger inputs and activation constraints, and may degrade under fine-tuning, architectural changes, or feature transfer.

Motivated by these limitations, weight-based white-box watermarking methods directly encode watermark information into model parameters, enabling high-capacity and easy-to-verify watermark embedding without altering the model’s inference behavior. Uchida et al. [27] first introduce an embedding regularization term during training to write binary watermarks into network weights and verify ownership via correlation-based detection. To enhance stealthiness, RIGA [28] adopts adversarial training to align the distribution of watermarked weights with that of clean models, reducing the risk of detection or removal. To improve robustness, Chen et al. [8] embed watermarks into a small set of critical weights and incorporate sign-alignment constraints with fine-tuning strategies to strengthen watermark stability. Nevertheless, existing weight-based white-box watermarking methods still face a fundamental trade-off between robustness and stealthiness: the watermark must remain stable under attacks such as fine-tuning and quantization while avoiding detectable artifacts in the statistical properties of model parameters. To address these challenges, this paper proposes a white-box watermarking method that simultaneously improves robustness and stealthiness, enabling more reliable model ownership verification and stronger protection of IP.

3. Preliminary and Problem Formulation

In this section, we first introduce the overall workflow of white-box watermarking, then present the metrics used to evaluate watermark performance, and finally provide a formal description of the problem.

3.1. The Main Pipeline of White-Box Watermarking

White-box watermarking aims to embed identifiable ownership information into a neural network model, enabling reliable verification when the model’s internal parameters or intermediate representations are accessible. In general, the white-box watermarking pipeline consists of three stages: watermark message construction, watermark embedding, and watermark verification.

In the watermark message construction phase, a watermark message $M_{w}$ is first sampled from the message space $M$ and then encoded into a structured watermark representation, which is subsequently embedded into the model’s carrier weights.

In the watermark embedding phase, let $f_{\theta}$ denote the original neural network model parameterized by weights $\theta$. Given a training dataset $D = \{(x_{i}, y_{i})\}_{i=1}^{N}$, where $x_{i}$ and $y_{i}$ denote the input and the corresponding label of the $i$-th training example and $N$ is the number of training samples, the model is typically learned by minimizing the task loss $L_{task}(f_{\theta}, D)$. The objective of watermark embedding is to obtain a watermarked model $f_{\theta^{*}}$ such that: (i) it preserves the original task performance, and (ii) it embeds the watermark message $M_{w}$ in a verifiable manner. To achieve this, watermark embedding is commonly formulated as a joint optimization problem:

$$ \theta^{*} = \arg\min_{\theta} \left( L_{task}(f_{\theta}, D) + \lambda L_{wm}(f_{\theta}, M_{w}) \right) \tag{1} $$

where $\lambda > 0$ controls the trade-off between task fidelity and watermark strength, and $L_{wm}(f_{\theta}, M_{w})$ denotes the watermark embedding loss. After optimization, the resulting watermarked model $f_{\theta^{*}}$ is distributed or deployed.

In the watermark verification phase, the model owner aims to determine whether the suspected model $\tilde{f}_{\theta}$ contains the embedded watermark. First, the owner applies an extraction function $g(\cdot)$ to recover the watermark message from the suspected model:

$$ \hat{M}_{w} = g(\tilde{f}_{\theta}). \tag{2} $$

Next, the extracted watermark $\hat{M}_{w}$ is compared with the original watermark $M_{w}$ using a similarity measure. If the similarity score $s(\hat{M}_{w}, M_{w})$ reaches or exceeds a predefined threshold $\tau$, the watermark is regarded as successfully verified:

$$ \mathrm{Verify}(\tilde{f}_{\theta}) = \begin{cases} 1, & s(\hat{M}_{w}, M_{w}) \geq \tau, \\ 0, & \text{otherwise}. \end{cases} \tag{3} $$
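The decision rule in Equations (2) and (3) can be sketched in a few lines of Python. This is only an illustration: the bit-match ratio used for $s(\cdot, \cdot)$, the threshold value 0.85, and the function names `bit_match_ratio` and `verify` are assumptions made here, since this section does not fix a particular similarity measure or threshold.

```python
from typing import Sequence


def bit_match_ratio(extracted: Sequence[int], original: Sequence[int]) -> float:
    """Fraction of positions where both bit strings agree (assumed similarity s)."""
    if len(extracted) != len(original):
        raise ValueError("watermarks must have the same length")
    if not original:
        raise ValueError("watermark must not be empty")
    matches = sum(1 for a, b in zip(extracted, original) if a == b)
    return matches / len(original)


def verify(extracted: Sequence[int], original: Sequence[int], tau: float = 0.85) -> int:
    """Return 1 if s(extracted, original) >= tau and 0 otherwise, as in Equation (3)."""
    return 1 if bit_match_ratio(extracted, original) >= tau else 0


original = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
slightly_damaged = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]   # 1 of 10 bits flipped
unrelated = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]          # only 3 of 10 bits agree

print(verify(slightly_damaged, original))
print(verify(unrelated, original))
```

With these toy inputs the slightly damaged watermark still passes the assumed threshold, while the unrelated bit string does not.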

3.2. Watermark Performance Metrics

To comprehensively evaluate the performance of white-box watermarking, it is necessary to assess not only whether the watermark can be successfully verified but also whether the protected model remains useful and secure under various conditions. In this work, we evaluate watermarking performance from four key perspectives, including fidelity, effectiveness, stealthiness, and robustness:

Fidelity. It evaluates whether watermark embedding maintains the model’s original performance.
Effectiveness. This metric evaluates whether the watermark can be verified with high confidence when the suspect model is derived from the protected model.
Stealthiness. It measures whether the watermark is difficult for an adversary to detect or distinguish from normal model behavior.
Robustness. It characterizes the watermark’s resistance to adversarial attempts to remove or invalidate it.

3.3. Problem Formulation

In this work, we aim to design a white-box watermarking scheme for RSOD models in which the model owner is allowed to access the internal parameters of a suspected model and seeks to establish verifiable ownership through an embedded watermark.

Let $F_{\theta}$ denote an RSOD model parameterized by weights $\theta$. The owner specifies a watermark message $M_{w}$ and embeds it into the model parameters. A fundamental design question arises at the beginning: where should the watermark be embedded? Specifically, the owner must select a subset of model parameters (or intermediate representations) as the watermark carrier, denoted by $\theta_{c} = S(\theta)$. The carrier selection strategy $S(\theta)$ should provide sufficient capacity for encoding $M_{w}$ while avoiding fragile or highly sensitive components that may lead to noticeable accuracy degradation or facilitate watermark removal. Moreover, the watermark embedding and verification procedures should be efficient and reliable, enabling the owner to extract or validate the watermark in a deterministic manner while preventing it from being easily detected and erased.

Therefore, our objective is to train a watermarked model $F_{\theta^{*}}$ while ensuring that it satisfies multiple requirements simultaneously: (1) Fidelity: watermark embedding should not significantly degrade the model utility, i.e., the performance drop $\Delta A = A(F_{\theta}) - A(F_{\theta^{*}})$ should be small. (2) Effectiveness: the watermark should be verified with high confidence on the watermarked model while remaining unlikely to be triggered in non-watermarked models. (3) Stealthiness: the watermark should not introduce abnormal patterns that can be easily detected or localized by an adversary, meaning the watermarked parameters should remain statistically indistinguishable from normal training variations. (4) Robustness: under a set of common attacks or transformations $T(\cdot)$ (e.g., fine-tuning [32] and quantization attacks [33]), the watermark should also remain verifiable on the transformed model $\tilde{F}_{\theta} = T(F_{\theta^{*}})$.

4. The Proposed Method

4.1. Method Overview

The overview of the proposed method is illustrated in Figure 1 and Algorithm 1. The overall framework consists of three stages: watermark carrier analysis and selection, watermark embedding and optimization, and watermark extraction and verification. First, we analyze the sensitivity of model parameters to the detection task by computing gradients of the detection loss, and we adaptively select low-impact parameters as watermark carriers to preserve detection performance. Then, watermark bits are embedded using a parameter-ranking-based encoding mechanism that enforces relative magnitude relationships between selected parameter pairs. To improve robustness and stealthiness in practical scenarios, we further incorporate attack-simulation training during embedding and apply distribution constraints to reduce detectable statistical artifacts. Finally, the watermark is extracted using the same parameter-indexing rules, and ownership is verified by measuring the consistency between the extracted watermark and the original one.

Algorithm 1 Overall procedure of robust and stealthy watermark embedding and verification.

4.2. Sensitivity-Based Watermark Carrier Selection

In this section, we analyze and select the carrier-weight parameters for watermark embedding. To avoid noticeable degradation of model inference performance due to watermark embedding, the watermark carriers are preferentially selected from a set of weights with relatively limited impact on detection performance. Given that different functional modules in RSOD models play distinct roles in feature modeling and prediction, existing studies [34,35] have shown that feature extraction modules [36], such as the backbone, are primarily responsible for learning high-level semantic and spatial representations. As a result, perturbations to their parameters tend to cause more significant changes in the overall feature distribution and detection performance. In contrast, the detection head [37] mainly performs task-specific mapping and prediction based on already extracted features, and its detection accuracy is relatively insensitive to mild parameter perturbations or localized adjustments [38]. Therefore, to minimize the impact of watermark embedding on the primary task performance, our work prioritizes selecting a subset of weight parameters from the detection head to construct the initial candidate parameter set $W_{0}$.

Subsequently, to further filter and analyze the influence of each parameter $w_{i} \in W_{0}$ on model inference performance under slight variations, we adopt a gradient-based sensitivity analysis approach for quantitative evaluation. This approach characterizes the impact of parameter variations on the primary training objective, thereby indirectly reflecting the contribution of different parameters to model inference performance. Specifically, let $L_{\det}$ denote the primary training objective of the RSOD model. The sensitivity of $w_{i}$, denoted as $s_{i}$, is measured from the gradient of $L_{\det}$ with respect to $w_{i}$. During the computation, only 10% of the training samples are used to measure $s_{i}$, which is estimated statistically through multiple forward and backward passes over several mini-batches. Specifically, the sensitivity metric for each parameter is obtained by averaging the squared gradient values across different mini-batches:

$$ s_{i} = \mathbb{E}\left[\left(\frac{\partial L_{\det}}{\partial w_{i}}\right)^{2}\right] \tag{4} $$

where the expectation $\mathbb{E}[\cdot]$ is approximated by averaging the squared gradients over multiple mini-batches. A larger $s_{i}$ indicates that small changes in $w_{i}$ lead to more significant variations in the detection loss and inference performance, whereas a smaller $s_{i}$ implies that the parameter has a relatively limited impact on model performance.

After completing the sensitivity evaluation, the candidate parameters $w_{i}$ are ranked according to their sensitivity scores, and the lowest-sensitivity parameters, in sufficient number to meet the watermark embedding requirements, are selected to form the final watermark carrier parameter set $W_{wm}$.
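The scoring and selection steps of this subsection can be sketched as follows. This is a toy illustration under stated assumptions, not the detector used in the paper: a four-weight linear model with a mean-squared-error loss stands in for the detection head and $L_{\det}$, the gradient is computed analytically rather than by backpropagation, and the names `batch_gradient`, `sensitivity_scores`, and `select_carriers` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]


def batch_gradient(weights: Sequence[float], batch: Sequence[Sample]) -> List[float]:
    """Gradient of the mini-batch mean squared error of a linear model (toy L_det)."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        residual = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * residual * xj / len(batch)
    return grad


def sensitivity_scores(weights: Sequence[float],
                       batches: Sequence[Sequence[Sample]]) -> List[float]:
    """Equation (4): squared gradient of each parameter, averaged over mini-batches."""
    scores = [0.0] * len(weights)
    for batch in batches:
        for j, g in enumerate(batch_gradient(weights, batch)):
            scores[j] += g * g / len(batches)
    return scores


def select_carriers(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k candidates with the lowest sensitivity scores."""
    return sorted(range(len(scores)), key=lambda j: scores[j])[:k]


rng = random.Random(0)
# Toy data: feature 2 is always zero and feature 3 is tiny, so the matching
# weights barely influence the loss and should end up as carriers.
dataset: List[Sample] = []
for _ in range(400):
    x = [rng.uniform(0.5, 1.0), rng.uniform(0.5, 1.0), 0.0, rng.uniform(0.0, 0.01)]
    dataset.append((x, sum(x)))            # targets generated by all-ones weights

subset = rng.sample(dataset, len(dataset) // 10)    # 10% of the training samples
batches = [subset[k:k + 8] for k in range(0, len(subset), 8)]

weights = [0.0, 0.0, 0.0, 0.0]             # toy candidate set W_0
scores = sensitivity_scores(weights, batches)
carriers = select_carriers(scores, k=2)    # toy carrier set W_wm
print(carriers)
```

In a real detector the per-parameter gradients would come from automatic differentiation of the detection loss, and the number of selected carriers would be set by the watermark length and the number of pairs per bit.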

4.3. Watermark Embedding and Optimization

In this section, we first construct the watermark information $M_{w}$ to be embedded. Then, by introducing a watermark embedding loss based on parameter-ranking relationships, $M_{w}$ is embedded into $W_{wm}$. Finally, to further enhance the robustness and stealthiness of the watermark, we design an attack-simulation training strategy and a distribution-constrained watermark stealthiness enhancement scheme, which are jointly optimized during the watermark embedding process.

4.3.1. Watermark Information Construction

To ensure the uniqueness and unpredictability of the watermark information, our work adopts a watermark construction strategy consistent with existing studies [8,27]. Specifically, a fixed-length binary sequence is randomly generated using the SHA-256 hash function [39] and used as the watermark information to be embedded. The generated watermark satisfies $M_{w} \in \{0, 1\}^{B}$, where $B$ denotes the length of the watermark in bits.
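One possible way to obtain such a bit string is sketched below. The text only states that SHA-256 is used, so the choice of input (an owner secret string), the counter used to extend the output beyond 256 bits, and the name `build_watermark` are assumptions made for illustration.

```python
import hashlib
from typing import List


def build_watermark(owner_secret: str, num_bits: int) -> List[int]:
    """Derive num_bits watermark bits from SHA-256 digests of 'secret:counter'."""
    if num_bits <= 0:
        raise ValueError("num_bits must be positive")
    bits: List[int] = []
    counter = 0
    while len(bits) < num_bits:
        digest = hashlib.sha256(f"{owner_secret}:{counter}".encode("utf-8")).digest()
        for byte in digest:
            for shift in range(7, -1, -1):      # most significant bit first
                bits.append((byte >> shift) & 1)
        counter += 1                            # next digest if more bits are needed
    return bits[:num_bits]


watermark = build_watermark("example-owner-key", 64)   # hypothetical secret, B = 64
print(len(watermark), set(watermark) <= {0, 1})
```

Because the bits are a deterministic function of the secret, the owner can regenerate $M_{w}$ at verification time without storing it separately.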

4.3.2. Parameter-Ranking-Based Watermark Embedding

Unlike watermark embedding methods based on parameter projection or sign constraints, we propose a parameter-ranking-based watermark embedding strategy, in which watermark information is encoded by enforcing relative ordering relationships between pairs of model parameters. Specifically, for each watermark bit $b_{t} \in \{0, 1\}$ in $M_{w}$, we randomly sample $R$ pairs of parameters from $W_{wm}$, denoted as $(w_{i_{t,r}}, w_{j_{t,r}}),\ r = 1, \dots, R$, where $i_{t,r}$ and $j_{t,r}$ denote two distinct indices randomly sampled from the parameter index set, corresponding to a pair of model parameters involved in the ranking constraint.

To embed the watermark information into the carrier parameters, we define the following watermark embedding loss:

$$ L_{wm} = \frac{1}{BR} \sum_{t=1}^{B} \sum_{r=1}^{R} \log\left(1 + \exp\left(m - c_{t}\,(w_{i_{t,r}} - w_{j_{t,r}})\right)\right) \tag{5} $$

where $\exp(\cdot)$ denotes the exponential function, $m$ denotes the ranking margin, and the sign variable $c_{t}$ is determined by the watermark bit $b_{t}$ as

$$ c_{t} = \begin{cases} +1, & b_{t} = 1, \\ -1, & b_{t} = 0. \end{cases} \tag{6} $$

Specifically, when $b_{t} = 1$, the parameter difference is encouraged to satisfy $w_{i_{t,r}} - w_{j_{t,r}} \geq m$, whereas when $b_{t} = 0$, the desired constraint becomes $w_{i_{t,r}} - w_{j_{t,r}} \leq -m$.

By minimizing $L_{wm}$, the model parameters are guided to satisfy ranking constraints consistent with the embedded watermark bits, while preserving the original detection performance, thereby achieving stable and reliable watermark embedding.
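A minimal sketch of the ranking loss in Equations (5) and (6) is given below, using plain gradient descent on a toy weight vector. Several simplifications are assumptions made here and are not taken from the paper: the detection loss is omitted, the pairs are sampled without reusing any index (so the constraints cannot conflict), the majority-vote decoder in `extract_bits` is a hypothetical extraction rule, and the margin, learning rate, and step count are arbitrary.

```python
import math
import random
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z: float) -> float:
    """Derivative of softplus, computed without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_disjoint_pairs(n_params: int, n_bits: int, n_pairs: int,
                          rng: random.Random) -> List[List[Pair]]:
    """Give each bit n_pairs index pairs; no index is reused (a simplification)."""
    idx = rng.sample(range(n_params), 2 * n_bits * n_pairs)
    flat = [(idx[2 * k], idx[2 * k + 1]) for k in range(n_bits * n_pairs)]
    return [flat[t * n_pairs:(t + 1) * n_pairs] for t in range(n_bits)]


def ranking_loss(w: Sequence[float], bits: Sequence[int],
                 pairs: Sequence[Sequence[Pair]], margin: float) -> float:
    """Equation (5): mean of softplus(m - c_t * (w_i - w_j)) over all bits and pairs."""
    terms = [softplus(margin - (1.0 if b == 1 else -1.0) * (w[i] - w[j]))
             for b, bit_pairs in zip(bits, pairs) for i, j in bit_pairs]
    return sum(terms) / len(terms)


def ranking_grad(w: Sequence[float], bits: Sequence[int],
                 pairs: Sequence[Sequence[Pair]], margin: float) -> List[float]:
    """Analytic gradient of the ranking loss with respect to every weight."""
    grad = [0.0] * len(w)
    count = sum(len(bit_pairs) for bit_pairs in pairs)
    for b, bit_pairs in zip(bits, pairs):
        c = 1.0 if b == 1 else -1.0            # Equation (6)
        for i, j in bit_pairs:
            s = sigmoid(margin - c * (w[i] - w[j]))
            grad[i] -= c * s / count
            grad[j] += c * s / count
    return grad


def extract_bits(w: Sequence[float], pairs: Sequence[Sequence[Pair]]) -> List[int]:
    """Decode each bit by majority vote over its pairs (assumed decoding rule)."""
    decoded = []
    for bit_pairs in pairs:
        votes = sum(1 if w[i] > w[j] else -1 for i, j in bit_pairs)
        decoded.append(1 if votes > 0 else 0)
    return decoded


rng = random.Random(0)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
pairs = sample_disjoint_pairs(n_params=100, n_bits=len(bits), n_pairs=5, rng=rng)
w = [rng.uniform(-0.01, 0.01) for _ in range(100)]
margin, lr = 0.05, 0.5

loss_before = ranking_loss(w, bits, pairs, margin)
for _ in range(200):
    g = ranking_grad(w, bits, pairs, margin)
    w = [wk - lr * gk for wk, gk in zip(w, g)]
loss_after = ranking_loss(w, bits, pairs, margin)

print(extract_bits(w, pairs) == bits, loss_after < loss_before)
```

Because only the sign of each difference $w_{i_{t,r}} - w_{j_{t,r}}$ is read back, any perturbation smaller than the gap opened by the margin leaves the decoded bit unchanged, which is the intuition behind encoding bits as orderings rather than absolute values.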

4.3.3. Attack-Simulation-Driven Robust Training

To enhance the robustness of the embedded watermark against various perturbations that may occur during model distribution, we further propose an attack-simulation-driven robust training strategy, in which potential attack processes are explicitly incorporated into the training optimization framework. Specifically, we define a set of common parameter transformations and attack distributions

T

, and we randomly apply these transformations to the model parameters during training. In our work,

T

mainly includes quantization attacks

T_{q}

and fine-tuning attacks

T_{f}

.

During each training iteration, a transformation

T (\cdot)

is first sampled from the attack distribution

T

, and the sampled transformation is then applied to the current model parameters

θ

to generate a perturbed parameter set

T (θ)

. The watermark embedding loss is subsequently evaluated on the model after applying the parameter transformation, allowing the optimization process to account for the effects of parameter perturbations explicitly. Through this process, the watermark constraints are encouraged to remain satisfied not only for the original parameters but also for their attacked versions.

Based on this, inspired by the concept of Expectation over Transformation (EOT) [40], we formulate the watermark embedding objective as an expected robustness optimization problem under the attack distribution. The resulting objective function can be expressed as:

\min_{θ} E_{T \sim T} [L_{wm} (T (θ))] .

(7)

By explicitly incorporating attack simulation during training, the model is guided to preserve the extractability and verifiability of the embedded watermark even after undergoing various forms of parameter perturbations.
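
The expectation in Equation (7) can be approximated by sampling transformations, as in the following sketch. Several parts are assumptions rather than details from the text: the symmetric per-tensor INT8 quantizer standing in for the quantization attack, the small Gaussian drift used as a crude proxy for fine-tuning (real fine-tuning is not performed here), the equal sampling probabilities, and all numerical values. The text also does not specify how gradients are propagated through the transformation, so the sketch only evaluates the objective.

```python
import math
import random

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def wm_loss(weights, constraints, margin=0.01):
    """Eq. (5) over a flat list of (i, j, c) constraints, with c = +1 or -1."""
    return sum(softplus(margin - c * (weights[i] - weights[j]))
               for i, j, c in constraints) / len(constraints)

def fake_quantize(weights, num_bits=8):
    """Assumed stand-in for T_q: symmetric per-tensor quantize, then dequantize."""
    levels = 2 ** (num_bits - 1) - 1
    scale = max(abs(w) for w in weights) / levels
    if scale == 0.0:
        return list(weights)
    return [round(w / scale) * scale for w in weights]

def drift(weights, rng, sigma=0.002):
    """Assumed proxy for T_f: small Gaussian drift, not actual fine-tuning."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

def expected_wm_loss(weights, constraints, rng, num_samples=32):
    """Monte Carlo estimate of E_T[L_wm(T(theta))] from Eq. (7)."""
    total = 0.0
    for _ in range(num_samples):
        if rng.random() < 0.5:
            attacked = fake_quantize(weights)
        else:
            attacked = drift(weights, rng)
        total += wm_loss(attacked, constraints)
    return total / num_samples

# Toy parameters: the first constraint set agrees with the current ordering,
# the second one contradicts it.
weights = [0.10, -0.10, 0.03, 0.05]
satisfied = [(0, 1, +1), (2, 3, -1)]
violated = [(0, 1, -1), (2, 3, +1)]
loss_satisfied = expected_wm_loss(weights, satisfied, random.Random(0))
loss_violated = expected_wm_loss(weights, violated, random.Random(0))
print(loss_satisfied, loss_violated)
```

A parameter configuration whose orderings survive the sampled perturbations yields a lower expected loss, which is the quantity the robust training objective drives down.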

4.3.4. Distribution-Constrained Stealthiness Enhancement Strategy

In this section, to prevent the embedded watermark from being detected and subsequently removed through statistical analysis, we further introduce a distribution-constrained stealthiness regularization term during training to mitigate the impact of watermark embedding on the statistical properties of model parameters.

Specifically, the parameter distribution of the original model without watermark embedding is taken as a reference, and the parameters of the watermarked model are constrained to remain statistically consistent with the original distribution. Let

w_{l}^{(0)}

denote the parameters of the l-th layer in the original model, and let

w_{l}

represent the corresponding parameters in the watermarked model. By matching the mean and variance of the parameter distributions at each layer, we construct a stealthiness loss defined as:

L_{stealth} = \sum_{l \in S} ({∥μ (w_{l}) - μ (w_{l}^{(0)})∥}^{2} + {∥σ (w_{l}) - σ (w_{l}^{(0)})∥}^{2})

(8)

where

μ (\cdot)

and

σ (\cdot)

denote the mean and standard deviation of the parameter distribution, respectively, and

S

denotes the set of carrier parameter layers subject to the stealthiness constraint.
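
As an illustration of Equation (8), the sketch below computes the stealthiness loss for scalar per-layer means and standard deviations. The layer name `head.reg`, the toy weight values, and the use of the population standard deviation are assumptions; the text does not state which estimator is used.

```python
import statistics

def stealth_loss(layers_wm, layers_ref):
    """Eq. (8): sum over layers of squared mean gap plus squared std gap."""
    loss = 0.0
    for name, w in layers_wm.items():
        w0 = layers_ref[name]
        d_mu = statistics.fmean(w) - statistics.fmean(w0)
        d_sigma = statistics.pstdev(w) - statistics.pstdev(w0)  # population std (assumed)
        loss += d_mu ** 2 + d_sigma ** 2
    return loss

# Hypothetical carrier layer of the original model and a copy shifted by 0.05.
reference = {"head.reg": [0.01, -0.02, 0.03, -0.01]}
shifted = {"head.reg": [w + 0.05 for w in reference["head.reg"]]}

loss_same = stealth_loss(reference, reference)
loss_shifted = stealth_loss(shifted, reference)
print(loss_same, loss_shifted)
```

A uniform shift leaves the standard deviation unchanged, so only the mean term contributes in this example. Matching just the first two moments does not constrain higher-order statistics of the weight distribution.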

4.3.5. Total Training Loss Function

By jointly considering watermark robustness and stealthiness constraints, the final training objective is formulated as:

L_{total} = L_{\det} + λ E_{T \sim T} [L_{wm} (T (θ))] + β L_{stealth}

(9)

where

λ

is a hyperparameter that weights the attack-aware watermark embedding loss relative to the detection loss, and

β

controls the weight of the stealthiness regularization term, trading off watermark embedding strength against parameter distribution consistency.

Through joint optimization of the above objective, the model is able to embed watermarks that are both robust and stealthy while preserving the original detection performance.

4.4. Watermark Extraction and Verification

In this section, watermark extraction and verification are performed on a suspected model to determine whether it contains the embedded copyright watermark, thereby enabling reliable ownership verification. The overall verification procedure consists of two stages: watermark extraction and watermark verification.

Watermark extraction: The objective of the watermark extraction stage is to recover the potentially embedded watermark bit sequence from the suspected model. Specifically, following the same parameter-indexing rules used during embedding (Section 4.3.2), candidate parameter pairs that may carry watermark information are extracted from the parameter set

θ^{'}

of the suspected model

F_{s}

. The extracted parameter pairs are denoted as

(w_{i_{t, r}}^{'}, w_{j_{t, r}}^{'}) \in θ^{'}

, where the indices

(i_{t, r}, j_{t, r})

are identical to those used during embedding. Accordingly, the watermark-related carrier parameter subset of

F_{s}

can be expressed as

W_{wm}^{'} = \{(w_{i_{t, r}}^{'}, w_{j_{t, r}}^{'}) \mid t = 1, \dots, B, \; r = 1, \dots, R\}

.

Subsequently, the watermark information is decoded using the same parameter-ranking-based encoding scheme. Specifically, the parameter difference for the r-th parameter pair corresponding to the t-th watermark bit is defined as

Δ_{t, r}^{'} = w_{i_{t, r}}^{'} - w_{j_{t, r}}^{'}

. Based on the sign of the parameter difference, a local decision bit is obtained as

{\hat{b}}_{t, r} = I [Δ_{t, r}^{'} \geq 0]

, where

I [\cdot]

denotes the indicator function. Since each

b_{t}

is embedded using

R

parameter pairs,

R

local decisions are obtained during decoding. These local decisions are aggregated to produce the final decoded watermark bits

\{{\hat{b}}_{t} \mid t = 1, \dots, B\}

.

Finally, the decoded watermark sequence of

F_{s}

is constructed as

{\hat{M}}_{w} = ({\hat{b}}_{1}, \dots, {\hat{b}}_{B}) \in \{0, 1\}^{B}

.
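
The extraction procedure can be sketched as follows. The text does not specify how the R local decisions are aggregated, so the majority vote used here (with ties resolved to 0) is an assumption, as are the toy weights and index pairs.

```python
def decode_watermark(weights, pairs):
    """Decode one bit per group of index pairs.

    pairs[t] holds the R index pairs (i, j) that were used to embed bit t.
    Each pair casts the local decision I[w_i - w_j >= 0]; the final bit is
    chosen by majority vote (assumed aggregation rule, ties resolve to 0).
    """
    decoded = []
    for bit_pairs in pairs:
        votes = sum(1 for i, j in bit_pairs if weights[i] - weights[j] >= 0)
        decoded.append(1 if 2 * votes > len(bit_pairs) else 0)
    return decoded

# Toy suspected-model parameters and R = 3 pairs for each of B = 2 bits.
suspect_weights = [0.3, -0.1, 0.0, 0.2, -0.4, 0.1]
pairs = [
    [(0, 1), (3, 4), (2, 5)],  # two of three differences are non-negative
    [(1, 0), (4, 3), (5, 2)],  # one of three differences is non-negative
]
decoded = decode_watermark(suspect_weights, pairs)
print(decoded)
```

Using several pairs per bit means that an individual pair whose ordering has been flipped by a perturbation does not necessarily flip the decoded bit.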

Watermark verification: After the decoded watermark

{\hat{M}}_{w}

is obtained from the suspected model

F_{s}

, it is compared with the original watermark

M_{w}

securely stored by the copyright owner to determine whether a valid watermark is present.

To quantitatively evaluate the success of watermark verification, we introduce the Watermark Success Rate (WSR), defined as:

WSR (M_{w}, {\hat{M}}_{w}) = \frac{1}{B} \sum_{t = 1}^{B} I [{\hat{b}}_{t} = b_{t}]

(10)

The WSR measures the bit-level consistency between the decoded and original watermarks and ranges from 0 to 100%. A higher WSR indicates a more reliable verification result, with WSR = 100% implying that the watermark is perfectly recovered.

In addition, in practical verification scenarios, model deployment may introduce perturbations that cause a small number of bit errors between the decoded and original watermarks. Therefore, a decision threshold

τ

is introduced. If WSR

\geq τ

, the

F_{s}

is determined to be a pirated model. Otherwise, it is regarded as a non-pirated model.
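
The WSR of Equation (10) and the threshold decision can be written compactly as below. The function names and example bit strings are hypothetical, and the default threshold of 0.75 follows the value chosen in Section 5.2.

```python
def watermark_success_rate(original_bits, decoded_bits):
    """Eq. (10): fraction of positions where decoded and original bits agree."""
    if len(original_bits) != len(decoded_bits):
        raise ValueError("watermarks must have the same length")
    matches = sum(1 for b, b_hat in zip(original_bits, decoded_bits) if b == b_hat)
    return matches / len(original_bits)

def is_pirated(original_bits, decoded_bits, tau=0.75):
    """Decision rule: the suspected model is flagged when WSR >= tau."""
    return watermark_success_rate(original_bits, decoded_bits) >= tau

original = [1, 0, 1, 1, 0, 0, 1, 0]
one_bit_error = [1, 0, 1, 1, 0, 0, 1, 1]  # a single flipped bit
unrelated = [0, 1, 0, 1, 0, 1, 0, 1]      # bits from an unrelated model

wsr_close = watermark_success_rate(original, one_bit_error)
wsr_far = watermark_success_rate(original, unrelated)
print(wsr_close, is_pirated(original, one_bit_error))
print(wsr_far, is_pirated(original, unrelated))
```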

5. Experiment and Result Analysis

5.1. Experimental Setup

Models and datasets: In this study, two representative object detection frameworks widely adopted in RSOD were utilized for experimental evaluation. Specifically, the two-stage detector Faster R-CNN [41], equipped with a ResNet50 backbone and the feature pyramid network (FPN) [35], was selected, alongside the one-stage detector YOLOv5 [42], which employs CSPDarknet53 as its feature extraction network. Furthermore, experiments were conducted on three benchmark RSOD datasets, namely, NWPU VHR-10 [43], RSOD24 [44], and LEVIR [45], which are commonly used for performance assessment in remote sensing scenarios.

Evaluation metrics: To quantitatively evaluate the effectiveness of the proposed approach, two evaluation criteria are adopted: mean Average Precision (mAP) and WSR. The mAP serves as an indicator of the model’s performance on the primary object detection task and is obtained by computing the mean of Average Precision (AP) values across all categories. AP provides a comprehensive assessment of detection performance by jointly considering precision and recall in object localization and classification. Higher AP values reflect superior detection capability, indicating improved accuracy and completeness in identifying target objects. The WSR is used to measure the accuracy of watermark extraction by evaluating the correspondence between the embedded and extracted watermark bits. A higher WSR indicates a greater degree of bit-level consistency, leading to more reliable copyright verification.

Training configuration and details: All experiments were conducted on a workstation running the Windows operating system, equipped with an Intel i7-14700KF CPU and an NVIDIA GeForce RTX 4090 GPU. The proposed watermark embedding framework was implemented using the PyTorch 2.9.0 deep learning library. During training, the model parameters were optimized following the standard training pipeline of the corresponding detection frameworks. In addition, the initial watermark carriers were selected from the localization branch of the detection head. Furthermore, the margin parameter used in the ranking-based watermark embedding constraint was set to

m = 0.01

.

For the attack-simulation-driven robust training strategy, the simulated fine-tuning attack was implemented by temporarily fine-tuning the model on the original training dataset for 150 epochs with a learning rate of 0.01. The simulated quantization attack was implemented by temporarily quantizing the model parameters to lower numerical precision (e.g., FP16 or INT8) and then dequantizing them back to floating-point representation before the forward pass.
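
To give a sense of scale for the simulated quantization attack, the sketch below rounds two hypothetical carrier weights to FP16 and back using Python's standard `struct` module. The weight values are assumptions, chosen so that their gap slightly exceeds the margin m = 0.01. This is only a per-value illustration, not the quantization pipeline used in the experiments.

```python
import struct

def fp16_round_trip(x):
    """Cast a float to IEEE 754 half precision and back (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

margin = 0.01
w_i, w_j = 0.0312, 0.0198  # hypothetical pair embedding b_t = 1

q_i, q_j = fp16_round_trip(w_i), fp16_round_trip(w_j)
gap_before = w_i - w_j
gap_after = q_i - q_j
print(gap_before, gap_after)
```

For weights of this magnitude, the FP16 rounding error is several orders of magnitude smaller than the margin, so a pair that satisfies the margin constraint keeps its ordering after the round trip. Coarser formats such as INT8 leave less headroom, depending on the quantization scale.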

Compared methods: For a thorough and fair evaluation, several representative state-of-the-art watermarking methods under the white-box setting were selected for comparison, including Chen [8], Uchida [27], Tyagi [46], RIGA [28], DeepIPR [23] and Zhang [24]. Notably, the methods [23,24,27,28,46] were originally designed for image classification tasks. For a fair evaluation in the context of RSOD, we reimplemented these approaches and adapted them to the RSOD domain for comparison.

5.2. Threshold Setting

The threshold

τ

is introduced to determine whether the extracted watermark can be reliably identified for copyright verification, where a watermark is considered successfully detected if the WSR is at least

τ

. To establish an appropriate value for

τ

, we design a comparative experiment that analyzes the watermark extraction behavior from both original models and watermarked models.

Specifically, we apply the watermark extraction algorithm to extract watermark information from the same indexed positions of the model weights, which serve as the designated watermark carrier as detailed in Section 4.2. For the original model, although no watermark is embedded, the extraction procedure is still applied to the corresponding weights and the resulting bit sequence is compared with the predefined watermark. In contrast, for the watermarked model, the same procedure is applied to the embedded weights, and the extracted bits are compared with the embedded watermark to obtain the corresponding WSR. Furthermore, to ensure statistical reliability, the experiment is conducted over multiple checkpoints. Starting from the fifth training epoch, watermark extraction is performed 100 consecutive times for the original model, resulting in 100 WSR measurements. The same procedure is applied to the watermarked model, also yielding 100 WSR values under identical conditions.

The experimental results are illustrated in Figure 2. The WSR values obtained from the original model fluctuate around 50%, which is consistent with the expected behavior of random bit matching. In contrast, the WSR values of the watermarked model remain consistently at 100%, demonstrating stable and accurate watermark extraction. This clear separation between the two distributions indicates a substantial margin for reliable threshold selection. Based on these observations, we set the threshold

τ = 75 %

in this work. This value effectively distinguishes watermarked models from original models, minimizing false positives while ensuring robust watermark verification.
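
The margin offered by this threshold can be checked with a short calculation. The sketch assumes that, for a non-watermarked model, each decoded bit matches the reference watermark independently with probability 0.5. This idealization is consistent with the roughly 50% WSR observed for original models, but it is not stated in the text. The 128-bit length is the default adopted in Section 5.7.

```python
import math

def false_positive_probability(num_bits, tau):
    """P(WSR >= tau) when every bit matches independently with probability 0.5."""
    k_min = math.ceil(tau * num_bits)  # smallest number of matching bits that passes
    favorable = sum(math.comb(num_bits, k) for k in range(k_min, num_bits + 1))
    return favorable / 2 ** num_bits

p_false_positive = false_positive_probability(128, 0.75)
p_at_half = false_positive_probability(128, 0.5)
print(p_false_positive, p_at_half)
```

Under this assumption, the chance that an unrelated model reaches a WSR of 75% on a 128-bit watermark is far below one in a million, whereas a threshold near 50% would be passed by chance more than half of the time.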

5.3. Fidelity Evaluation

The purpose of the fidelity evaluation is to examine whether the proposed watermarking scheme preserves the original detection performance of RSOD models after watermark embedding. To this end, we compare the detection accuracy of original models and their watermarked counterparts on three benchmark RSOD datasets, namely, NWPU VHR-10, RSOD24, and LEVIR, using two representative detectors: Faster R-CNN and YOLOv5. Experimental results presented in Table 1 demonstrate that watermark embedding has a negligible impact on detection performance across all model–dataset combinations. Specifically, for Faster R-CNN, the mAP changes are −0.07%, +0.08%, and −0.05% on NWPU VHR-10, RSOD24, and LEVIR, respectively. Similarly, for YOLOv5, the corresponding mAP variations are −0.02%, +0.03%, and −0.04%. These small fluctuations, all within ±0.1%, indicate that the proposed watermarking strategy does not meaningfully affect detection accuracy. In addition, as illustrated in Figure 3, we further visualize the inference results of the original model and the watermarked model on the same input samples, providing qualitative evidence that watermark embedding does not alter the detection behavior. Overall, the results demonstrate that the proposed method achieves high fidelity, successfully embedding watermarks while maintaining the original performance of RSOD models.

5.4. Effectiveness

In this section, we evaluate the effectiveness of the proposed watermarking method during training, with a particular focus on how quickly a reliable and verifiable watermark can be established. Comparative experiments are conducted on Faster R-CNN and YOLOv5 detectors across three RSOD benchmark datasets, namely, NWPU VHR-10, RSOD24, and LEVIR. Six representative white-box watermarking methods are selected as baselines, and the evolution of the WSR over training epochs is analyzed. As shown in Figure 4, although all compared methods eventually achieve a WSR of 100%, the proposed method consistently exhibits a significantly faster convergence behavior, reaching and maintaining the maximum WSR in earlier training epochs. In contrast, the baseline methods require more training iterations to gradually reach the same level of watermark detectability. Notably, the WSR of the proposed method exceeds the predefined threshold

τ

at an earlier stage of training, thereby enabling effective ownership verification at earlier epochs. This indicates that the proposed method is able to establish a reliable and verifiable watermark at an early stage of training, which is particularly desirable for practical RSOD model protection and ownership verification scenarios.

5.5. Stealthiness

In this subsection, we evaluate the stealthiness of the proposed watermarking method, i.e., whether watermark embedding introduces detectable artifacts in the model parameters that could facilitate watermark detection. To this end, we compare the parameter value distributions of watermark-bearing layers between the watermarked model and the corresponding original model. Specifically, for each method, we collect the weights from the layers used for watermark embedding and visualize their empirical distributions (frequency vs. parameter value). If watermark embedding causes abnormal parameter patterns, the distributions of the watermarked and clean models are expected to diverge; otherwise, they should remain highly overlapped.

The experimental results are shown in Figure 5, where Figure 5a–f correspond to Chen, Uchida, Tyagi, RIGA, DeepIPR and Zhang, respectively, and Figure 5g corresponds to the proposed method. As illustrated, the baseline methods exhibit varying degrees of distribution discrepancy between the watermarked and original models, indicating that watermark embedding may alter the weight statistics and potentially leave detectable signatures. In contrast, our method shows a high degree of overlap between the two distributions, suggesting that the statistical characteristics of the watermark-bearing parameters are well preserved. Overall, these observations demonstrate that the proposed stealthiness enhancement strategy effectively maintains distributional consistency between the watermarked parameters and those of the original model, thereby reducing detectable watermark traces and improving the imperceptibility of watermark embedding.

5.6. Robustness

In practice, a watermarked model may be subjected to common attack operations such as fine-tuning and quantization, which can undermine ownership verification. To examine the robustness of the proposed method under these conditions, we conduct experiments covering both types of attack.

5.6.1. Robustness Against Fine-Tuning

To evaluate robustness against fine-tuning attacks, we simulate a practical post-deployment scenario in which an adversary continues training a stolen (watermarked) RSOD model on the target task to adapt it to new data distributions or to intentionally weaken the embedded watermark. Specifically, we take the watermarked models produced by different methods and perform additional fine-tuning for 100 epochs while periodically extracting the watermark and recording the WSR throughout the fine-tuning process. Experiments are conducted on two representative detectors, Faster R-CNN and YOLOv5, across the considered RSOD datasets, and the WSR evolution during fine-tuning is summarized in Figure 6.

The results show that the proposed method consistently maintains a high WSR throughout the fine-tuning process, remaining stable at 100% across all settings, which indicates strong resistance to parameter variations introduced by continued optimization. In contrast, the baseline methods exhibit clear degradation trends: their WSR values gradually decrease as fine-tuning progresses, and the decline becomes more evident at later epochs, suggesting that their watermark representations are more sensitive to weight updates. Overall, these results demonstrate that the proposed watermarking strategy offers strong robustness against fine-tuning-based watermark removal, enabling reliable ownership verification even when the suspect model has undergone substantial post-training adaptation.

5.6.2. Robustness Against Quantization Attack

To evaluate robustness against quantization attacks, we apply post-training quantization to the watermarked RSOD models and examine whether the embedded watermark remains verifiable under reduced numerical precision. Specifically, three precision settings are considered: FP32, FP16, and INT8. Experiments are conducted on two representative detectors (Faster R-CNN and YOLOv5) and two RSOD benchmark datasets (NWPU VHR-10 and RSOD24). For each method, both mAP and WSR are reported after quantization, as summarized in Table 2. Overall, quantization causes only minor fluctuations in mAP for all methods, indicating that detection accuracy is largely preserved across different precision levels. However, the watermark verification performance shows clear differences: the baseline methods generally suffer noticeable WSR degradation under FP16 and INT8 quantization, suggesting that their watermark representations are sensitive to precision reduction. In contrast, the proposed method consistently maintains WSR = 100% across all tested detectors, datasets, and quantization settings. These results demonstrate that the proposed watermarking scheme is highly robust against quantization-induced numerical perturbations, enabling reliable ownership verification in practical deployment scenarios involving model compression.

5.7. Ablation Study on Watermark Length

To investigate the influence of watermark length on both detection performance and watermark verifiability, we conduct an ablation study on the NWPU VHR-10 dataset using two representative detectors, Faster R-CNN and YOLOv5, while keeping all other training and embedding settings unchanged. Specifically, the watermark length is set to 64, 128, 256, 512, and 1024 bits, and for each configuration we report the mAP together with the WSR, as shown in Figure 7. The results indicate that the proposed method consistently achieves WSR = 100% across all tested watermark lengths for both detectors, demonstrating that watermark verification remains reliable even when the watermark capacity increases. Meanwhile, the mAP values exhibit only minor fluctuations under different watermark lengths, suggesting that the proposed embedding scheme maintains high fidelity. Notably, the highest detection accuracy is obtained when the watermark length is set to 128 bits for both Faster R-CNN and YOLOv5. Therefore, we adopt 128-bit watermarks as the default setting in this paper, as this length provides the best trade-off between detection performance and watermark capacity while preserving perfect watermark verification.

6. Conclusions and Future Works

In this work, we proposed a robust and stealthy white-box watermarking framework for protecting the IP of RSOD models. By analyzing parameter sensitivity through gradient information, the proposed method adaptively selects watermark carriers that have minimal impact on the primary detection task. A margin-based parameter-ranking mechanism is then employed to encode watermark information by enforcing stable relative ordering relationships between paired parameters. To further enhance robustness under practical deployment conditions, an attack-simulation training strategy is introduced to improve resistance to common perturbations, while a distribution-alignment regularization term is incorporated to preserve the statistical characteristics of the original model and enhance watermark stealthiness. Extensive experimental results on multiple detectors and RSOD benchmark datasets demonstrate that the proposed method consistently achieves a WSR of 100% under the evaluation settings while introducing negligible degradation in detection performance and maintaining strong robustness and imperceptibility.

Future work will focus on improving the generality and security of the proposed framework. In particular, we will investigate its transferability and extend it to other vision tasks (e.g., semantic segmentation and change detection). Moreover, we will evaluate the method under more stringent threat models, including cross-architecture transfer and adaptive removal attacks, to further strengthen robustness in realistic adversarial scenarios. In addition, we will explore more advanced stealthiness evaluation strategies, such as classifier-based detection approaches and adversarial detection scenarios, to further examine whether the embedded watermark can be detected by sophisticated statistical or learning-based detectors. These directions are expected to advance the development of reliable and practically deployable model watermarking techniques.

Author Contributions

Conceptualization, L.Z. and X.X.; methodology, X.X. and L.Z.; validation, L.Z. and W.C.; writing—original draft preparation, L.Z. and X.X.; writing—review and editing, W.C. and D.W.; supervision, W.C.; funding acquisition, W.C. and Q.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (No. 42201444) and the Natural Science Foundation of Jiangsu Province (No. BK20240898).

Data Availability Statement

The data associated with this research are available online. The NWPU VHR-10 dataset is available at https://www.kaggle.com/datasets/kevin33824/nwpu-vhr-10 (accessed on 25 September 2025). The RSOD24 dataset is available at https://www.kaggle.com/datasets/kevin33824/rsod24 (accessed on 25 September 2025). The LEVIR dataset is available at https://aistudio.baidu.com/datasetdetail/53714 (accessed on 25 September 2025). The datasets are also cited in the paper.

Acknowledgments

During the preparation of this manuscript, the authors used GPT-5.2 and Gemini-3 for the purposes of language polishing and grammatical checking. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Deng, Z.; Sun, H.; Zhou, S.; Zhao, J.; Lei, L.; Zou, H. Multi-scale object detection in remote sensing imagery with convolutional neural networks. ISPRS J. Photogramm. Remote Sens. 2018, 145, 3–22. [Google Scholar] [CrossRef]
Zhang, G.; Lu, S.; Zhang, W. CAD-Net: A context-aware detection network for objects in remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2019, 57, 10015–10024. [Google Scholar] [CrossRef]
Yu, D.; Fang, C. Urban remote sensing with spatial big data: A review and renewed perspective of urban studies in recent decades. Remote Sens. 2023, 15, 1307. [Google Scholar] [CrossRef]
Pi, Y.; Nath, N.D.; Behzadan, A.H. Convolutional neural networks for object detection in aerial imagery for disaster response and recovery. Adv. Eng. Inform. 2020, 43, 101009. [Google Scholar] [CrossRef]
Zhang, X.; Zhang, T.; Wang, G.; Zhu, P.; Tang, X.; Jia, X.; Jiao, L. Remote sensing object detection meets deep learning: A metareview of challenges and advances. IEEE Geosci. Remote Sens. Mag. 2023, 11, 8–44. [Google Scholar] [CrossRef]
Ji, X.; Zhang, Y.; Wan, J.; Yang, T.; Yang, Q.; Liao, J. Ecological monitoring of invasive species through deep learning-based object detection. Ecol. Indic. 2025, 175, 113572. [Google Scholar] [CrossRef]
Gui, S.; Song, S.; Qin, R.; Tang, Y. Remote sensing object detection in the deep learning era—A review. Remote Sens. 2024, 16, 327. [Google Scholar] [CrossRef]
Chen, W.; Xu, X.; Ren, N.; Zhu, C.; Cai, J. Copyright verification and traceability for remote sensing object detection models via dual model watermarking. Remote Sens. 2025, 17, 481. [Google Scholar] [CrossRef]
Xue, M.; Zhang, Y.; Wang, J.; Liu, W. Intellectual property protection for deep learning models: Taxonomy, methods, attacks, and evaluations. IEEE Trans. Artif. Intell. 2021, 3, 908–923. [Google Scholar] [CrossRef]
Sun, Y.; Liu, T.; Hu, P.; Liao, Q.; Fu, S.; Yu, N.; Guo, D.; Liu, Y.; Liu, L. Deep intellectual property protection: A survey. arXiv 2023, arXiv:2304.14613. [Google Scholar] [CrossRef]
He, Y.; Meng, G.; Chen, K.; Hu, X.; He, J. Towards security threats of deep learning systems: A survey. IEEE Trans. Softw. Eng. 2020, 48, 1743–1770. [Google Scholar] [CrossRef]
Xu, X.; Wang, Z.; Chen, W.; Tang, W.; Ren, N.; Zhu, C. Sensitive object trigger-based fragile watermarking for integrity verification of remote sensing object detection models. Remote Sens. 2025, 17, 2379. [Google Scholar] [CrossRef]
Yang, P.; Lao, Y.; Li, P. Robust watermarking for deep neural networks via bi-level optimization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 14841–14850. [Google Scholar]
Sun, Y.; Liu, L.; Yu, N.; Liu, Y.; Tian, Q.; Guo, D. Deep Watermarking for Deep Intellectual Property Protection: A Comprehensive Survey. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4697020 (accessed on 11 August 2025).
Adi, Y.; Baum, C.; Cisse, M.; Pinkas, B.; Keshet, J. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In Proceedings of the 27th USENIX Security Symposium (USENIX Security 18), Baltimore, MD, USA, 15–17 August 2018; pp. 1615–1631. [Google Scholar]
Chen, H.; Rouhani, B.D.; Koushanfar, F. Blackmarks: Blackbox multibit watermarking for deep neural networks. arXiv 2019, arXiv:1904.00344. [Google Scholar] [CrossRef]
Zhang, J.; Gu, Z.; Jang, J.; Wu, H.; Stoecklin, M.P.; Huang, H.; Molloy, I. Protecting intellectual property of deep neural networks with watermarking. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, Incheon, Republic of Korea, 4 June 2018; pp. 159–172. [Google Scholar]
Li, Z.; Hu, C.; Zhang, Y.; Guo, S. How to prove your model belongs to you: A blind-watermark based framework to protect intellectual property of DNN. In Proceedings of the 35th Annual Computer Security Applications Conference, San Juan, PR, USA, 9–13 December 2019; pp. 126–137. [Google Scholar]
Chen, H.; Rohani, B.D.; Koushanfar, F. Deepmarks: A digital fingerprinting framework for deep neural networks. arXiv 2018, arXiv:1804.03648. [Google Scholar] [CrossRef]
Qiao, T.; Ma, Y.; Zheng, N.; Wu, H.; Chen, Y.; Xu, M.; Luo, X. A novel model watermarking for protecting generative adversarial network. Comput. Secur. 2023, 127, 103102. [Google Scholar] [CrossRef]
Li, Q.; Wang, X.; Ma, B.; Wang, X.; Wang, C.; Gao, S.; Shi, Y. Concealed attack for robust watermarking based on generative model and perceptual loss. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 5695–5706. [Google Scholar] [CrossRef]
Zhao, X.; Wu, H.; Zhang, X. Watermarking graph neural networks by random graphs. In Proceedings of the 2021 9th International Symposium on Digital Forensics and Security (ISDFS), Elazığ, Turkey, 28–29 June 2021; IEEE: New York, NY, USA, 2021; pp. 1–6. [Google Scholar]
Fan, L.; Ng, K.W.; Chan, C.S.; Yang, Q. Deepip: Deep neural network intellectual property protection with passports. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6122–6139. [Google Scholar] [CrossRef] [PubMed]
Zhang, J.; Chen, D.; Liao, J.; Zhang, W.; Hua, G.; Yu, N. Passport-aware normalization for deep model protection. Adv. Neural Inf. Process. Syst. 2020, 33, 22619–22628. [Google Scholar]
Darvish Rouhani, B.; Chen, H.; Koushanfar, F. Deepsigns: An end-to-end watermarking framework for ownership protection of deep neural networks. In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, Providence, RI, USA, 13–17 April 2019; pp. 485–497. [Google Scholar]
Lim, J.H.; Chan, C.S.; Ng, K.W.; Fan, L.; Yang, Q. Protect, show, attend and tell: Empowering image captioning models with ownership protection. Pattern Recognit. 2022, 122, 108285. [Google Scholar] [CrossRef]
Uchida, Y.; Nagai, Y.; Sakazawa, S.; Satoh, S. Embedding watermarks into deep neural networks. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, Bucharest, Romania, 6–9 June 2017; pp. 269–277. [Google Scholar]
Wang, T.; Kerschbaum, F. Riga: Covert and robust white-box watermarking of deep neural networks. In Proceedings of the Web Conference, Ljubljana, Slovenia, 19–23 April 2021; pp. 993–1004. [Google Scholar]
Liu, H.; Weng, Z.; Zhu, Y. Watermarking Deep Neural Networks with Greedy Residuals. In Proceedings of the International Conference on Machine Learning (ICML), Online, 18–24 July 2021; Volume 139, pp. 6978–6988. [Google Scholar]
Nicolazzo, S. Protecting Deep Neural Network Intellectual Property with Chaos-Based White-Box Watermarking. arXiv 2025, arXiv:2512.16658. [Google Scholar]
Xu, H.; Xiang, L.; Ma, X.; Yang, B.; Li, B. Hufu: A modality-agnositc watermarking system for pre-trained transformers via permutation equivariance. arXiv 2024, arXiv:2403.05842. [Google Scholar]
Ding, N.; Qin, Y.; Yang, G.; Wei, F.; Yang, Z.; Su, Y.; Hu, S.; Chen, Y.; Chan, C.M.; Chen, W.; et al. Parameter-efficient fine-tuning of large-scale pre-trained language models. Nat. Mach. Intell. 2023, 5, 220–235. [Google Scholar] [CrossRef]
Lopuhaä-Zwakenberg, M.; Budde, C.E.; Stoelinga, M. Efficient and generic algorithms for quantitative attack tree analysis. IEEE Trans. Dependable Secur. Comput. 2022, 20, 4169–4187. [Google Scholar] [CrossRef]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; IEEE: New York, NY, USA, 2016; pp. 770–778. [Google Scholar]
Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
Shi, P.; He, Q.; Zhu, S.; Li, X.; Fan, X.; Xin, Y. Multi-scale fusion and efficient feature extraction for enhanced sonar image object detection. Expert Syst. Appl. 2024, 256, 124958. [Google Scholar] [CrossRef]
Dai, X.; Chen, Y.; Xiao, B.; Chen, D.; Liu, M.; Yuan, L.; Zhang, L. Dynamic head: Unifying object detection heads with attentions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 7373–7382. [Google Scholar]
Filters’Importance, D. Pruning filters for efficient convnets. arXiv 2016, arXiv:1608.08710. [Google Scholar]
Martino, R.; Cilardo, A. Designing a SHA-256 processor for blockchain-based IoT applications. Internet Things 2020, 11, 100254.
Lee, S.; Kim, H.; Lee, J. Graddiv: Adversarial robustness of randomized neural networks via gradient diversity regularization. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 2645–2651.
Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
Jocher, G. Ultralytics YOLOv5, Version 7.0. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 25 September 2025).
Wang, C.; Bai, X.; Wang, S.; Zhou, J.; Ren, P. Multiscale visual attention networks for object detection in VHR remote sensing images. IEEE Geosci. Remote Sens. Lett. 2018, 16, 310–314.
Long, Y.; Gong, Y.; Xiao, Z.; Liu, Q. Accurate object localization in remote sensing images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 2486–2498.
Zou, Z.; Shi, Z. Random access memories: A new paradigm for target detection in high resolution aerial remote sensing images. IEEE Trans. Image Process. 2017, 27, 1100–1111.
Tyagi, T.; Singh, K.N.; Singh, A.K.; Gupta, B.B. Deepverifier: Robust watermarking of deep neural networks based on black-box and white-box reasoning. IEEE Trans. Comput. Soc. Syst. 2025.

Figure 1. Overview of the proposed watermarking method. The method consists of three main stages: (a) watermark carrier analysis and selection, (b) watermark embedding and optimization, and (c) watermark extraction and verification.

Figure 2. WSR comparison for threshold determination.

Figure 3. Visualization of detection results for the original and watermarked models. (a) Inference results produced by the original model. (b) Inference results produced by the watermarked model on the same input samples.

Figure 4. Effectiveness comparison in terms of WSR over training epochs. (a–c) Results of Faster R-CNN on NWPU VHR-10, RSOD24, and LEVIR, respectively. (d–f) Results of YOLOv5 on NWPU VHR-10, RSOD24, and LEVIR, respectively.

Figure 5. Stealthiness evaluation via parameter distribution comparison between watermarked and original models. (a) Chen, (b) Uchida, (c) Tyagi, (d) RIGA, (e) DeepIPR, (f) Zhang and (g) ours.

Figure 6. Robustness against fine-tuning attack. (a–c) correspond to Faster R-CNN evaluated on NWPU VHR-10, RSOD24, and LEVIR, respectively. (d–f) correspond to YOLOv5 evaluated on NWPU VHR-10, RSOD24, and LEVIR, respectively.

Figure 7. Effect of watermark length on detection performance and watermark verification. (a) Faster R-CNN. (b) YOLOv5.

Table 1. Impact of the proposed method on primary task detection accuracy.

Model Configuration	Model	NWPU VHR-10	RSOD24	LEVIR
Original model	Faster R-CNN	81.69	86.13	73.71
Original model	YOLOv5	93.78	97.25	95.69
Watermarked model	Faster R-CNN	81.62	86.21	73.66
Watermarked model	YOLOv5	93.67	97.28	95.65
Accuracy change (%)	Faster R-CNN	$-0.07$	$+0.08$	$-0.05$
Accuracy change (%)	YOLOv5	$-0.02$	$+0.03$	$-0.04$

Table 2. Quantization robustness evaluation under different precision settings.

Method	Quantization	Faster R-CNN				YOLOv5
		NWPU VHR-10		RSOD24		NWPU VHR-10		RSOD24
		mAP (%)	WSR (%)	mAP (%)	WSR (%)	mAP (%)	WSR (%)	mAP (%)	WSR (%)
Chen	fp32	81.15	99.92	85.73	100.00	93.17	98.31	97.32	99.96
	fp16	80.94	87.21	84.12	90.26	92.68	89.42	96.88	89.31
	int8	80.14	88.41	84.36	89.34	92.94	87.15	97.05	87.91
Uchida	fp32	80.92	99.26	84.79	99.52	93.05	99.27	97.18	100.00
	fp16	79.62	76.21	85.03	84.69	93.17	80.91	97.32	81.52
	int8	80.18	68.15	85.27	82.19	93.28	74.28	97.46	78.92
Tyagi	fp32	81.13	100.00	85.51	99.62	93.41	100.00	97.59	99.62
	fp16	80.92	95.16	85.74	94.16	93.56	92.85	97.74	96.25
	int8	80.04	92.17	85.89	92.15	93.72	89.27	97.91	94.82
RIGA	fp32	81.02	98.41	85.28	98.93	93.26	97.88	97.43	98.27
	fp16	80.86	89.26	85.11	90.37	93.18	88.41	97.24	89.63
	int8	80.33	85.37	85.36	87.12	93.31	84.96	97.52	86.45
DeepIPR	fp32	80.94	97.63	85.17	98.18	93.12	96.94	97.35	97.42
	fp16	80.72	84.91	84.96	86.73	93.05	83.68	97.18	85.14
	int8	80.21	79.48	85.24	82.06	93.22	78.22	97.41	80.47
Zhang	fp32	81.11	98.72	85.41	99.16	93.34	98.34	97.58	98.74
	fp16	80.98	91.84	85.22	92.47	93.27	91.12	97.39	92.06
	int8	80.41	88.16	85.47	89.54	93.39	87.83	97.67	89.21
Ours	fp32	81.69	100.00	85.62	100.00	93.88	100.00	98.03	100.00
	fp16	81.55	100.00	84.95	100.00	93.63	100.00	97.68	100.00
	int8	82.01	100.00	85.18	100.00	93.09	100.00	97.24	100.00
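
The quantization results in Table 2 can be read against how an ordering-based encoding behaves under rounding. The following is a minimal sketch under stated assumptions, not the authors' implementation: it encodes each bit as the relative order of one pair of parameters, then checks the extracted bits after an fp16 cast and after a simulated symmetric int8 round trip. The function names (`embed_by_ranking`, `extract_bits`, `fake_int8`, `bit_match_rate`), the disjoint random pairs, the margin of 0.01, and the use of a plain bit-match percentage as a stand-in for WSR are all hypothetical choices made for illustration.

```python
import numpy as np

def embed_by_ranking(weights, pairs, bits, margin):
    """Enforce w[i] - w[j] >= margin for bit 1, and w[j] - w[i] >= margin for bit 0."""
    w = weights.copy()
    for (i, j), b in zip(pairs, bits):
        hi, lo = (i, j) if b == 1 else (j, i)
        gap = w[hi] - w[lo]
        if gap < margin:
            # Move both parameters symmetrically so the pair mean is unchanged.
            shift = (margin - gap) / 2.0
            w[hi] += shift
            w[lo] -= shift
    return w

def extract_bits(weights, pairs):
    """Read each bit back from the relative order of its parameter pair."""
    return np.array([1 if weights[i] > weights[j] else 0 for i, j in pairs])

def fake_int8(weights):
    """Simulated symmetric per-tensor int8 quantization (quantize, then dequantize)."""
    scale = np.abs(weights).max() / 127.0
    return np.clip(np.round(weights / scale), -127, 127) * scale

def bit_match_rate(weights, pairs, bits):
    """Percentage of extracted bits that match the embedded bits (assumed proxy for WSR)."""
    return 100.0 * np.mean(extract_bits(weights, pairs) == bits)

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=512).astype(np.float32)  # toy carrier parameters
pairs = rng.permutation(512)[:128].reshape(64, 2)             # 64 disjoint index pairs
bits = rng.integers(0, 2, size=64)                            # toy 64-bit watermark

watermarked = embed_by_ranking(weights, pairs, bits, margin=0.01)

for name, w in [("fp32", watermarked),
                ("fp16", watermarked.astype(np.float16)),
                ("int8", fake_int8(watermarked))]:
    print(name, bit_match_rate(w, pairs, bits))
```

In this toy setup all three checks report a 100% match. Rounding to a coarser grid is monotone, so an ordering can only be lost when both values of a pair collapse onto the same quantization level; keeping the margin larger than the quantization step rules that out here. Real detectors, per-channel quantization schemes, and the paper's actual carrier selection and training losses are outside the scope of this sketch.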

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
