1. Introduction
With the expansion of the blue economy, robust ocean-monitoring systems are essential for ecosystem protection and scientific research [1,2]. As core components of ocean observation networks, moored buoys provide continuous long-term measurements across diverse sea states for environmental monitoring, early warning, and climate observation [3]. Their spatial configuration determines network representativeness, coverage efficiency, and cost, making rational buoy siting under dynamic marine conditions both theoretically and practically significant [4,5,6].
Buoy siting optimization aims to determine the optimal spatial deployment of a limited number of buoy stations in a target sea area, taking into account spatiotemporal environmental variability and resource constraints. Its ultimate objective is to maximize the overall effectiveness of the monitoring network. Existing studies fall into three main categories: rule-based expert approaches, mathematical modeling-based spatial optimization, and semantic modeling with logic-based reasoning.
Rule-based expert approaches use operational knowledge, standard procedures, and historical experience to guide buoy siting. Ray et al. [
7] integrated expert rules with geospatial data using the Drools engine for maritime-traffic monitoring. Nagano et al. [
8] applied an expert framework for buoy deployment in the Philippine Sea based on ENSO-zone climate and TAO/TRITON practices. In the Bohai Sea, Han and Chen [
9] combined empirical rules with kernel-density analysis of historical deployments to identify high-frequency candidate zones with GIS support.
Mathematical modeling-based spatial optimization formulates buoy siting as multi-objective problems—minimizing cost and maximizing coverage or uniformity—solved by heuristic algorithms such as Particle Swarm Optimization (PSO), simulated annealing, and multi-objective evolutionary algorithms (MOEAs) [
10,
11,
12,
13]. MOEAs have become a mature foundation for complex observation-network optimization [
14]. Early work by Xu et al. [
15] introduced a GA-based location–allocation model that jointly optimized site and distribution costs. For emergency monitoring, Liang et al. [
3] employed a fuzzy analytic hierarchy process to site rescue bases along the Yangtze River. Kim et al. [
16] developed a sensitivity-informed additional-buoy strategy to improve network efficiency. Yakovlev et al. [
17] focused on the theoretical formulation of continuous maximum-coverage models, establishing a nonlinear maximum-coverage model via computational geometry and local search. Heyns [
18] designed a multi-observation-point siting method for large-scale deployment by combining dimensionality reduction, balancing quality and cost. Ferrolino et al. [
19] formulated a PSO model coupling nonlinear shallow-water equations with heuristic optimization for tsunami-sensor deployment. In polar applications, Kim et al. [
20] optimized an Arctic buoy network using ensemble-sensitivity analysis with an Ensemble Square Root Filter to reduce forecast-error variance.
Semantic modeling with logic-based reasoning, grounded in Semantic Web standards and ontology languages, enables structured knowledge representation and automated reasoning under complex constraints [
21]. It has been applied in urban planning, environmental monitoring, and emergency management, and is expanding toward knowledge management and intelligent decision making in marine systems [
22,
23]. Wen et al. [
24] developed a multi-level ontology for geological hazards to support structured emergency response, while Tambass [
25] and Liu [
26] established unified ontologies and knowledge graphs for hospital and urban facility siting, improving the expressiveness of siting logic. Tang et al. [
27] and Gayathr and Kannan [
28] embedded ontological structures in knowledge retrieval and medical document semantics, respectively, addressing weak matching and ambiguity in traditional methods. In the marine domain, Zhang et al. [
29] built a spatio-temporal ocean-circulation ontology integrating multiscale data for consistent reasoning; Velu and Thangavelu [
30] designed a marine semantic model to enhance information acquisition; for coastal monitoring, Peng et al. [
31] constructed a semantic framework linking monitoring objectives, capabilities, and constraints for coastal-station optimization; and Ren et al. [
32] developed the Ocean Environmental Data Ontology and Quick Service Query List to integrate heterogeneous datasets and accelerate marine data services.
Despite these advances, several limitations remain under complex marine conditions. First, buoy deployment involves multiple interacting constraints (monitoring objectives, specifications, historical practice, and geological context), and conventional ontologies struggle to model such complex task chains in a unified and systematic way [
33,
34]. Second, existing ontology-based reasoning frameworks are mainly symbolic and lack spatially explicit logic and regional differentiation; even hybrid ontologies such as CIDOC CRMgeo, GeoSPARQL, and OWL-Time cannot capture the spatial constraints and relational rules needed for buoy siting [
35]. Third, current semantic reasoning models lack mechanisms to represent environmental dynamics or integrate feedback from changing marine conditions, leaving reasoning static and non-adaptive to evolving hydrological and meteorological factors [
36,
37,
38].
To address fragmented knowledge representation, limited rule expressiveness, and insufficient environmental adaptivity, this study proposes an adaptive buoy-siting framework that integrates ontology-based reasoning with numerical computation. A buoy deployment scheme is considered reasonable if it achieves sufficient coverage of high-variation oceanographic regions, maintains good spatial uniformity with appropriate inter-station spacing, and satisfies basic environmental and engineering constraints. To meet these criteria, the proposed framework establishes an intelligent reasoning architecture that couples semantic representation with a dynamic marine-environment response mechanism. A knowledge graph represents monitoring objectives and spatial–environmental information, while rule-based reasoning links monitoring variables to buoy configurations. The spatio-temporal comprehensive variation index (STCVI), derived from oceanographic data, supports dynamic environmental perception and candidate-site filtering. For layout optimization, a coverage-first greedy algorithm (CFGA) is developed that iteratively selects sites under distance constraints to balance coverage and uniformity. Accordingly, the proposed framework represents the first deep integration of ontology-driven siting rules and numerical marine modeling, bridging the methodological gap toward broader empirical and cross-domain applications of semantic reasoning in complex spatial layout decisions. The existing buoy network in the Beibu Gulf is used as the reference baseline, and the optimized layout is quantitatively evaluated using Voronoi-area variance, nearest-neighbor distance, hotspot coverage ratio, and cumulative monitoring value. The experimental results show that the proposed method improves hotspot coverage and spatial uniformity relative to the existing buoy network, meeting the stated criteria for a reasonable deployment scheme.
2. An Integrated Framework for Buoy Site Selection: Leveraging Ontology Reasoning and Numerical Computation
The technological framework of the proposed adaptive optimization method integrates ontology reasoning and the numerical computation of marine environmental factors. As shown in
Figure 1, the framework consists of four modules: the knowledge modeling module, the numerical computation module, the logical reasoning module, and the optimal scheme generation module. These modules work in concert to enable the intelligent optimization of moored buoy site selection.
In the knowledge modeling module, a moored buoy site selection ontology (MBSSO) is constructed through concept extraction and property definition, to establish a semantic network supporting reasoning for buoy siting. Centered on the logical chain of “monitoring objective–site selection–buoy configuration”, core conceptual classes such as Target Sea Area, Buoy, Buoy Specification, and Functional Zoning are hierarchically defined (as detailed in
Section 3.1), along with their corresponding object and data properties (as detailed in
Section 3.2). Existing buoys and their spatial coordinates are also integrated, forming a site selection knowledge base that provides prerequisites for distance constraints, restricted-area avoidance, and coverage initialization. The built ontology is exported as RDF triples to ensure semantic consistency and to act as a unified medium for storing numerical results and invoking rules. This RDF triple set, together with the embedded monitoring objectives and existing buoy instances, serves as the input knowledge base for the subsequent numerical computation and logical reasoning modules.
In the numerical computation module, key environmental parameters, such as temperature, salinity, and current velocity, are selected according to the monitoring requirements specified by the monitoring objectives. The temporal and spatial variation indices are calculated in turn, and after normalization and weighted fusion, the STCVI is generated (as described in
Section 5.3). Based on this index, a hierarchical threshold filtering combined with a minimum-distance constraint is applied to identify a set of representative candidate sites (as detailed in
Section 5.4). Subsequently, the candidate sites are re-evaluated to eliminate those that fail to meet deployment requirements. As a result, an effective candidate set is obtained that not only reflects high environmental sensitivity in the numerical sense but also satisfies deployment feasibility. Using the validated candidate set as input, the coverage-first greedy algorithm (CFGA) (as described in
Section 6) is applied to determine a buoy configuration that maximizes the marginal coverage gain in environmentally sensitive regions. In the CFGA, a minimum-distance constraint and a redundancy penalty are jointly applied to suppress spatial clustering, thereby avoiding localized aggregation and maintaining overall spatial uniformity. Through iterative optimization, the CFGA produces an optimized set of newly added buoy sites, completing the transition from the candidate set to the final layout configuration.
In the logical reasoning module, the optimized buoy locations and candidate instances from the numerical computation module, together with the class and property definitions provided by the ontology, serve as inputs to the reasoning rule system, which progressively enriches the key configuration attributes of each buoy. Five kinds of reasoning rules are built: the Monitoring Element-Sensor Rule, Basic Feasibility Rule, Restricted Zone Avoidance Rule, Communication Method Rule, and Buoy Specification-Cost Rule. Within the reasoning workflow, the Monitoring Element-Sensor Rule (a) is first applied to recommend the minimal subset of monitoring elements and their corresponding sensor configurations for each new site. Subsequently, the Communication Method Rule (d) is invoked to infer the suitable communication options. Finally, the Buoy Specification-Cost Rule (e) provides recommendations for buoy diameter, construction cost, and expected service life. Through this series of reasoning steps, each newly added site evolves from a simple spatial coordinate into an entity with comprehensive attributes, which is then aggregated into the final deployment scheme.
Section 4 details the logical reasoning rules.
In the optimal scheme generation module, the enriched buoy instances produced by the logical reasoning module are aggregated to form the optimal layout scheme, with all newly added buoys linked to their corresponding target sea areas through the adding relationship, and the final scheme is stored in RDF format within the GraphDB (Ontotext Ltd., Sofia, Bulgaria) database. Leveraging the querying and visualization capabilities of GraphDB, users can retrieve detailed scheme information via SPARQL queries and intuitively visualize buoy distributions and configuration attributes within a graphical knowledge representation. This framework of buoy site selection also establishes a solid foundation for subsequent comparative analysis and application extension of optimized buoy deployment schemes.
4. Design of the Ontology Reasoning Rules
To enable ontology-based reasoning of moored buoy site selection, a multi-level logical reasoning rule module was constructed in accordance with the
Technical Guidelines for Demonstration of Buoy Station Site Selection (
HY/T 0357-2023) [
39] and actual engineering requirements. Five kinds of reasoning rules are defined, including Monitoring Element-Sensor Rule, Basic Feasibility Rule, Restricted Zone Avoidance Rule, Communication Method Rule, and Buoy Specification-Cost Rule. The specific definitions of the five reasoning rules are elaborated as follows.
Based on the predefined monitoring objectives and requirements, the
Monitoring Element-Sensor Rule identifies the environmental elements used to calculate the spatiotemporal variation rates in the numerical computation module and maps the key monitoring elements to the sensors to be installed on the buoy, thereby coupling monitoring tasks, environmental response, and buoy equipment configuration. The detailed reasoning workflow of the Monitoring Element-Sensor Rule is illustrated in
Figure 6a.
The Basic Feasibility Rule, illustrated in Figure 6b, evaluates whether each candidate sea area satisfies the fundamental deployment conditions by considering factors such as sea-area extent, average slope, seabed type, and water depth, thereby eliminating regions that are clearly unsuitable for buoy installation. Using a buffer distance of max(3 × water depth, 1 km), the Restricted Zone Avoidance Rule, illustrated in Figure 6c, further checks whether a candidate buoy position conflicts with sensitive zones such as anchorages, navigation channels, or submarine pipelines, ensuring spatial safety and avoiding spatial conflicts at a potential buoy position.
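The buffer constraint of the Restricted Zone Avoidance Rule can be illustrated with a minimal Python sketch. The function names are hypothetical, water depth and distances are assumed to be in metres, and the distance from the candidate to the nearest restricted zone is assumed to be precomputed; treating a point exactly on the buffer boundary as acceptable is also an assumption.

```python
def restricted_zone_buffer_m(water_depth_m: float) -> float:
    """Buffer distance in metres: max(3 x water depth, 1 km)."""
    return max(3.0 * water_depth_m, 1000.0)


def violates_restricted_zone(dist_to_zone_m: float, water_depth_m: float) -> bool:
    """True if the candidate lies inside the buffer around a restricted zone.

    dist_to_zone_m is assumed to be the precomputed distance to the nearest
    anchorage, navigation channel, or submarine pipeline.
    """
    return dist_to_zone_m < restricted_zone_buffer_m(water_depth_m)
```

In shallow water the 1 km floor dominates (a 50 m depth gives a 1000 m buffer), whereas at 500 m depth the buffer grows to 1500 m.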
The Communication Method Rule assigns the most appropriate communication mode to a buoy according to the type of target sea area in which it is located (offshore, far sea, or deep sea). As illustrated in Figure 6d, GPRS, GSM, LoRa, microwave, and CDMA are used for offshore areas; the BeiDou satellite system (Beijing, China) is used for the far sea; and INMARSAT (London, UK) and Iridium (McLean, VA, USA) satellites are used for the deep sea. By combining deployment depth and sensor quantity, the Buoy Specification-Cost Rule, illustrated in Figure 6e, infers a suitable buoy diameter and maps it to construction cost, expected service life, and material selection, in order to balance the structural safety, cost efficiency, and functional adaptability of buoys.
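As an illustration of the Communication Method Rule, the mapping from sea-area type to communication options can be written as a simple lookup. This is only a sketch: the dictionary keys (`offshore`, `far_sea`, `deep_sea`) and the function name are hypothetical labels, and in the framework itself the rule is expressed over ontology classes rather than Python strings.

```python
# Options per sea-area type, as listed for the Communication Method Rule.
# The key names are illustrative labels, not ontology identifiers.
COMM_OPTIONS = {
    "offshore": ["GPRS", "GSM", "LoRa", "microwave", "CDMA"],
    "far_sea": ["BeiDou"],
    "deep_sea": ["INMARSAT", "Iridium"],
}


def recommend_communication(sea_area_type: str) -> list:
    """Return the candidate communication modes for a sea-area type."""
    try:
        return list(COMM_OPTIONS[sea_area_type])
    except KeyError:
        raise ValueError(f"unknown sea-area type: {sea_area_type!r}")
```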
In practice, these five kinds of reasoning rules for moored buoy site selection are integrated into the numerical computation module and the logical reasoning module to progressively infer the semantic attributes of candidate buoys, thereby achieving a transition from numerical computation to ontology-based configuration.
5. Numerical Computation for Generating Candidate Buoy Sites
This section develops a spatiotemporal-variation computational framework for multiple marine environmental factors to quantify the environmental responses of target sea areas. The framework calculates temporal and spatial variation indices for major oceanic variables to reveal the dynamic sensitivity of the region. Through normalization and weighted integration, the STCVI is generated, thereby achieving the structured mapping of environmental numerical features into the semantic reasoning module and providing numerical support for subsequent ontology reasoning and buoy layout optimization.
5.1. Temporal Variation Index
The temporal variation index quantifies the fluctuation amplitude of an oceanic element at a specific location during a time period. Taking temperature as an example, let $t = 1, 2, \dots, T$ denote the observation time steps, where $T$ is the length of the temporal dimension, and let $(x, y)$ represent the two-dimensional spatial coordinates. The temporal variation index of temperature at location $(x, y)$ is defined as the standard deviation of its time series:
$$V_{\mathrm{t}}(x, y) = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( X_t(x, y) - \bar{X}(x, y) \right)^2}, \qquad \bar{X}(x, y) = \frac{1}{T} \sum_{t=1}^{T} X_t(x, y) \qquad (1)$$
where $V_{\mathrm{t}}(x, y)$ represents the temporal variation index of the variable at position $(x, y)$, and $X_t(x, y)$ is the observed value at time $t$. This index can be computed independently for each grid point, allowing for scalable and parallel processing. A larger value of $V_{\mathrm{t}}(x, y)$ indicates stronger dynamic fluctuations at location $(x, y)$ during the time period.
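A minimal pure-Python sketch of the temporal variation index follows. It assumes the data are held as a nested list `cube[t][i][j]` (time, row, column) without missing values, and it uses the population form of the standard deviation (division by the number of time steps); both the data layout and the function names are illustrative assumptions.

```python
import math


def temporal_variation_index(series):
    """Population standard deviation of one grid point's time series."""
    n_steps = len(series)
    mean = sum(series) / n_steps
    return math.sqrt(sum((v - mean) ** 2 for v in series) / n_steps)


def temporal_variation_field(cube):
    """Apply the index to every grid cell of cube[t][i][j]."""
    n_steps, rows, cols = len(cube), len(cube[0]), len(cube[0][0])
    return [
        [
            temporal_variation_index([cube[t][i][j] for t in range(n_steps)])
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

Because each cell is processed independently, the outer loops can be parallelized or vectorized without changing the result.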
5.2. Spatial Variation Index
The spatial variation index is used to characterize the spatial heterogeneity of an oceanic environmental element at a given moment by describing the deviation between a location and its neighboring region. The study sea area is discretized into a regular grid with a certain resolution. The value of a specific oceanic environmental element at time $t$ is represented as $X_t(x, y)$, where $(x, y)$ refers to the grid index. A neighborhood region $N(x, y)$ centered at location $(x, y)$ is constructed, such as a 3 × 3 grid window containing the eight surrounding points and the central point itself. For this neighborhood, the standard deviation of the variable values is calculated to represent the spatial variation, defined as
$$V_{\mathrm{s}}(x, y, t) = \sqrt{\frac{1}{|N(x, y)|} \sum_{(x', y') \in N(x, y)} \left( X_t(x', y') - \bar{X}_t(x, y) \right)^2}, \qquad \bar{X}_t(x, y) = \frac{1}{|N(x, y)|} \sum_{(x', y') \in N(x, y)} X_t(x', y') \qquad (2)$$
where $V_{\mathrm{s}}(x, y, t)$ represents the variation index in the spatial dimension (denoted by $\mathrm{s}$) at location $(x, y)$ and time $t$; $N(x, y)$ denotes all grid points within the neighborhood region of the location $(x, y)$; and $X_t(x', y')$ denotes the value of the element at grid index $(x', y')$ and time $t$. Subsequently, for the entire observation period of length $n$, the overall spatial fluctuation intensity at location $(x, y)$ is obtained by averaging all of its spatial variation indices, expressed as
$$\bar{V}_{\mathrm{s}}(x, y) = \frac{1}{n} \sum_{t=1}^{n} V_{\mathrm{s}}(x, y, t) \qquad (3)$$
The value of $\bar{V}_{\mathrm{s}}(x, y)$ reflects the average spatial variation intensity over time for a given grid cell within the study area. A higher value indicates stronger spatial variability of the marine environment at that location.
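The neighborhood computation can be sketched in pure Python as follows. The sketch assumes a 3 × 3 window, clips the window at the grid boundary (so edge cells use fewer neighbors), and ignores land masking; these boundary choices, the population form of the standard deviation, and the function names are assumptions made for illustration.

```python
import math


def neighborhood_std(field, i, j):
    """Standard deviation over the 3x3 window centred at (i, j).

    The window is clipped at the grid edges, which is an assumption.
    """
    rows, cols = len(field), len(field[0])
    vals = [
        field[p][q]
        for p in range(max(0, i - 1), min(rows, i + 2))
        for q in range(max(0, j - 1), min(cols, j + 2))
    ]
    mean = sum(vals) / len(vals)
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))


def spatial_variation_field(cube):
    """Average the per-time-step neighbourhood std over all time steps."""
    n_steps, rows, cols = len(cube), len(cube[0]), len(cube[0][0])
    return [
        [
            sum(neighborhood_std(cube[t], i, j) for t in range(n_steps)) / n_steps
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```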
5.3. Spatiotemporal Comprehensive Variation Index
To capture the comprehensive variability of the marine environment in a sea area, the STCVI is developed by integrating the aforementioned temporal and spatial variation indices.
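Firstly, to eliminate scale bias among environmental variables, the temporal and spatial variation indices are normalized independently for each variable to the interval [0, 1], as expressed by
$$\tilde{V}_{\mathrm{t},k}(x, y) = \frac{V_{\mathrm{t},k}(x, y) - \min V_{\mathrm{t},k}}{\max V_{\mathrm{t},k} - \min V_{\mathrm{t},k}} \qquad (4)$$
$$\tilde{V}_{\mathrm{s},k}(x, y) = \frac{\bar{V}_{\mathrm{s},k}(x, y) - \min \bar{V}_{\mathrm{s},k}}{\max \bar{V}_{\mathrm{s},k} - \min \bar{V}_{\mathrm{s},k}} \qquad (5)$$
where $\tilde{V}_{\mathrm{t},k}(x, y)$ and $\tilde{V}_{\mathrm{s},k}(x, y)$ denote the normalized temporal and spatial variation indices of the $k$-th environmental variable at location $(x, y)$, respectively, and the minima and maxima are taken over all grid cells of the study area.
After normalization, a linear weighted model is employed to integrate the variation information from all environmental factors. This model yields a composite variability index $W(x, y)$, which is defined in Equation (6).
$$W(x, y) = \sum_{k} \omega_k \left[ \alpha \, \tilde{V}_{\mathrm{t},k}(x, y) + \beta \, \tilde{V}_{\mathrm{s},k}(x, y) \right] \qquad (6)$$
Here, $k$ indexes the marine environmental factors, such as temperature, salinity, and current velocity; $\omega_k$ represents the weight assigned to each factor; and $\alpha$ and $\beta$ denote the weights of the temporal and spatial variation indices, respectively.
The STCVI $W(x, y)$ serves as a normalized composite variability index. Larger values indicate locations with stronger dynamic fluctuations and higher spatiotemporal variability, reflecting greater environmental complexity and higher information representativeness.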
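The normalization and weighted fusion can be sketched as below. The sketch assumes min-max scaling for the normalization to [0, 1], a flat list of grid-cell values per factor, and a linear combination with one weight per factor plus two weights for the temporal and spatial parts; the function names, the default weights of 0.5, and the data layout are illustrative assumptions rather than the settings used in the experiments.

```python
def min_max(values):
    """Scale a list of values to [0, 1]; a constant field maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def stcvi(temporal, spatial, factor_weights, alpha=0.5, beta=0.5):
    """Composite index per grid cell.

    temporal / spatial: dict mapping factor name -> list of index values,
    one entry per grid cell. factor_weights: dict factor name -> weight.
    """
    n_cells = len(next(iter(temporal.values())))
    composite = [0.0] * n_cells
    for name, weight in factor_weights.items():
        t_norm = min_max(temporal[name])
        s_norm = min_max(spatial[name])
        for c in range(n_cells):
            composite[c] += weight * (alpha * t_norm[c] + beta * s_norm[c])
    return composite
```

If the factor weights sum to one and `alpha + beta = 1`, the composite value stays within [0, 1].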
5.4. Filtering Buoy Candidate Sites Based on the STCVI
Figure 7 illustrates the overall workflow for filtering potential buoy station locations, which is achieved by applying the STCVI and reasoning of the Basic Feasibility Rule and Restricted Zone Avoidance Rule (see
Section 4).
To ensure spatial uniformity while providing sufficient high-sensitivity candidates for subsequent greedy site selection, a multi-quantile threshold strategy is adopted. First, four quantile values of the STCVI—70%, 60%, 50%, and 40%—are calculated, based on which four nested threshold regions R70, R60, R50, R40 are defined, satisfying R70 ⊂ R60 ⊂ R50 ⊂ R40. A lower threshold corresponds to a larger spatial coverage.
Within each region R, all grid cells are scanned in descending order of STCVI to ensure the priority selection of high-sensitivity areas. In each round, candidate points are required to satisfy two constraints: (1) the distance between any two candidates selected in the same round must not be smaller than the parameter inner_min_km to avoid excessive local clustering; and (2) the distance between each candidate and existing buoy sites must not be smaller than near_old_km to prevent spatial overlap. Through four independent rounds of screening, four subsets C70, C60, C50, C40 are obtained. Since the quantile rounds are executed independently and the threshold range is progressively relaxed, the high-value core areas and their adjacent regions are repeatedly examined across rounds, resulting in a richer and more diversified distribution of high-STCVI candidates and overcoming the limitation of a single threshold that selects only one representative point from each high-value cluster.
Finally, the four subsets are merged to form the comprehensive candidate set C = C70 ∪ C60 ∪ C50 ∪ C40, and coordinate-precision deduplication (tolerance ≤ $1 \times 10^{-4}$°) is performed to filter duplicate or overly close candidate sites. This process statistically achieves multi-round coverage of high-STCVI regions and spatially uniform distribution across the study area, providing high-quality input for subsequent logical reasoning and optimization.
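The multi-quantile screening can be sketched as follows. The names `inner_min_km` and `near_old_km` come from the text; everything else is an assumption for illustration: cells are `(lat, lon, stcvi)` tuples, distances use the haversine formula with a 6371 km Earth radius, the quantile is a simple nearest-rank estimate, and the 10⁻⁴° deduplication is approximated by rounding coordinates to four decimals.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def quantile(sorted_vals, q):
    """Nearest-rank quantile of an ascending list (illustrative estimator)."""
    return sorted_vals[min(len(sorted_vals) - 1, int(q * len(sorted_vals)))]


def screen_round(cells, threshold, existing, inner_min_km, near_old_km):
    """One screening round: scan cells in descending STCVI order."""
    picked = []
    for lat, lon, w in sorted(cells, key=lambda c: c[2], reverse=True):
        if w < threshold:
            break  # remaining cells lie outside this threshold region
        pt = (lat, lon)
        if any(haversine_km(pt, e) < near_old_km for e in existing):
            continue  # too close to an existing buoy
        if any(haversine_km(pt, p) < inner_min_km for p in picked):
            continue  # too close to a candidate picked in this round
        picked.append(pt)
    return picked


def candidate_set(cells, existing, inner_min_km, near_old_km,
                  quantiles=(0.7, 0.6, 0.5, 0.4)):
    """Union of the independent rounds, deduplicated at about 1e-4 degrees."""
    values = sorted(c[2] for c in cells)
    merged = {}
    for q in quantiles:
        for lat, lon in screen_round(cells, quantile(values, q), existing,
                                     inner_min_km, near_old_km):
            merged[(round(lat, 4), round(lon, 4))] = (lat, lon)
    return list(merged.values())
```

In this simplified sketch every round shares the same `inner_min_km` and scan order, so a later round re-selects the points of earlier rounds before adding new ones; the merge step removes these repeats.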
6. Coverage-First Greedy Algorithm for Site-Selection Computation
To treat coverage efficiency and spatial uniformity as the two core objectives of buoy station optimization, the CFGA formulates the buoy siting problem as a constrained maximum-coverage problem. The CFGA aims to maximize monitoring coverage while suppressing redundant deployments, thereby generating an optimized set of newly added buoys that satisfies the requirements of the monitoring tasks.
6.1. Coverage Function and Redundancy Penalty
(1) Coverage function
Let the candidate site set be $C = \{p_1, p_2, \dots, p_N\}$, representing all potential buoy locations generated by the preprocessing module in Section 5. Let $\mathcal{E}$ denote the set of existing buoy stations in the target sea area. Let $S \subseteq C$ with $|S| = K$ be the set of $K$ additional buoys chosen from $C$. Define a weighted grid set $G = \{g_1, g_2, \dots, g_M\}$, where each grid point $g \in G$ is assigned a monitoring weight $w_g$ equal to its STCVI. For any candidate $p \in C$ and grid point $g \in G$, denote their distance by $d(p, g)$; the distance between two candidate sites $p_i$ and $p_j$ is denoted as $d(p_i, p_j)$; and the distance between a candidate point $p$ and an existing buoy $e \in \mathcal{E}$ is denoted as $d(p, e)$. All distances are in kilometers.
Let the coverage contribution of buoy $p$ to grid point $g$ be denoted by $\phi(p, g)$, which characterizes the local coverage effect of $p$. It is defined as
$$\phi(p, g) = \max\left(0,\; 1 - \frac{d(p, g)}{R}\right)$$
where $R$ is the coverage radius. A buoy contributes more coverage to nearer grid points; the contribution decays linearly with distance and becomes zero once the distance exceeds $R$.
Accordingly, the coverage function $c_g(S)$ of grid point $g$ is defined to describe its final coverage level under the combined effect of the existing buoy set $\mathcal{E}$ and the newly added buoy set $S$. The definition is given as
$$c_g(S) = \max_{q \in \mathcal{E} \cup S} \phi(q, g)$$
This definition indicates that the coverage degree of each grid point depends on the selected buoy set and is dominated by the nearest buoy. In other words, each grid point only retains the maximum coverage value without accumulation, thereby avoiding the repeated calculation of monitoring benefits.
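The linear-decay contribution and the maximum-based coverage of a grid point can be sketched in a few lines. The sketch assumes that the contribution equals 1 at zero distance and falls linearly to 0 at the coverage radius, and that the distances from the grid point to all buoys (existing and new) are already available; function names are hypothetical.

```python
def coverage_contribution(dist_km, radius_km):
    """Linear decay: 1 at the buoy, 0 at or beyond the coverage radius."""
    return max(0.0, 1.0 - dist_km / radius_km)


def grid_coverage(dists_to_buoys_km, radius_km):
    """Coverage of one grid point: the best single contribution.

    Contributions are not summed, so overlapping buoys add no extra benefit.
    """
    return max(
        (coverage_contribution(d, radius_km) for d in dists_to_buoys_km),
        default=0.0,
    )
```

For example, with a 50 km radius a grid point 25 km and 40 km away from two buoys has coverage 0.5, determined only by the nearer buoy.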
(2) Redundancy penalty
To prevent excessive local clustering of buoys and improve the efficiency of monitoring resource utilization, a redundancy penalty term based on the nearest-neighbor distance is introduced. Let the minimum distance between buoy $p$ and its neighboring buoys in $S \setminus \{p\}$ be
$$d_{\min}(p) = \min_{q \in S \setminus \{p\}} d(p, q)$$
where $d(p, q)$ denotes the distance between two buoys. Accordingly, the redundancy penalty function of buoy $p$ is defined as
$$P(p) = \lambda \left[ \max\left(0,\; 1 - \frac{d_{\min}(p)}{D_0}\right) \right]^2$$
where $\lambda$ is the penalty coefficient and $D_0$ is the reference distance below which the penalty becomes active. This function constrains the buoy distribution through pairwise distance regulation. When the nearest-neighbor distance satisfies $d_{\min}(p) \ge D_0$, the penalty value is $P(p) = 0$, indicating a sufficiently sparse distribution with no constraint imposed. Conversely, when $d_{\min}(p) < D_0$, the penalty value increases quadratically as the distance decreases, and reaches the maximum $\lambda$ when two buoys coincide. Therefore, while maintaining overall coverage, this penalty effectively suppresses local redundancy and enhances the spatial uniformity of buoy distribution.
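The behavior described above (zero penalty for well-separated buoys, quadratic growth at short range, and a maximum equal to the penalty coefficient when two buoys coincide) is reproduced by the following sketch. The exact functional form, the parameter name `threshold_km` for the distance at which the penalty switches on, and the function name are assumptions made for illustration.

```python
def redundancy_penalty(nearest_km, threshold_km, lam):
    """Quadratic nearest-neighbour penalty.

    Returns 0 when nearest_km >= threshold_km and lam when nearest_km == 0.
    """
    shortfall = max(0.0, 1.0 - nearest_km / threshold_km)
    return lam * shortfall ** 2
```

With a 20 km threshold and a coefficient of 0.5, a neighbor at 10 km yields a penalty of 0.125, a quarter of the maximum.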
6.2. Objective Function and Constraints
Let the binary variable $x_i \in \{0, 1\}$ indicate whether the candidate site $p_i$ is selected. Then the selected set can be expressed as $S = \{p_i \in C \mid x_i = 1\}$. The objective function is defined as
$$\max_{S} \; F(S) = \sum_{g \in G} w_g \left[ c_g(S) - c_g(\varnothing) \right] - \sum_{p \in S} P(p) \qquad (15)$$
where the first term represents the weighted coverage gain contributed by the newly added buoy set, and the second term introduces a penalty function to suppress excessive local clustering. This formulation enhances overall coverage efficiency and improves spatial uniformity, achieving a balance between coverage performance and deployment evenness.
The optimization is subject to the following constraints:
$$\sum_{i=1}^{N} x_i = K \qquad (16)$$
$$d(p_i, p_j) \ge D_{\min}, \quad \forall\, p_i, p_j \in S,\; i \ne j \qquad (17)$$
$$d(p, e) \ge D_{\min}, \quad \forall\, p \in S,\; e \in \mathcal{E} \qquad (18)$$
Equation (16) ensures that the number of newly added buoys is fixed at $K$; Equation (17) guarantees that the minimum distance between any two new buoys is no less than $D_{\min}$; and Equation (18) ensures that the minimum distance between each new buoy and any existing buoy is also no less than $D_{\min}$.
When $\lambda = 0$, the objective function degenerates into a typical monotone submodular coverage function [40]. By adopting a stepwise greedy selection strategy, in which the candidate point with the largest marginal gain is iteratively added, a $(1 - 1/e)$-approximate solution can be obtained. After incorporating the penalty term $P(p)$, the function is no longer strictly submodular; however, the penalty effectively prevents excessive buoy clustering and improves the spatial uniformity of the overall buoy distribution.
6.3. Greedy Optimization Procedure
To approximate the solution of the objective function in Equation (15), a stepwise greedy heuristic algorithm is adopted. Each candidate buoy is evaluated by its marginal gain, defined as the weighted coverage increment of all grid points minus the redundancy penalty from spatial proximity. The coverage vector, representing the current coverage status of the sea grid, is iteratively updated after each buoy addition. The algorithm continues selecting the buoy with the highest marginal gain until the deployment scale or termination criterion is reached. The complete workflow of the algorithm is illustrated in
Figure 8.
At the initialization stage, the coverage radius R_KM is defined to characterize the spatial influence range of buoy monitoring capability, and the minimum distance D_MIN is set to prevent excessive proximity between new and existing buoys. The total number of new buoys K_NEW determines the final deployment scale, while the penalty coefficient $\lambda$ balances the trade-off between coverage efficiency and spatial uniformity. Subsequently, the initial coverage vector $c(\mathcal{E})$ is computed from the existing buoy set $\mathcal{E}$ and used as the starting point for iteration.
During the iterative process, a feasibility check is first conducted to remove candidate points that are closer than $D_{\min}$ to any existing or already selected buoy, thereby ensuring a reasonable spatial distribution. Each remaining feasible candidate is then evaluated for its marginal contribution to the overall objective function if added to the current buoy set. Denoting the current coverage vector as $c = (c_g)_{g \in G}$, the marginal gain of a candidate $p$ is defined as
$$\Delta(p) = \sum_{g \in G} w_g \left[ \max\left(c_g, \phi(p, g)\right) - c_g \right] - P(p) \qquad (20)$$
where $\phi(p, g)$ is the coverage contribution of $p$ to grid point $g$ and $P(p)$ is the redundancy penalty of $p$ (Section 6.1). The first term represents the weighted coverage increment contributed by adding buoy $p$, and the second term suppresses overly dense placement. Among all feasible candidates, the point with the maximum marginal gain, $p^*$, is selected and added to the new buoy set $S$. The coverage vector is then updated as
$$c_g \leftarrow \max\left(c_g, \phi(p^*, g)\right), \quad \forall\, g \in G$$
The updated coverage vector serves as the basis for the next iteration. The algorithm iterates through the cycle of screening → evaluation → selection → updating until the number of new buoys reaches K_NEW or no candidate point yields a positive marginal gain. The resulting set $S$ of newly added buoys represents the optimized site layout scheme. It is worth noting that this algorithm does not depend on the existence of prior buoy distributions; when historical deployments are unavailable, it can independently generate a complete layout configuration. Therefore, it is equally applicable to both appending deployments (building upon existing buoys) and independent network planning scenarios, ensuring consistent suitability for comparative spatial layout analyses.
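The screening → evaluation → selection → updating cycle can be sketched end to end in pure Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: coordinates are treated as planar kilometres with Euclidean distance, the coverage contribution decays linearly to zero at the radius, the penalty is quadratic in the nearest-neighbor shortfall, and `pen_threshold` (the distance at which the penalty switches on) is a hypothetical parameter introduced for the sketch.

```python
import math


def dist(a, b):
    """Euclidean distance; coordinates are assumed to be planar km."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def phi(p, g, radius):
    """Linear-decay coverage contribution of buoy p to grid point g."""
    return max(0.0, 1.0 - dist(p, g) / radius)


def penalty(p, selected, pen_threshold, lam):
    """Quadratic penalty on the distance to the nearest selected buoy."""
    if not selected:
        return 0.0
    nearest = min(dist(p, q) for q in selected)
    return lam * max(0.0, 1.0 - nearest / pen_threshold) ** 2


def cfga(candidates, existing, grid, weights, k_new,
         radius, d_min, pen_threshold, lam):
    """Greedy selection of up to k_new sites from candidates."""
    # Initial coverage vector from the existing buoys only.
    cover = [max((phi(e, g, radius) for e in existing), default=0.0)
             for g in grid]
    selected = []
    while len(selected) < k_new:
        best, best_gain = None, 0.0
        for p in candidates:
            if p in selected:
                continue
            # Screening: enforce the hard minimum-distance constraint.
            if any(dist(p, q) < d_min for q in existing + selected):
                continue
            # Evaluation: weighted coverage increment minus penalty.
            gain = sum(w * max(0.0, phi(p, g, radius) - c)
                       for g, w, c in zip(grid, weights, cover))
            gain -= penalty(p, selected, pen_threshold, lam)
            if gain > best_gain:
                best, best_gain = p, gain
        if best is None:
            break  # no feasible candidate with positive marginal gain
        # Selection and update of the coverage vector.
        selected.append(best)
        cover = [max(c, phi(best, g, radius)) for g, c in zip(grid, cover)]
    return selected
```

On a toy line of four grid points with one existing buoy at the origin, the sketch first picks the candidate next to the highest-weight grid point and then fills the remaining gap, stopping early when no feasible candidate is left.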
6.4. Generation of the Reasoning-Computation Integrated Layout Scheme
After the greedy optimization process determines the final set of newly added sites, each site is instantiated as a candidate buoy individual, and the reasoning rule system is sequentially activated to complete its semantic attributes. Specifically, each new buoy is instantiated within the RDF triple model, and its attribute information—corresponding to monitoring objectives and environmental conditions—is automatically generated through reasoning chains involving monitoring-element identification, sensor configuration, communication method selection, and buoy specification and cost inference.
Subsequently, all candidate buoy instances and their reasoning results are uniformly written into the semantic knowledge graph, forming a structured Optimal Layout Scheme. This scheme not only reflects the spatial distribution of optimized buoy locations but also integrates multi-dimensional attributes such as monitoring elements, sensor types, and communication methods inferred from reasoning, establishing semantic associations with the target sea-area individuals. Through this process, a deep coupling between optimization results and ontology-based knowledge representation is achieved.
Finally, the integrated results are deployed on the GraphDB platform for storage, visualization, and interactive querying, thereby establishing a closed-loop mechanism that links spatial layout, logical reasoning, and semantic representation. This integration provides a unified data-knowledge foundation for subsequent layout evaluation, decision support, and operational management.
7. Experiments and Analysis
7.1. Study Area and Data
The Beibu Gulf, located in the northern South China Sea (longitude range 108.1° E–110° E, latitude range 19.5° N–22.5° N), is selected as the target sea area for simulation-based experimental validation of the proposed method. Multi-source marine environmental datasets, comprising bathymetric data and marine environmental data, are used in the verification experiments. Bathymetric information is derived from the ETOPO1 Global Relief Model [
41], which offers integrated global topographic and ocean depth data.
Figure 9 shows the spatial distribution of water depth in the Beibu Gulf. This dataset serves as a key input for feasibility assessment and the buoy diameter recommendation rule, supporting both site-selection evaluation and buoy specification inference. Marine environmental variables are obtained from the Hybrid Coordinate Ocean Model (HYCOM) [
42], which provides global numerical simulation data at a spatial resolution of 1/12°. Based on these datasets, the temporal variation rate, spatial variation rate, and the final STCVI are calculated by the procedures described in
Section 5. Since this study focuses on marine hydrological observations, temperature, salinity, and current velocity are selected as the STCVI variables because they are the core elements governing ocean water-mass structure, density stratification, and dynamic transport. These three parameters represent the primary monitoring targets of most hydrological buoy systems and therefore provide a task-driven basis for STCVI construction. The spatial distribution of the resulting STCVI (for the period 11 August–31 December 2024) is shown in
Figure 10. The results indicate high sensitivity along the Guangxi coast and the western Guangdong sector, while the inner gulf and the waters approaching northern Hainan show substantially weaker variation.
7.2. Experiments
Two experimental setups, the appending layout and the independent layout, are designed to validate the effectiveness of the proposed method under different initial conditions. All experiments are conducted within a unified algorithmic and environmental framework. To obtain reasonable and robust hyperparameter settings, we performed multiple rounds of preliminary tests on key parameters, including the coverage radius, minimum separation threshold, redundancy penalty coefficient, and STCVI weights. Within predefined search ranges, these parameters were systematically evaluated to assess their influence on layout performance, and repeated runs were averaged to reduce randomness. The final hyperparameter values represent a balanced choice between performance and stability. As summarized in
Table 3, all hyperparameters are kept identical across experiments to ensure comparability.
- (1)
Appending Layout Experiment
In this experiment, a set of 10 buoy locations from the current (simulated) buoy configuration in the Beibu Gulf is taken as the initial station set ε. The appending layout mode is typically adopted when extending an existing observation network. Starting from the initial set ε, the CFGA is executed, adding one new buoy per iteration. The objective of the CFGA is to maximize the overall coverage improvement and enhance spatial uniformity while maintaining the continuity of historical observations.
For the appending deployment optimization of buoy sites, the number of new buoys,
K_NEW, needs to be determined in addition to the parameters listed in
Table 3.
K_NEW is calculated using the procedure described below. According to the marginal gain function defined in Equation (20), the contribution of the k-th candidate point p_k can be evaluated. The set of hotspot grid cells, H, is defined based on the STCVI field, from which the top 10% of high-value grid cells are extracted. Let S_k denote the station set formed after adding the k-th newly selected buoy. To quantitatively characterize the overall coverage performance, the cumulative coverage ratio of S_k is calculated at each iteration.
During the iteration process, both the marginal gain and the cumulative coverage ratio are recorded and shown in Figure 11 and Figure 12. When the marginal gain of a newly added buoy drops below 10% of that in the first iteration and the cumulative coverage ratio exceeds 95%, the layout is considered to have reached a saturation stage, and the iteration is terminated. Experimental results indicate that both criteria are satisfied at k = 4; thus, the final number of new buoys is determined as K_NEW = 4.
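The saturation rule above can be illustrated with a minimal Python sketch. The function name `choose_k_new`, its parameter names, and the sample gain and coverage values are hypothetical and chosen only for illustration; they are not the values plotted in Figure 11 and Figure 12.

```python
def choose_k_new(marginal_gains, coverage_ratios,
                 gain_frac=0.10, coverage_target=0.95):
    """Return the first iteration k (1-based) at which both saturation
    criteria hold, or None if they are never met together.

    marginal_gains[k-1]  : marginal gain of the k-th added buoy
    coverage_ratios[k-1] : cumulative coverage ratio after adding k buoys
    """
    if not marginal_gains or len(marginal_gains) != len(coverage_ratios):
        raise ValueError("inputs must be non-empty and of equal length")
    first_gain = marginal_gains[0]
    for k, (gain, ratio) in enumerate(zip(marginal_gains, coverage_ratios), start=1):
        # Criterion 1: gain below 10% of the first-iteration gain.
        # Criterion 2: cumulative coverage ratio above 95%.
        if gain < gain_frac * first_gain and ratio > coverage_target:
            return k
    return None


# Illustrative values only (assumed, not taken from the experiments).
gains = [10.0, 6.0, 2.5, 0.8, 0.3]
ratios = [0.55, 0.80, 0.93, 0.97, 0.99]
k_new = choose_k_new(gains, ratios)
print(k_new)
```

With these assumed inputs, the fourth iteration is the first one that satisfies both thresholds at once, which mirrors how the two criteria jointly fix K_NEW.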
The final layout computed by the CFGA consists of 10 existing buoys and 4 additional buoys, forming an observation network with 14 stations in total, as shown in
Figure 13. It can be observed from
Figure 13 that the newly added buoys are mainly distributed in the previously uncovered regions and areas with high STCVI values, which significantly enhances the coverage density of hotspot regions and improves the overall spatial uniformity of the network.
- (2)
Independent Layout Experiment
To validate the method’s performance in spatial planning scenarios devoid of prior deployments or constraints, an independent layout experiment is carried out. This configuration commences with an empty set as the initial condition, presuming no existing buoys in the target sea area. Under the same environmental data and parameter settings, an optimal buoy layout is independently generated. The algorithmic process is identical to that of the appending mode, adopting the CFGA, whereas the initial state and the determination of K_NEW differ.
The independent layout, as shown in
Figure 14, is generated by executing the CFGA with no pre-existing buoys, producing 10 optimized buoy sites (
K_NEW = 10) to ensure comparability with the existing network. In the independent layout, the buoy stations are regularly distributed, forming a balanced observation network along the northern and northeastern coastal zones and extending toward the central basin of the Beibu Gulf.
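The two experimental modes differ only in the initial station set, which a small greedy sketch can make concrete. The code below is an illustrative approximation rather than the CFGA itself: the marginal gain used here (STCVI of newly covered cells minus a penalty on already covered cells) is an assumed stand-in for Equation (20), coordinates are treated as planar kilometres, and all names and values are hypothetical.

```python
import math


def greedy_layout(cells, candidates, existing, k_new, radius, d_min, penalty):
    """Greedily add up to k_new stations.

    cells      : list of (x_km, y_km, stcvi) grid cells
    candidates : list of (x_km, y_km) candidate sites
    existing   : list of (x_km, y_km) stations already deployed
                 (empty list = independent layout)
    """
    stations = list(existing)
    covered = {i for i, (x, y, _) in enumerate(cells)
               if any(math.dist((x, y), s) <= radius for s in stations)}
    added = []
    for _ in range(k_new):
        best, best_gain, best_cells = None, -math.inf, set()
        for cand in candidates:
            # Skip sites already used or violating the minimum separation.
            if cand in stations or any(math.dist(cand, s) < d_min for s in stations):
                continue
            in_range = {i for i, (x, y, _) in enumerate(cells)
                        if math.dist((x, y), cand) <= radius}
            new_cells = in_range - covered
            overlap = in_range & covered
            # Assumed gain: reward new coverage, penalise redundant coverage.
            gain = (sum(cells[i][2] for i in new_cells)
                    - penalty * sum(cells[i][2] for i in overlap))
            if gain > best_gain:
                best, best_gain, best_cells = cand, gain, in_range
        if best is None:  # no feasible candidate left
            break
        stations.append(best)
        added.append(best)
        covered |= best_cells
    return added


# Toy one-dimensional sea area (assumed values).
cells = [(0, 0, 1.0), (10, 0, 5.0), (20, 0, 1.0), (30, 0, 4.0), (40, 0, 1.0)]
candidates = [(10, 0), (20, 0), (30, 0)]

independent = greedy_layout(cells, candidates, [], k_new=2,
                            radius=5, d_min=15, penalty=0.5)
appending = greedy_layout(cells, candidates, [(10, 0)], k_new=1,
                          radius=5, d_min=15, penalty=0.5)
print(independent, appending)
```

Passing an empty `existing` list corresponds to the independent layout, while a non-empty list corresponds to the appending layout; the selection loop is otherwise unchanged.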
7.3. Optimization Performance Assessment
To systematically evaluate the performance of the proposed optimization algorithm under different initial conditions, this section quantitatively compares the appending and independent layout schemes with the original existing layout. The evaluation examines spatial uniformity, station spacing, and monitoring effectiveness based on four metrics: Voronoi polygon area standard deviation, nearest-neighbor distance, hotspot coverage ratio, and cumulative monitoring value. The Voronoi polygon area standard deviation (Voronoi STD) measures spatial uniformity, with a smaller value indicating more consistent subdivision of monitoring units. The nearest-neighbor distance (NND) of a station is the distance to its closest neighboring station; its minimum, mean, and maximum over all stations are reported. NND reflects station independence and sparsity: a larger minimum suggests lower redundancy risk, a larger mean implies more uniform spacing, and a larger maximum reflects broader spatial coverage. The hotspot coverage ratio (HCR) evaluates monitoring sufficiency in environmentally sensitive zones by calculating the proportion of the top 10% STCVI grids covered by at least one buoy. The cumulative monitoring value (CMV) is the sum of STCVI values over all buoy locations and their coverage areas, and serves as an integrated measure of overall observational benefit. The specific evaluation results are as follows.
- (1)
Voronoi STD
As shown in Figure 15, the layout of buoy stations is partitioned by Voronoi polygons. The existing layout (Figure 15a) exhibits a coexistence of buoy clusters and uncovered areas, especially near the margins of the gulf, leading to substantial differences in Voronoi polygon areas (STD = 2415.31 km²). After optimization, spatial uniformity improves: the appending layout reduces the STD to 1620.23 km² by filling previously uncovered zones, while the independent layout also alleviates extreme disparities, with an STD of 2252.65 km², although its reduction is smaller than that of the appending layout. These results confirm that, compared with the original existing layout, both optimized schemes mitigate the problem of uneven buoy density.
- (2)
Nearest-neighbor Distance
As illustrated in Figure 16, in the existing layout, the minimum NND is 12.51 km, the mean is 30.03 km, and the maximum reaches 64.84 km, reflecting the coexistence of locally clustered stations and sparsely covered regions. In the appending layout, although the minimum NND remains unchanged, the mean and maximum distances increase to 36.20 km and 75.98 km, respectively. This indicates that the newly added buoys extend the network toward outer areas while alleviating local congestion, thereby improving spatial extensibility. By contrast, the independent layout yields a more uniformly spaced configuration, with NND values of 35.74 km (minimum), 48.68 km (mean), and 58.87 km (maximum), suggesting a more balanced station arrangement and a more reasonable overall spacing pattern.
- (3)
Hotspot Coverage Ratio
As shown in
Figure 17, the existing layout covers only 21.07% of the hotspot regions, indicating insufficient monitoring capability in high-sensitivity areas. The appending layout substantially improves the coverage ratio to 63.24% through the addition of new buoys, while the independent layout achieves the highest coverage of 94.22%, nearly encompassing all hotspot regions. When compared with the existing layout, both optimized schemes markedly enhance the observation coverage in environmentally sensitive zones.
- (4)
Cumulative Monitoring Value
As presented in Figure 18, the CMV of the existing layout is only 47.32, while that of the appending layout increases to 99.95, more than doubling the observation effectiveness. The independent layout achieves the highest value of 113.21, indicating the strongest overall monitoring benefit.
In summary, the results of the four evaluation metrics demonstrate that the proposed semantic reasoning–numerical computation integration mechanism can effectively optimize buoy site selection under different initial conditions. The appending layout strengthens the observation effectiveness while maintaining network continuity, whereas the independent layout achieves the highest hotspot coverage and cumulative monitoring value under unconstrained conditions.
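Three of the four metrics can be computed with a few lines of code, as sketched below on toy data. This is a minimal illustration under stated assumptions: coordinates are planar kilometres, coverage is a fixed-radius disc, and CMV is taken as the sum of STCVI over cells covered by at least one buoy, which is one reading of the definition above. All function names and values are hypothetical, and the Voronoi STD is omitted because it requires a computational-geometry library.

```python
import math
from statistics import mean


def nnd_stats(stations):
    """Minimum, mean, and maximum nearest-neighbor distance over all stations."""
    nnd = [min(math.dist(s, t) for j, t in enumerate(stations) if j != i)
           for i, s in enumerate(stations)]
    return min(nnd), mean(nnd), max(nnd)


def covered_cells(cells, stations, radius):
    """Cells (x, y, stcvi) lying within `radius` of at least one station."""
    return [c for c in cells
            if any(math.dist(c[:2], s) <= radius for s in stations)]


def hotspot_coverage_ratio(cells, stations, radius, top_frac=0.10):
    """Share of the top `top_frac` STCVI cells covered by at least one buoy."""
    n_hot = max(1, round(top_frac * len(cells)))
    hot = sorted(cells, key=lambda c: c[2], reverse=True)[:n_hot]
    return len(covered_cells(hot, stations, radius)) / n_hot


def cumulative_monitoring_value(cells, stations, radius):
    """Sum of STCVI over all covered cells (assumed CMV definition)."""
    return sum(c[2] for c in covered_cells(cells, stations, radius))


# Toy data (assumed): ten cells along a line, three stations.
values = [0.9, 0.1, 0.2, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.8]
cells = [(5.0 * i, 0.0, v) for i, v in enumerate(values)]
stations = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]

nnd_min, nnd_mean, nnd_max = nnd_stats(stations)
# top_frac=0.2 is used here only because the toy grid has ten cells.
hcr = hotspot_coverage_ratio(cells, stations, radius=6.0, top_frac=0.2)
cmv = cumulative_monitoring_value(cells, stations, radius=6.0)
print(nnd_min, nnd_mean, nnd_max, hcr, cmv)
```

In this toy case only one of the two hotspot cells lies within range of a station, so the layout scores well on CMV near the stations but leaves half of the hotspot set unobserved, the same pattern the existing layout shows at a larger scale.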
7.4. Semantic Output and Visualization
Considering the economic and operational constraints in practical marine observation, this section focuses on the appending layout scheme and presents the semantic reasoning and visualization results for its four newly added buoys. For each buoy, semantic attributes are automatically completed according to its geographic position and environmental characteristics by invoking the five core reasoning rules (as described in Section 4). Through the reasoning process, each buoy evolves from a spatial coordinate into a semantically enriched observation unit whose attributes include equipment configuration, communication mode, buoy specification, and construction cost. All outputs are written back to the knowledge graph database in RDF format and linked to an Optimal Layout Scheme individual through object properties, forming a structured deployment scheme.
Table 4 gives the newly added buoy stations associated with semantic information. The Optimal Layout Scheme individual is stored in the GraphDB platform and users can retrieve detailed attributes of each buoy station—such as station ID, geographic coordinates, monitoring elements, and configuration schemes—through SPARQL queries. The SPARQL query is shown in
Figure 19. The query results can be visualized in a graph-based view, as shown in
Figure 20. In
Figure 20, nodes represent entities such as buoys, sensors, and communication modes, and edges denote their semantic relationships. The visualization process provides a unified mapping from numerical optimization results to structured semantic knowledge.
The semantic reasoning and visualization of the appending layout further confirm the consistency between numerical optimization and semantic knowledge representation. By establishing a unified mapping from quantitative results to ontological entities, they indicate that the proposed framework supports interoperability between data computation and semantic inference.
7.5. Sensitivity Analysis and Comparative Experiments
7.5.1. Sensitivity Analysis
During the CFGA optimization process, the weights assigned to the STCVI components (temperature, salinity, and current velocity) and the minimum deployment distance Dmin serve as adjustable hyperparameters. Their values may influence hotspot capture performance, spatial uniformity of station distribution, and monitoring-value gain. To evaluate the influence of these parameters within reasonable ranges and to examine the stability of the optimization method, this section conducts two sensitivity experiments—STCVI weight perturbation and variation in the minimum separation threshold—under the appending mode, with all other settings kept constant.
- (1)
STCVI weight perturbation
Table 5 summarizes the evaluation results under three sets of STCVI weights. For the configurations (0.25, 0.25, 0.50), (0.30, 0.30, 0.40), and (0.35, 0.35, 0.30) assigned to temperature, salinity, and current velocity, respectively, both hotspot coverage and cumulative monitoring value exhibit moderate variations, ranging from 61.50% to 72.10% and from 95.11 to 107.09, respectively. In contrast, the Voronoi area standard deviation and the mean nearest-neighbor distance remain stable, consistently falling within 1559–1620 km² and 33–36 km, respectively, without noticeable fluctuations. These results indicate that within the weight perturbation range considered, the STCVI weights influence coverage-related metrics to some extent, whereas their impact on the overall spatial configuration of buoy stations is limited. The optimized layouts maintain similar geometric patterns across different weight settings.
- (2)
Variation in the Minimum Distance Threshold
Figure 21 illustrates the sensitivity of layout performance to the minimum separation threshold Dmin over the range of 20–50 km. The results show that when Dmin ≤ 45 km, all performance metrics remain virtually unchanged: the mean NND stays at approximately 36.2 km, the Voronoi area STD remains stable at about 1865 km², the hotspot coverage remains at 92.1%, and the cumulative monitoring value stabilizes around 99.95. This pronounced “plateau region” indicates that the minimum-distance constraint is not activated within this range; instead, the optimization is primarily driven by coverage gain and redundancy penalty, while the inherent spatial distribution of candidate sites already ensures sufficient dispersion. Thus, the spacing constraint is not redundant but is implicitly satisfied through the joint effects of candidate-point geometry and the optimization objective, demonstrating strong robustness of the algorithm within reasonable parameter settings. When Dmin increases to 50 km, the constraint begins to exert a noticeable influence: the mean NND rises to 36.85 km and the Voronoi STD decreases to 1750 km², indicating forced spatial spreading of buoy locations, while the CMV decreases slightly to 98.8, suggesting a loss of monitoring benefit under stricter spacing requirements. Overall, CFGA exhibits very weak sensitivity to Dmin in the range of 20–45 km. The impact of this parameter is expected to become more pronounced in denser candidate sets or smaller spatial domains.
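The plateau behaviour can be checked directly: if every station chosen without the spacing constraint is already farther than Dmin from all other stations, a greedy search with the constraint selects the same sites. The sketch below illustrates this test under the assumption of planar kilometre coordinates; the function names and the station coordinates are hypothetical and are not the experimental layout.

```python
import math


def smallest_new_site_spacing(existing_sites, new_sites):
    """Smallest distance (km) from any new site to any other station."""
    best = math.inf
    for i, n in enumerate(new_sites):
        for s in existing_sites:
            best = min(best, math.dist(n, s))
        for j, m in enumerate(new_sites):
            if j != i:
                best = min(best, math.dist(n, m))
    return best


def constraint_is_active(existing_sites, new_sites, d_min):
    """True if d_min would exclude at least one of the chosen new sites."""
    return d_min > smallest_new_site_spacing(existing_sites, new_sites)


# Assumed toy layout: two existing stations, two unconstrained picks.
existing = [(0.0, 0.0), (12.0, 0.0)]
new = [(60.0, 0.0), (0.0, 47.0)]

spacing = smallest_new_site_spacing(existing, new)
status = {d: constraint_is_active(existing, new, d) for d in (20, 30, 45, 50)}
print(spacing, status)
```

In this toy layout the closest new site sits 47 km from its nearest station, so thresholds up to 45 km leave the result untouched and only the 50 km setting forces a different selection.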
7.5.2. Performance Comparison Between CFGA and Traditional Algorithms
To comprehensively assess the performance of the proposed CFGA, this section compares it against three representative baseline methods: Random [
43], KMeans [
44], and GA [
15], under the appending layout scenario. The Random strategy provides an unstructured sampling baseline and serves as a lower performance bound under the minimum-distance constraint. KMeans is a classical spatial-clustering approach widely used in monitoring network deployment to achieve spatial balance. GA represents a conventional global heuristic search framework commonly adopted in observation-network optimization. To ensure fairness, all four methods operate on the same candidate set
C and share the parameter settings specified in
Table 3. The optimization objective and constraints are also kept consistent across methods; only the search strategy differs.
Figure 22 presents the results of the four methods across nine performance metrics. In terms of coverage quality, CFGA and GA achieve significantly higher CMV and HCR than Random and KMeans, indicating a stronger ability to capture high-STCVI regions. Both CFGA and GA reach a CMV of 99.95, whereas KMeans and Random yield only 65.30 and 78.69, respectively, reflecting their clear limitations in identifying high-variability areas. Regarding spatial uniformity, the minimum nearest-neighbor distance (NND_min) is identical across all methods. Because the experiment preserves the existing buoy layout, NND_min is fixed by the closest pair in the original network, which is already below Dmin; this inherited spacing dominates the metric regardless of the search strategy. However, CFGA and GA exhibit notably larger NND_mean and NND_max than Random and KMeans, demonstrating their ability to expand coverage while maintaining adequate inter-point spacing. It is worth noting that the number of stations added by KMeans is constrained by the spatial separability of the candidate sites; in this experiment, it converged to only two cluster centers, failing to reach the preset four additional stations. This reflects its dependence on spatial distribution patterns: unlike CFGA, KMeans cannot guarantee a fixed number of new placements under complex spatial configurations.
The computing-time complexity of the four methods is analyzed and compared as follows. Let |C| denote the number of candidate sites, K_NEW the number of newly added stations, t the number of KMeans iterations, and Pop and Gen the population size and evolutionary generations of GA, respectively. In this experiment, t is kept as a small constant, while Pop and Gen are set considerably larger than K_NEW to ensure sufficient diversity and convergence of the genetic algorithm. Random requires only a single sampling over the candidate set, with a complexity of O(|C|). The computing cost of KMeans is dominated by clustering and iterative refinement; each iteration assigns all candidate sites to K_NEW clusters and updates the cluster centers, resulting in a complexity of approximately O(|C|·K_NEW·t). Because t remains a small constant, its overall computing time increases mainly linearly with K_NEW. GA evaluates the fitness of Pop individuals in each generation, and each individual encodes K_NEW newly added stations; thus, each generation incurs a complexity of O(|C|·Pop), and the total time complexity over Gen generations is O(|C|·Pop·Gen). Under the parameter settings of this experiment, Pop and Gen are much larger than K_NEW and t, leading to substantially higher overall computing cost for GA. In contrast, CFGA computes the marginal coverage gain of all candidate sites in each selection round and performs K_NEW deterministic rounds, yielding a complexity of O(|C|·K_NEW). Because O(|C|) < O(|C|·K_NEW) < O(|C|·K_NEW·t) < O(|C|·Pop·Gen), Random yields the lowest computing-time complexity, GA exhibits the highest, and KMeans is more computationally expensive than CFGA. As a deterministic search procedure, CFGA produces stable outputs without dependence on random initialization, thereby avoiding the solution variability commonly observed in GA.
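As a worked illustration of the ordering above, the dominant terms can be evaluated for one set of parameter values. The numbers below are hypothetical and chosen only to respect the stated relations (t a small constant, Pop and Gen much larger than K_NEW); they are not the settings used in the experiments, and the counts ignore constant factors.

```python
def operation_counts(n_candidates, k_new, t, pop, gen):
    """Dominant operation counts implied by the complexity expressions."""
    return {
        "Random": n_candidates,              # O(|C|)
        "CFGA": n_candidates * k_new,        # O(|C| * K_NEW)
        "KMeans": n_candidates * k_new * t,  # O(|C| * K_NEW * t)
        "GA": n_candidates * pop * gen,      # O(|C| * Pop * Gen)
    }


# Assumed values for illustration only.
counts = operation_counts(n_candidates=500, k_new=4, t=10, pop=50, gen=100)
ranking = sorted(counts, key=counts.get)
print(counts)
print(ranking)
```

Under these assumed values, CFGA needs 2000 gain evaluations against 2,500,000 fitness-related operations for GA, which makes the gap between the two search strategies concrete.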
Overall, CFGA achieves a favorable balance between computational efficiency and optimization capability, demonstrates superior performance over Random and KMeans, and performs comparably to GA in terms of coverage capability and spatial uniformity, while maintaining substantially lower computational complexity. Without requiring complex hyperparameters or large-scale iterations, CFGA achieves stable and efficient station-deployment optimization, making it a more suitable core solution strategy for practical ocean-monitoring applications.
7.6. Limitations and Practical Considerations
Although the above experiments have, to some extent, verified the significant advantages of the proposed moored buoy site optimization method in terms of marine environmental adaptability, spatial coverage, and distribution uniformity, certain limitations still remain in its practical engineering applications for ocean buoys. First, the STCVI is derived from HYCOM reanalysis data at a 1/12° resolution, which may underrepresent small-scale coastal dynamics and seasonal or meteorological fluctuations, introducing uncertainty into the variation estimates. Second, the ontology reasoning module depends on a predefined rule set; when applied to other regions, adaptations may be required to accommodate different legal, environmental, and engineering constraints. Third, the current optimization primarily targets hotspot coverage and spatial uniformity, while practical factors such as construction cost, navigational safety, and deployment logistics are not yet fully incorporated. Finally, due to the lack of comparable long-term buoy deployment records in the study area, empirical validation of the optimized layouts remains limited. In this study, the existing buoy network in the Beibu Gulf serves as a practical baseline for evaluating the rationality of the optimized layouts. Future work will extend the comparison to real deployment practices in similar sea areas and incorporate operational feedback to further validate and refine the framework.
8. Conclusions and Future Work
This study developed an intelligent buoy site selection approach that integrates semantic reasoning with numerical computation, validated through simulation experiments of moored-buoy deployment in the Beibu Gulf, northern South China Sea. The results demonstrate the method’s practical utility and scalability in generating candidate stations, configuring buoys, and optimizing layouts. The ontology-based reasoning mechanism automates the adaptation of buoy configurations (including communication, sensors, and structure), thereby providing methodological support for the intelligent planning of marine buoy monitoring systems. Concurrently, the coverage-first greedy algorithm, driven by numerical computation of marine environmental variations, solves the buoy layout optimization problem. This integrated semantic–numerical framework for optimizing ocean buoy siting significantly enhances monitoring coverage and spatial uniformity of the ocean buoy observation network, which directly improves the representativeness and quality of observational data for more accurate marine condition assessments and forecasts. Furthermore, this systematic approach mitigates the risk of redundant deployments, potentially reducing the long-term construction and maintenance costs of ocean buoy observation networks. It also lays a technical foundation for developing a new generation of intelligent, adaptive observation networks capable of dynamically aligning with specific monitoring tasks and changing marine conditions. However, the current reasoning rules still depend on manually defined configurations, which limits their flexibility and adaptability. Additionally, several numerical parameters in the current framework (such as the STCVI component weights and the minimum separation distance) are empirically specified and may not fully capture variations across different monitoring objectives.
To mitigate these limitations, future extensions will explore two complementary directions. On the one hand, machine-learning-based mechanisms will be investigated to automatically discover or refine semantic rules from historical buoy deployments and environmental observations, thereby reducing reliance on fixed rule engineering and improving the adaptability of the rule base to varying monitoring tasks. On the other hand, statistical analysis and optimization-based strategies (such as sensitivity-informed calibration, multi-objective optimization, or Bayesian estimation) will be implemented to automatically determine key hyperparameters, including the STCVI weights and the minimum separation threshold. Together, these efforts are expected to enhance the framework’s transferability and robustness across diverse marine regions and monitoring objectives and to advance the practical, intelligent evolution of ocean buoy network planning methodologies.