1. Introduction
Vector spatial data constitute a fundamental component of geographic information systems (GISs), typically represented by point, line, and polygon features. From a geometric perspective, both line and polygon features can be expressed as ordered coordinate sequences, with polygons essentially forming closed curves. Together, line and polygon data constitute a substantial proportion of spatial datasets and are often characterized by high vertex density and complex geometric structures. With the continuous growth of large-scale geographic databases, the storage, management, and transmission of vector line data have become increasingly challenging under constraints of limited storage capacity and network bandwidth. Therefore, efficient compression of vector curve data, while preserving essential geometric characteristics, has become a critical issue in geographic information science. Traditionally, the reduction of vector data volume has been closely associated with multi-scale cartographic representation. In map production, the same geographic feature must be represented at varying levels of detail depending on map scale, and geometric simplification is commonly applied when transitioning from large-scale to small-scale representations. The Douglas–Peucker (DP) algorithm is a classical approach that iteratively removes vertices whose perpendicular distance from the baseline connecting the start and end vertices falls below a predefined threshold, and it has been widely used due to its simplicity and effectiveness [
1]. Numerous studies have further investigated and extended the DP algorithm. Yang et al. proposed a recursive implementation and introduced radial distance as a supplementary constraint to control area error while preserving curve smoothness [
2]. They also pointed out that the lack of topological constraints in the classical DP algorithm may lead to self-intersections and that uncertainty in threshold selection can significantly affect simplification results. Subsequently, Li et al. refined the threshold search range and improved simplification performance by preserving key bends in polyline geometry [
3]. Yu et al. enhanced DP-based simplification by selecting convex vertices of natural shorelines as candidate vertices and identifying segmentation vertices using angular and distance criteria, thereby improving both compression ratio and curve accuracy [
4]. Beyond DP-based methods, other strategies have been proposed to achieve data reduction. Hybrid approaches combining genetic algorithms with local search strategies have been developed to balance shape preservation and compression efficiency [
5]. However, the use of fixed coding lengths in such methods may result in increased storage and computational costs when applied to dense curves. Resampling-based techniques represent another class of methods, where curves are compressed by introducing new vertices located on or near the original geometry. The Li–Openshaw algorithm, for example, simulates human visual perception by retaining fine details at close observation distances while preserving only macroscopic shapes at smaller scales [
6]. Function-fitting approaches provide a different perspective by representing line features with compact parametric forms. Fourier series-based methods [7, 8, 9, 10] and wavelet-based fitting techniques [11, 12] decompose curve geometry across multiple scales, preserving primary structural information while discarding secondary details.
Although these traditional methods are effective for scale-dependent representation, they are primarily designed for cartographic generalization, where data reduction serves visualization and map readability purposes. In such methods, compression ratios are usually determined indirectly by scale-related thresholds or perceptual criteria, rather than being explicitly optimized for storage efficiency. Moreover, most of these approaches are inherently lossy and irreversible, making it difficult to accurately reconstruct original geometric features from compressed representations. These limitations restrict their applicability in storage-oriented compression scenarios, where compact encoding and faithful geometric restoration are equally required for data archiving, transmission, and reuse in GIS databases. With the rapid development of deep learning, data-driven approaches have attracted increasing attention in spatial data processing and offer new possibilities for learning geometric representations directly from data. Neural network-based methods have been applied to polyline simplification and geometric feature extraction, demonstrating the feasibility of learning-based strategies for curve processing [
13]. Convolutional neural networks have been employed to identify curve features after rasterization, generating candidate regions and removing redundant segments [
14]. Other studies have explored supervised and unsupervised learning frameworks for vector data compression and reconstruction. Encoder–decoder architectures have been used for rasterized road integration [
15], while generative adversarial networks have been applied to reconstruct simplified vector roads from raster images [
16]. Autoencoder-based models have further been investigated for multi-level curve simplification and vertex pooling, enabling effective simplification of polylines and building outlines [
17,
18]. Fully connected autoencoders have been applied to curve data compression and reconstruction, confirming the capability of neural networks to capture morphological features for efficient data reduction [
19].
While early deep learning methods have laid a solid foundation for spatial data processing, the past few years have witnessed a significant paradigm shift toward more complex architectures, particularly Graph Neural Networks (GNNs) and Transformers [
20]. Transformers, driven by their robust self-attention mechanisms, have been increasingly adapted to extract global geometric features directly from vector lines and polygons [
21]. Furthermore, hybrid architectures, such as Conv-Transformers, have been proposed to successfully combine local convolutional operations with global context modeling for vector polygon analysis and change detection [
22]. Concurrently, GNNs have demonstrated exceptional capability in handling non-Euclidean spatial data, explicitly capturing complex topological relationships and matching patterns across multi-scale vector spatial lines [
23].
However, despite the powerful global semantic extraction capabilities of GNNs and Transformers, these architectures often introduce substantial computational overhead and structural complexity. For the specific task of compressing island boundary vector data—which inherently consists of continuous, sequentially ordered coordinate points—the primary objectives are to minimize direct positional deviation (PD) and maintain high-fidelity macroscopic geometric shapes. In this context, a targeted convolutional autoencoder (CAE) remains highly advantageous. By leveraging one-dimensional convolutions, the proposed CAE efficiently exploits the inherent linear sequence and local spatial correlations of the boundary curves. This approach achieves precise sequence-to-sequence reconstruction and optimal compression fidelity without the redundant complexity associated with complete graph construction or full-attention modeling.
This study proposes a 1D CNN-based lossy compression framework specifically tailored to vector curve data. A key contribution of this work is the explicit demonstration that 1D convolutional architectures capture the local geometric structures essential for compression more effectively than fully connected autoencoders (FCAs) and Fourier-based approaches. Whereas Fourier methods prioritize global outlines and FCAs lack spatial locality mechanisms, our architecture leverages 1D convolutional layers to learn spatially localized geometric features, leading to more compact latent representations and higher reconstruction accuracy. The framework is validated using island boundary data as a representative case of polylines, but it is inherently applicable to other 2D vector polylines, such as contours and hydrological networks, which share similar sequential characteristics. Experimental results, evaluated using geometric metrics such as positional deviation (PD), relative area error (RAE), and relative perimeter error (RPE), confirm that the proposed method achieves superior compression performance, providing a robust and efficient solution for storage-oriented vector data management in GISs.
2. Model and Methods
To address storage-oriented compression of vector curve data with controlled geometric distortion, this study reformulates vector curve compression as a representation learning problem under explicit capacity constraints, rather than a scale-driven geometric simplification task. From a theoretical perspective, the core objective of vector curve compression is to construct a compact parametric representation that preserves essential geometric information while minimizing storage cost, subject to a bounded reconstruction error. In this sense, compression can be interpreted as an optimization problem that balances representational compactness and geometric fidelity.
Vector curves are inherently ordered one-dimensional geometric signals, whose information content is characterized by both local geometric variations and global structural properties. Traditional cartographic generalization and rule-based simplification methods reduce data size by applying explicit geometric heuristics or scale-dependent thresholds, which implicitly determine compression outcomes and offer limited control over the trade-off between storage efficiency and reconstruction accuracy. In contrast, a representation learning perspective enables this trade-off to be explicitly optimized by constraining the capacity of a learned latent space and minimizing reconstruction distortion in a data-driven manner. Within this formulation, vector curve compression is naturally cast into an encoding–decoding paradigm, where the encoder learns a compact latent representation that captures the intrinsic geometric structure of the curve, and the decoder reconstructs the curve from this representation. The dimensionality and information capacity of the latent space directly correspond to the achievable compression ratio, thereby providing an explicit and interpretable mechanism for storage-oriented compression. Based on the geometric characteristics of vector curves, a hierarchical representation strategy is adopted. Local geometric continuity and fine-scale variations along the curve are modeled through convolutional operations applied to the unified curve sequences, while global shape information and long-range dependencies are captured through fully connected layers that perform holistic aggregation and compact encoding. By operating directly on vector representations without rasterization, the proposed framework preserves geometric precision and is particularly suited for high-fidelity compression of complex geographic curves.
2.1. Problem Formulation and Curve Representation
Vector curve data in GISs are typically represented as ordered sequences of two-dimensional coordinates that describe the geometric shape of linear or boundary features. Given a vector curve $C = \{(x_i, y_i)\}_{i=1}^{m}$, where the vertices are ordered along the curve, the objective of vector curve compression is to reduce the amount of data required to store and transmit $C$ while preserving its essential geometric characteristics. In storage-oriented compression scenarios, the goal differs fundamentally from scale-driven cartographic generalization. Rather than selectively removing vertices based on perceptual or scale-dependent thresholds, storage-oriented compression seeks to encode the original curve into a compact representation $z$ such that the reconstructed curve $\hat{C}$ can be faithfully recovered from $z$ under controlled geometric distortion. This process can be formulated as an encoding–decoding problem, where an encoder function $f_{\mathrm{enc}}$ maps the original curve $C$ to a latent representation $z = f_{\mathrm{enc}}(C)$, and a decoder function $f_{\mathrm{dec}}$ reconstructs the curve as $\hat{C} = f_{\mathrm{dec}}(z)$.
A key challenge in learning-based vector curve compression lies in the variable length and non-uniform sampling of vector curves. Geographic curves often exhibit irregular vertex spacing and highly variable geometric complexity, which makes it difficult to directly apply neural network models that require fixed-length inputs. To address this issue, the proposed framework adopts a unified curve representation through segmentation and resampling. Specifically, in the context of vector data, this point-level resampling utilizes an equidistant interpolation method to redistribute a fixed number of coordinate vertices evenly along the geometric path of each curve segment. This process transforms curves consisting of an arbitrary number of points into the consistent, fixed-dimensional vector representations required as the neural network inputs, while rigorously preserving their original geometric continuity and spatial characteristics. This enables the learning of compact encodings across curves with diverse shapes, point densities, and lengths.
To enhance the modeling of geometric structure, vector curves are represented using coordinate differences rather than absolute coordinates. Specifically, the curve is expressed as a sequence of incremental displacements between consecutive vertices, which emphasizes local geometric variation and reduces sensitivity to absolute position. This representation facilitates the learning of translation-invariant geometric patterns and supports efficient reconstruction by cumulatively restoring vertex positions from the decoded displacement sequence. Based on this formulation, vector curve compression is treated as a representation learning problem in which the encoder learns a compact latent representation that captures both local geometric details and global curve structure. The reconstruction quality is evaluated by comparing the original curve $C$ and the reconstructed curve $\hat{C}$ using the geometric error measures defined in Section 3.3. This formulation provides a unified foundation for the convolutional autoencoder-based compression framework described in the following sections.
2.2. Data Preprocessing
Vector spatial line data are commonly represented as ordered sequences of Cartesian coordinates, expressed as $\{(x_i, y_i)\}_{i=1}^{m}$, where each point $(x_i, y_i)$ records its spatial location along the curve. While absolute coordinates describe positional information, the geometric characteristics of a curve, such as direction changes and local shape variations, are more effectively captured by the differences between adjacent vertices. Therefore, the present study represents curve geometry using coordinate increments, i.e., incremental changes in Cartesian coordinates between consecutive vertices. Specifically, for each curve segment, the coordinate differences are computed between adjacent vertices. By recording the absolute coordinates of the first point and cumulatively summing the coordinate increments, the original curve geometry can be fully reconstructed. This representation emphasizes local geometric variation while reducing sensitivity to absolute position, making it well suited for learning-based compression. Based on this principle, coordinate increments are employed as feature vectors and input into the convolutional autoencoder model for training. Neural network models typically require inputs of fixed length and consistent structure. However, island boundary data exhibit substantial variability in curve length, shape complexity, and point distribution. To address these challenges and ensure data consistency, a preprocessing pipeline consisting of curve segmentation, curve resampling, and data normalization is applied, as described below.
From a representation learning perspective, this differential encoding transforms vector curves into locally structured geometric signals, enabling the model to focus on intrinsic shape characteristics rather than absolute spatial placement. This formulation establishes a geometric foundation for subsequent hierarchical feature learning, where local variations are first modeled and later integrated into higher-level representations.
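The differential encoding and its cumulative inverse can be illustrated with a minimal sketch. The function names `to_increments` and `from_increments` and the sample coordinates are hypothetical and chosen only for illustration; they are not taken from the original implementation.

```python
from itertools import accumulate

def to_increments(points):
    """Return the first vertex and the (dx, dy) steps between consecutive vertices."""
    start = points[0]
    deltas = [(x1 - x0, y1 - y0)
              for (x0, y0), (x1, y1) in zip(points, points[1:])]
    return start, deltas

def from_increments(start, deltas):
    """Rebuild absolute coordinates by cumulatively summing the increments."""
    xs = accumulate((dx for dx, _ in deltas), initial=start[0])
    ys = accumulate((dy for _, dy in deltas), initial=start[1])
    return list(zip(xs, ys))

# Hypothetical four-vertex curve used only as an example.
curve = [(10.0, 5.0), (12.0, 6.0), (13.0, 9.0), (11.0, 10.0)]
start, deltas = to_increments(curve)
restored = from_increments(start, deltas)

# Translating the curve changes the start vertex but not the increments.
shifted = [(x + 100.0, y - 50.0) for x, y in curve]
_, shifted_deltas = to_increments(shifted)
```

In this sketch only `start` depends on absolute position, which is why the increments alone can serve as translation-invariant input features.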
2.2.1. Curve Segmentation
In the context of geometric representation learning, segmentation serves not only as a means of input unification, but also as a mechanism to control the spatial scope over which local geometric patterns are learned. By decomposing long and heterogeneous curves into segments of comparable length, the model is encouraged to capture consistent local geometric variations within a bounded context, which facilitates stable convolutional feature extraction. Island boundary curves vary significantly in total length, and excessive variation in segment length can introduce large discrepancies in geometric complexity, thereby increasing the difficulty of neural network training. To reduce this effect, each boundary curve with total length $L$ is divided into multiple sub-segments with a target length $L_0$. In general, if $L > L_0$, the curve is initially divided into $k = \lfloor L / L_0 \rfloor$ sub-segments of length $L_0$, leaving a remaining portion of length $r = L - kL_0$. In this study, vector curves are represented as polylines composed of ordered vertices connected by straight line segments, which is the common format of most GIS vector line datasets. Consequently, curve length is computed as the cumulative Euclidean distance between consecutive vertices. Since the final sub-segment may not meet the target length, two cases are considered to handle the remainder:
If $r$ is below the merging threshold, the remaining portion is sufficiently short and is merged with the $k$-th segment. In this case, the adjusted length of the $k$-th segment becomes $L_0 + r$.
Otherwise, the remaining portion is retained as an additional segment, denoted as the $(k+1)$-th segment, with length $r$.
This segmentation strategy ensures that segment lengths remain relatively consistent while avoiding excessively short segments, thereby preserving the geometric characteristics of the original boundary curve.
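The remainder-handling rule can be sketched as follows. The merge threshold is not specified in this section, so the value used here (half of the target length, exposed as `merge_fraction`) is purely an assumption for illustration, as are the function names.

```python
import math

def polyline_length(points):
    """Cumulative Euclidean distance between consecutive vertices."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def segment_lengths(total_length, target_length, merge_fraction=0.5):
    """Split a curve of total_length into sub-segment lengths.

    merge_fraction is an ASSUMED threshold: a remainder shorter than
    merge_fraction * target_length is merged into the last full segment.
    """
    if total_length <= target_length:
        return [total_length]
    k = int(total_length // target_length)      # number of full segments
    remainder = total_length - k * target_length
    lengths = [target_length] * k
    if remainder == 0:
        return lengths
    if remainder < merge_fraction * target_length:
        lengths[-1] += remainder                # merge into the k-th segment
    else:
        lengths.append(remainder)               # keep as the (k+1)-th segment
    return lengths

short_tail = segment_lengths(110.0, 25.0)   # remainder 10 is merged
long_tail = segment_lengths(120.0, 25.0)    # remainder 20 is kept
```

Under this assumed threshold, no segment is shorter than half of the target length or longer than one and a half times the target length, except for curves that are shorter than the target length to begin with.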
It is acknowledged that fixed-length segmentation may potentially weaken the explicit modeling of long-range dependencies and global outline continuity across entire boundary curves. However, in the proposed framework, this limitation is mitigated at subsequent representation stages, where segment-level features are aggregated through global encoding layers to reconstruct holistic curve structures. As a result, segmentation functions as a controlled decomposition for local geometric learning, rather than a loss of global shape information. Regarding the segmentation strategy, it is important to note that the chosen segment length (and the subsequent equidistant resampling to a fixed number of vertices) is designed to be scale-invariant. Rather than relying on absolute geographic distances—which would be highly dependent on the unknown cartographic scale of the original data—the segmentation size is determined by the necessity to capture sufficient local geometric context. Specifically, the segment length is calibrated to match the receptive field of the CAE, ensuring that each input tensor contains meaningful topographical variations while avoiding excessive structural complexity. This allows the network to learn pure shape features and localized curvatures effectively, independent of the absolute spatial scale.
2.2.2. Curve Resampling
After segmentation, individual curve segments may still differ in the number of vertices they contain. Since neural networks require inputs of uniform size, resampling is applied to unify point distributions across segments. In addition, uneven vertex spacing within a segment can adversely affect model training. Resampling serves to transform irregularly sampled vector curves into uniformly parameterized geometric signals. By enforcing a consistent sampling scheme, the model is guided to learn shape characteristics based on intrinsic curve geometry rather than artifacts introduced by uneven vertex distributions. For a curve segment with length $l$, the segment is resampled into $N$ equal intervals with a sampling distance of $d = l / N$, producing $N + 1$ uniformly distributed vertices. The coordinate differences between the consecutive resampled vertices are then computed, resulting in a sequence of $N$ incremental vectors $(\Delta x_i, \Delta y_i)$, where $\Delta x_i = x_{i+1} - x_i$ and $\Delta y_i = y_{i+1} - y_i$ for $i = 1, \dots, N$. This procedure yields a fixed-length representation that preserves the overall geometric structure of each segment while providing a consistent parameterization suitable for convolutional feature extraction. The segmentation and resampling process is illustrated in Figure 1.
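A minimal sketch of equidistant resampling is given below, assuming linear interpolation along the polyline at equal arc-length spacing. The function name `resample_equidistant` and the L-shaped test segment are hypothetical; the original implementation may differ in detail.

```python
import math

def resample_equidistant(points, n_intervals):
    """Place n_intervals + 1 vertices at equal arc-length spacing along a polyline.

    Assumes at least two vertices and a non-zero total length.
    """
    # Cumulative arc length at each original vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    step = total / n_intervals

    out = []
    j = 0  # index of the original edge currently being walked
    for i in range(n_intervals + 1):
        s = min(i * step, total)
        # Advance to the original edge that contains arc length s.
        while j < len(points) - 2 and cum[j + 1] < s:
            j += 1
        edge = cum[j + 1] - cum[j]
        t = 0.0 if edge == 0 else (s - cum[j]) / edge
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

# Hypothetical L-shaped segment of length 8, resampled into 4 intervals.
segment = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]
resampled = resample_equidistant(segment, 4)
increments = [(x1 - x0, y1 - y0)
              for (x0, y0), (x1, y1) in zip(resampled, resampled[1:])]
```

With 4 intervals the sketch returns 5 vertices and 4 increment vectors, mirroring the relation between intervals, vertices, and increments described above.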
2.2.3. Data Normalization
In addition to improving numerical stability during training, normalization plays an important role in reducing scale-related bias and enhancing the model’s ability to learn geometry-driven patterns rather than magnitude-dependent variations. To improve the efficiency and stability of neural network training, the coordinate increment features are normalized prior to model input. Specifically, the coordinate differences are scaled to the range [0, 1] using min–max normalization, as expressed in Equation (1):

$$\Delta x_i' = \frac{\Delta x_i - \Delta x_{\min}}{\Delta x_{\max} - \Delta x_{\min}}, \qquad \Delta y_i' = \frac{\Delta y_i - \Delta y_{\min}}{\Delta y_{\max} - \Delta y_{\min}} \tag{1}$$

where $\Delta x_{\min}$ and $\Delta x_{\max}$ denote the minimum and maximum values of the x-coordinate increments within a segment, and $\Delta y_{\min}$ and $\Delta y_{\max}$ denote the corresponding values for the y-coordinate increments. By ensuring consistent feature scaling across curve segments, normalization facilitates more balanced representation learning and improves the comparability of geometric patterns across different segments.
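The per-segment min–max scaling and its inverse can be sketched as below. The function names are hypothetical, and the handling of a degenerate segment in which all increments along one axis are equal (mapped to 0.0 here) is an assumption not addressed in the text.

```python
def minmax_normalize(deltas):
    """Scale (dx, dy) increments of one segment to [0, 1] per axis.

    Returns the normalized increments and the four parameters
    (x_min, x_max, y_min, y_max) needed for the inverse transform.
    """
    xs = [dx for dx, _ in deltas]
    ys = [dy for _, dy in deltas]
    params = (min(xs), max(xs), min(ys), max(ys))

    def scale(v, lo, hi):
        # ASSUMPTION: a constant axis is mapped to 0.0 to avoid division by zero.
        return 0.0 if hi == lo else (v - lo) / (hi - lo)

    x_min, x_max, y_min, y_max = params
    normalized = [(scale(dx, x_min, x_max), scale(dy, y_min, y_max))
                  for dx, dy in deltas]
    return normalized, params

def minmax_denormalize(normalized, params):
    """Invert minmax_normalize using the stored per-segment parameters."""
    x_min, x_max, y_min, y_max = params
    return [(u * (x_max - x_min) + x_min, v * (y_max - y_min) + y_min)
            for u, v in normalized]

deltas = [(2.0, 1.0), (1.0, 3.0), (-2.0, 1.0)]
normalized, params = minmax_normalize(deltas)
recovered = minmax_denormalize(normalized, params)
```

Because the parameters are computed per segment, four values must be stored alongside each compressed segment so that the scaling can be undone after decoding.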
2.3. Convolutional Autoencoder-Based Compression Model
2.3.1. Convolutional Autoencoder
For vector curve data represented as ordered one-dimensional geometric sequences, effective compression requires modeling both local geometric variations and their spatial continuity along the curve. Convolutional neural networks [24] provide a natural mechanism for capturing such localized geometric patterns through shared-weight filtering over bounded neighborhoods. The general computational model of a CNN is expressed in Equation (2). In the context of coordinate-difference sequences, convolution operates as a localized geometric filter that aggregates directional and magnitude changes within a sliding window, enabling the network to learn patterns related to curvature, smoothness, and directional continuity.

$$y = f(w * x + b) \tag{2}$$

where $x$ denotes the input, $w$ denotes the convolution kernel, $b$ denotes the bias, $f$ denotes the activation function, and $y$ the output. The symbol $*$ denotes the convolution operation.
Through successive convolutional layers with appropriate strides, the encoder progressively aggregates local geometric information into more compact representations, effectively reducing representational dimensionality while preserving salient geometric structures.
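To make the strided filtering concrete, the following sketch applies a single one-dimensional filter followed by ReLU to a short sequence. It is a plain-Python illustration rather than the model's actual layer: it uses no padding and a single channel, and, as in most deep learning libraries, the "convolution" is computed as a cross-correlation (the kernel is not flipped). The kernel and input values are hypothetical.

```python
def conv1d_relu(x, w, b=0.0, stride=1):
    """Unpadded 1D cross-correlation followed by ReLU: y = max(0, w . x_window + b)."""
    k = len(w)
    out = []
    for start in range(0, len(x) - k + 1, stride):
        window = x[start:start + k]
        s = sum(wi * xi for wi, xi in zip(w, window)) + b
        out.append(max(0.0, s))
    return out

# Hypothetical increment sequence that rises and then falls.
x = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0]
# A difference-like kernel that responds to increasing values within the window.
w = [-1.0, 0.0, 1.0]
y = conv1d_relu(x, w, b=0.0, stride=2)
```

With an input of length 7, a kernel of size 3, and a stride of 2, the output has floor((7 - 3) / 2) + 1 = 3 values, which shows how a stride larger than one shortens the sequence at each layer.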
Autoencoders provide an unsupervised learning framework for dimensionality reduction and data reconstruction and have been widely adopted in representation learning [25]. An autoencoder consists of two main components: an encoder and a decoder. The encoder maps high-dimensional input data, represented in this study as coordinate-difference sequences, into a compact latent representation z, while the decoder reconstructs the input data from z. By minimizing the reconstruction error between the input and output, the autoencoder learns a low-dimensional representation that preserves the essential geometric characteristics of the original curve. In this formulation, the reconstruction error can be interpreted as a measure of geometric distortion, enabling the compression process to be optimized under an explicit fidelity constraint.
A convolutional autoencoder (CAE) extends this framework by employing convolutional layers in the encoder to exploit local spatial dependencies in the coordinate-difference sequences, and transposed convolutional layers in the decoder, to recover the original resolution. Compared with purely fully connected autoencoders, CAEs require fewer parameters and are more effective at preserving local geometric structures along the curve. However, convolutional layers alone are insufficient to perform global dimensionality reduction into a compact latent vector. Therefore, fully connected (FC) layers are introduced between the convolutional encoder and decoder to bridge this gap.
Specifically, the encoder applies one-dimensional convolutional layers to extract local geometric features from the input $(\Delta x, \Delta y)$ sequence. The resulting feature maps are then flattened and passed through one or more fully connected layers, which perform global feature aggregation and dimensionality compression, producing the latent vector z. This FC-based bottleneck explicitly controls the dimensionality of the latent space and serves as the core mechanism for curve compression. In the decoder, a symmetric set of fully connected layers first expands z back into a high-dimensional feature representation, which is subsequently reshaped and fed into transposed convolutional layers to reconstruct the coordinate-difference sequence $(\Delta \hat{x}, \Delta \hat{y})$.
The overall architecture of the proposed convolutional autoencoder is illustrated in Figure 2. Coordinate differences $(\Delta x, \Delta y)$ are first computed from the original curve and used as the network input. Convolutional layers extract local features, followed by fully connected layers that compress the representation into a low-dimensional latent vector z. The decoder mirrors this process by using fully connected layers to restore global structure and transposed convolutions to recover local details, yielding reconstructed differences $(\Delta \hat{x}, \Delta \hat{y})$ with the same dimensionality as the input. Finally, the original curve coordinates are recovered by cumulatively summing the reconstructed differences. If the discrepancy between $(\Delta x, \Delta y)$ and $(\Delta \hat{x}, \Delta \hat{y})$ remains within an acceptable threshold, the latent vector z can be regarded as a valid compressed representation of the original curve segment.
Overall, the proposed convolutional autoencoder establishes a hierarchical geometric representation framework, where local geometric variations are first captured through convolutional encoding and subsequently consolidated into a compact global representation, enabling efficient and high-fidelity vector curve compression.
2.3.2. Data Compression with a Convolutional Autoencoder
Convolutional neural networks (CNNs) are particularly effective in capturing local structural patterns while maintaining parameter efficiency, making them well suited for modeling sequential geometric data. Depending on the dimensionality of the input, convolutional operations can be implemented in different forms. In the present study, the training samples consist of ordered sequences of coordinate increments derived from segmented and resampled curves, which naturally correspond to one-dimensional signals. Accordingly, a one-dimensional convolutional autoencoder (CAE) is adopted to perform vector curve compression. A key challenge in CAE-based compression lies in controlling the dimensionality of the latent representation using convolutional layers alone. To explicitly regulate the compression ratio, a fully connected layer is introduced after the final convolutional layer in the encoder to map high-dimensional feature maps into a compact latent vector. To maintain architectural symmetry, a corresponding fully connected layer is placed at the beginning of the decoder. The overall architecture of the proposed model is illustrated in
Figure 3. The encoder consists of two one-dimensional convolutional layers followed by a fully connected layer, whereas the decoder mirrors this structure using a fully connected layer and two transposed convolutional layers. The encoder outputs a low-dimensional latent vector that serves as the compressed representation of the curve segment, while the decoder reconstructs the coordinate increment sequence from this latent encoding. Since the input features are normalized to the range [0, 1], the Rectified Linear Unit (ReLU) activation function is employed in all hidden layers. Reconstruction accuracy is measured using the root mean square error (RMSE), which is adopted as the loss function during training. The proposed CAE operates on generic curve representations derived from coordinate sequences and does not depend on semantic attributes of specific geographic objects, making it applicable to a wide range of vector line data.
To ensure sufficient overlap between convolutional receptive fields while preserving feature continuity, the stride of both the convolutional and transposed convolutional layers is set to half of the kernel size. Empirical experiments demonstrate that a convolution kernel size of 1 × 7 yields optimal performance (Section 5.1), while a segment length of 25 km provides the best trade-off between geometric consistency and compression efficiency (Section 5.2). The first convolutional layer contains 16 channels and the second 32 channels, with the decoder adopting a symmetric configuration. The input feature vector has a fixed length of 200, which is compressed into a latent vector of dimension n, where n can be adjusted to achieve different compression ratios.
After training, the compressed representation of a curve dataset consists of four components: (1) the latent vectors produced by the encoder, (2) the starting coordinates of each curve segment, (3) the four normalization parameters $\Delta x_{\min}$, $\Delta x_{\max}$, $\Delta y_{\min}$, and $\Delta y_{\max}$ for inverse transformation, and (4) the trained model parameters (weights and biases). The model parameters are shared across all curve segments and are therefore stored only once. Although they formally belong to the compressed dataset, their contribution to the overall storage cost becomes negligible when the number of segments is sufficiently large. In practical experiments, the model size ranges from several tens to a few hundreds of kilobytes, which is minor compared with the size of the original vector data. The compression ratio (CR) is defined as the ratio between the size of the original curve data and that of the compressed data, as shown in Equation (3):

$$\mathrm{CR} = \frac{2MN}{M(n + 6) + W} \approx \frac{2N}{n + 6} \tag{3}$$

where $\mathrm{CR}$ denotes the compression ratio, $M$ is the number of curve segments, $N$ is the average number of sampled vertices per segment, $n$ is the dimensionality of the latent vector, the constant 6 in the denominator represents the overhead of storing starting coordinates (2 values) and normalization metadata (4 values) for each segment, and $W$ represents the total number of model parameters. When $M$ is sufficiently large, the influence of $W$ can be ignored, leading to the simplified expression shown in the second part of Equation (3).
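The storage accounting can be checked numerically with a short sketch. It assumes the compression ratio is computed as the number of stored values in the original data (two coordinates per sampled vertex) divided by the number of stored values after compression (latent vector plus per-segment overhead, plus the shared model parameters); the function name and the example values of M and W are hypothetical.

```python
def compression_ratio(n_segments, n_vertices, latent_dim, n_params=0, overhead=6):
    """Original value count divided by compressed value count.

    overhead counts the per-segment extras: 2 starting coordinates
    plus 4 normalization parameters.
    """
    original = 2 * n_segments * n_vertices
    compressed = n_segments * (latent_dim + overhead) + n_params
    return original / compressed

# Hypothetical setting: 100 sampled vertices per segment, latent dimension 60,
# and a model with 50,000 parameters shared across all segments.
small_dataset = compression_ratio(1_000, 100, 60, n_params=50_000)
large_dataset = compression_ratio(1_000_000, 100, 60, n_params=50_000)
limit = 2 * 100 / (60 + 6)   # simplified expression, model size ignored
no_overhead = compression_ratio(1, 100, 60, overhead=0)
```

For the small dataset the shared model parameters noticeably lower the ratio, whereas for the large dataset the result is close to the simplified expression. Setting `overhead=0` reproduces the plain ratio of input length to latent length, which is 200 / 60 ≈ 3.33 for a latent dimension of 60.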
2.3.3. Restoration of Island Boundary Data
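Curve reconstruction begins by reversing the normalization applied during preprocessing, thereby restoring the coordinate increments to their original numerical ranges. The inverse normalization is defined in Equation (4):

$$\Delta \hat{x}_i = \Delta \hat{x}_i' \left(\Delta x_{\max} - \Delta x_{\min}\right) + \Delta x_{\min}, \qquad \Delta \hat{y}_i = \Delta \hat{y}_i' \left(\Delta y_{\max} - \Delta y_{\min}\right) + \Delta y_{\min} \tag{4}$$

where $\Delta \hat{x}_i'$ and $\Delta \hat{y}_i'$ are the normalized increments output by the decoder, $\Delta x_{\min}$ and $\Delta x_{\max}$ denote the minimum and maximum x-coordinate increments within a curve segment that were stored during normalization, and $\Delta y_{\min}$ and $\Delta y_{\max}$ represent the corresponding values for the y-coordinate increments.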
Due to reconstruction errors introduced during encoding and decoding, the cumulative endpoint of a reconstructed segment may not coincide exactly with that of the original curve, leading to boundary non-closure. This issue is particularly critical for island boundaries, which must satisfy strict geometric closure constraints. To address this problem, a closure error correction procedure is applied. Let $(\Delta x_i, \Delta y_i)$ denote the original coordinate increments and $(\Delta \hat{x}_i, \Delta \hat{y}_i)$ the reconstructed increments, with $i = 1, \dots, N$. The cumulative closure errors in the x- and y-directions are computed as $e_x = \sum_{i=1}^{N} \Delta \hat{x}_i - \sum_{i=1}^{N} \Delta x_i$ and $e_y = \sum_{i=1}^{N} \Delta \hat{y}_i - \sum_{i=1}^{N} \Delta y_i$. To enforce closure, these cumulative errors are evenly distributed across all increments, yielding the correction terms $\delta_x = e_x / N$ and $\delta_y = e_y / N$. The corrected increments are then obtained as $\Delta \tilde{x}_i = \Delta \hat{x}_i - \delta_x$ and $\Delta \tilde{y}_i = \Delta \hat{y}_i - \delta_y$. Finally, the corrected coordinate increments are scaled by the corresponding segment length $l$, and the absolute coordinates of each point are reconstructed sequentially starting from the stored initial coordinate of the segment.
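The closure correction can be sketched as follows, assuming the cumulative error is measured against the known net displacement of the segment and subtracted in equal shares from every reconstructed increment. The function name and the sample increments are hypothetical, and the example uses a closed ring, for which the net displacement is zero.

```python
def correct_closure(reconstructed, target_dx=0.0, target_dy=0.0):
    """Distribute the cumulative endpoint error evenly over all increments.

    target_dx and target_dy are the net displacement the increments should
    sum to (ASSUMED known at decoding time, e.g. zero for a closed ring).
    """
    n = len(reconstructed)
    err_x = sum(dx for dx, _ in reconstructed) - target_dx
    err_y = sum(dy for _, dy in reconstructed) - target_dy
    return [(dx - err_x / n, dy - err_y / n) for dx, dy in reconstructed]

# Hypothetical decoded increments of a ring that fails to close by (1.0, 0.5).
decoded = [(1.0, 0.0), (0.5, 1.0), (-1.0, 0.5), (0.5, -1.0)]
corrected = correct_closure(decoded)
sum_x = sum(dx for dx, _ in corrected)
sum_y = sum(dy for _, dy in corrected)
```

After correction the increments sum to the target displacement, so cumulative summation from the stored starting coordinate ends exactly at the expected endpoint. Each increment is shifted by the same small amount, which spreads the error along the segment rather than concentrating it at the final vertex.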
Together, data preprocessing, convolutional autoencoder-based compression, and geometric restoration form a complete compression–decompression framework. Within this framework, the encoder functions as a compression module that transforms long coordinate sequences into compact latent vectors, which can be stored as independent compressed files. The decoder serves as a decompression module that reconstructs vector spatial data from these encodings. Although the encoder and decoder are jointly trained, they can be deployed independently in practical applications, enabling efficient storage of island boundary data and reliable reconstruction when needed.
4. Experimental Results
4.1. Model Accuracy
To illustrate model performance, a compression ratio of 3.33 was selected as an example.
Figure 5 shows the variation in model loss and mean point displacement under this compression ratio. The X-axis represents the number of training epochs, where each epoch corresponds to one complete forward and backward propagation over the training dataset. The right Y-axis represents model loss. The results show that the loss decreases rapidly within the first 20 epochs, followed by oscillations around a stable value, indicating overall convergence. Training and testing losses remain close, suggesting that the model does not suffer from overfitting. The left Y-axis represents the mean point displacement, measured in meters, which reflects the positional accuracy of the model in practical applications. Similarly to the loss curve, the mean point displacement decreases sharply during the first 20 epochs and then gradually declines with minor fluctuations between epochs 20 and 200.
4.2. Impact of Compression Ratios
After training, the learned model parameters were saved and used to reconstruct island boundary segments for quantitative accuracy evaluation. To investigate the relationship between compression ratio and reconstruction accuracy, the dimensionality of the latent feature vector was set to 100, 80, 60, 40, and 20, corresponding to compression ratios of 2, 2.5, 3.33, 5, and 10, respectively. Here, the compression ratio is defined as the ratio between the original input vector length and the latent vector length. For each compression setting, the decoder was applied to reconstruct the compressed features, and the mean positional deviation was computed. The reconstruction results obtained using the best-performing model parameters are summarized in
Table 1.
As shown in
Table 1, when the compression ratio is set to 2, 2.5, or 3.33, the mean positional deviation remains within the range of approximately 40–50 m. In contrast, further increasing the compression level leads to a noticeable degradation in reconstruction accuracy. At a compression ratio of 5, the mean displacement increases to 62.107 m, and at a compression ratio of 10, it rises sharply to 117.831 m. These results indicate that excessive reduction of the latent feature dimension significantly weakens the model’s ability to preserve fine-scale geometric details, resulting in increased positional distortion.
To provide a qualitative assessment of reconstruction performance,
Figure 6 presents the reconstructed island boundary curves at a map scale of 1:100,000 under different compression ratios. At a compression ratio of 2, the reconstructed curves exhibit only minor deviations from the original boundaries, which are visually negligible. At a compression ratio of 3.33, local deviations become more apparent, yet the overall curve shape remains visually consistent with the original. In contrast, at a compression ratio of 10, pronounced geometric distortions can be observed, further confirming the decline in reconstruction fidelity at high compression levels. These visual results are consistent with the quantitative accuracy analysis and demonstrate the trade-off between compression efficiency and geometric accuracy.
4.3. Comparative Results and Discussion
To evaluate the performance of different curve compression strategies, the part of the island boundary dataset described in
Section 3.1 was compressed using four methods: the proposed convolutional autoencoder (CAE), the Fourier series-based method (FS), the fully connected autoencoder (FCA), and the classical Douglas–Peucker (DP) algorithm. Reconstruction accuracy under varying compression ratios is quantitatively compared in
Figure 7 using the mean positional deviation metric defined in
Section 3.3.
The results indicate that at lower compression levels (CR = 2 and 2.5), both the Fourier series method and the DP algorithm achieve slightly lower mean positional deviation than the convolutional autoencoder. For the FS method, this behavior can be attributed to the strong representation capability of low-frequency Fourier components for smooth and slowly varying curves; for the DP algorithm, retaining a relatively large subset of original vertices naturally preserves the geometric outline with high fidelity. However, as the compression ratio increases (CR = 3.33 and above), the reconstruction accuracy of both the FS and DP methods deteriorates rapidly. The DP algorithm, in particular, exhibits a steep increase in positional deviation, reaching nearly 400 m at CR = 10, making it the least effective method under severe compression. This severe degradation occurs because DP relies on point decimation; at extreme compression ratios, discarding the vast majority of vertices forces the algorithm to bridge large gaps with long straight-line segments, leading to significant geometric distortion in non-feature areas. In contrast, the convolutional autoencoder maintains relatively stable performance. Under these higher-compression scenarios, the CAE consistently outperforms the other three approaches, demonstrating superior robustness and global shape awareness. Across all tested compression ratios, the CAE also achieves lower reconstruction errors than the FCA, confirming the effectiveness of local convolution operations in extracting spatial curve features.
Figure 8 provides a visual comparison of reconstructed island boundary curves at compression ratios of 2, 3.33, and 10, displayed at a map scale of 1:50,000. Each row compares the four methods under the same compression ratio, while each column illustrates the effect of different compression levels for the same geographic region. The original curve is shown in black, while reconstructions generated by the convolutional autoencoder (CAE), Fourier series method (FS), fully connected autoencoder (FCA), and the Douglas–Peucker (DP) algorithm are shown in red, green, blue, and yellow, respectively. To highlight structural differences, three representative local geometric features were selected for visualization: a prominent cape with rapid directional change (a), a smoothly varying coastline (b), and a deep concave bay (c).
At a lower compression ratio of 2, all four methods are able to reconstruct island boundaries with relatively small deviations from the original curves, and visual differences are generally subtle. The DP algorithm, in particular, tightly hugs the original boundary since a sufficient number of critical vertices are retained. However, as the compression ratio increases to 3.33 and further to 10, reconstruction errors become increasingly pronounced, revealing the distinct characteristics of each compression strategy.
Under high compression (CR = 10), the convolutional autoencoder (CAE) consistently produces reconstructed curves that remain closest to the original geometry across all curve types. This advantage is evident not only for flat segments but also for concave and convex structures, which contain more complex local geometric variations. In stark contrast, the DP algorithm exhibits severe geometric degradation. Because DP relies strictly on point decimation and linear interpolation, forcing it into a high compression ratio results in aggressive pruning of essential shape descriptors. This is most prominently illustrated in
Figure 8c, where the DP algorithm completely loses the geometric outline of the deep bay by bridging it with a single straight line, and in
Figure 8a, where it abruptly truncates the cape.
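The bridging behaviour described above can be reproduced with a minimal recursive Douglas–Peucker sketch for an open polyline. The toy "bay" coordinates and the two tolerance values are illustrative assumptions. How the tolerance was tuned in the experiments to reach a given compression ratio is not stated here, so this sketch only demonstrates the decimation mechanism.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # degenerate baseline: fall back to point distance
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Keep the farthest vertex if it exceeds the tolerance, then recurse."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    idx, dmax = max(
        ((i, perpendicular_distance(points[i], a, b))
         for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if dmax <= tolerance:
        return [a, b]  # every interior vertex is dropped
    left = douglas_peucker(points[:idx + 1], tolerance)
    right = douglas_peucker(points[idx:], tolerance)
    return left[:-1] + right  # avoid duplicating the split vertex


# Toy coastline with a bay that is 3 units deep.
bay = [(0, 0), (1, 0), (2, -3), (3, 0), (4, 0)]
print(douglas_peucker(bay, tolerance=1.0))
print(douglas_peucker(bay, tolerance=5.0))
```

With the smaller tolerance the vertex at the bottom of the bay survives. With the larger tolerance only the two endpoints remain, so the bay is replaced by a single straight segment, which mirrors the failure mode reported for Figure 8c.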
The fully connected autoencoder (FCA) also exhibits larger deviations, especially in regions with rapid directional changes, suggesting limited effectiveness in preserving local geometric continuity. Notably, at lower compression ratios (e.g., CR = 2), the Fourier series (FS) method remains highly competitive. This is likely because the global, low-frequency shape of many island segments can be effectively captured by the first few Fourier coefficients, whereas the CAE might be learning more complex and potentially redundant features at these lower compression levels. However, as the compression becomes more aggressive (CR = 10), the FS method shows substantial degradation. When reconstructing concave and convex shapes, the higher-frequency components required to represent local curvature are progressively discarded in the Fourier approach, leading to an over-smoothed, inward-shrinking effect seen in panels (a) and (c).
These visual observations consistently validate the quantitative metrics presented earlier, highlighting the superiority of the convolutional autoencoder in preserving complex, high-frequency geographic details through localized convolutional operations and shared parameters, making it highly robust for extreme compression of intricate vector curves.
5. Analysis and Discussion
5.1. Geometric Scale Sensitivity of the Convolutional Autoencoder
In convolutional neural networks (CNNs), overall performance is jointly determined by network architecture, layer configuration, and hyperparameter selection. Among these components, the convolutional kernel plays a central role in feature extraction by defining the size of the local receptive field. Through element-wise multiplication and summation with the input signal, convolutional kernels capture localized patterns and their sliding operation aggregates such local features into higher-level representations. For one-dimensional sequential data, such as the coordinate increment sequences used in this study, kernel size determines the geometric scale at which local variations along the curve are perceived by the model. From a geographic perspective, this process can be interpreted as analogous to the progressive scanning and abstraction of spatial features in cartographic generalization, where local geometric details are perceived within a limited contextual window. If the receptive field is too small, the kernel may fail to capture meaningful shape patterns beyond minor point-to-point fluctuations; if it is too large, fine-scale geometric variations may be overly smoothed or obscured. Therefore, an appropriate kernel size is essential for balancing sensitivity to local geometry and robustness to noise.
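The effect of kernel size on a one-dimensional sequence can be illustrated with a toy example. The sketch below uses a fixed averaging kernel as a stand-in for learned convolution weights, and the synthetic increment sequence (a slow bend plus alternating jitter) is an assumption made for illustration only.

```python
import numpy as np


def slide_kernel(signal, kernel):
    """Valid-mode sliding dot product, the core operation of a 1D convolution layer."""
    return np.correlate(np.asarray(signal, dtype=float),
                        np.asarray(kernel, dtype=float), mode="valid")


# Synthetic increment sequence: a gentle bend plus point-to-point jitter.
n = 40
trend = np.sin(np.linspace(0.0, np.pi, n))
jitter = 0.2 * (-1.0) ** np.arange(n)
signal = trend + jitter

for k in (3, 7, 13):
    box = np.full(k, 1.0 / k)  # fixed averaging kernel (not learned weights)
    out = slide_kernel(signal, box)
    residual = np.abs(slide_kernel(jitter, box)).max()
    print(f"kernel={k:2d}  output_length={out.shape[0]}  residual_jitter={residual:.4f}")
```

Larger kernels see more vertices per window and suppress more of the point-to-point fluctuation, which is the smoothing effect discussed above. A learned kernel can of course respond to patterns other than the local mean.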
To examine the influence of kernel size on compression performance, a series of experiments were conducted using kernel sizes of {4, 7, 9, 11, 13} under a fixed compression ratio of 3.33. Reconstruction accuracy was evaluated using the mean positional deviation metric defined in
Section 3.3. The results are summarized in
Table 2. Among the tested configurations, a kernel size of 1 × 7 achieved the lowest mean positional deviation, indicating the most effective balance between local feature extraction and geometric continuity preservation under the current experimental setup.
It should be emphasized that this result does not imply a universally optimal kernel size for all types of vector curve data. Rather, the observed performance reflects an empirical alignment between the kernel receptive field and the characteristic geometric scale of the input curves. Specifically, a kernel size of 7 may correspond to capturing the typical length of a salient bend or geometric primitive in a resampled curve segment. For other vector line datasets with different geometric complexities, sampling densities, or segmentation strategies, the optimal kernel size may vary accordingly. Nonetheless, these results demonstrate that moderate kernel sizes, which capture short-range geometric dependencies without excessively expanding the receptive field, are generally well suited for learning compact representations of vector curve data.
5.2. Effect of Segment Length on Compression Stability and Accuracy
Segment length determines the geometric extent of curve information presented to the autoencoder at each training instance and therefore has a direct impact on representation learning and reconstruction accuracy. From a modeling perspective, segment length controls the balance between geometric completeness and structural variability: shorter segments emphasize fine-scale local details, whereas longer segments incorporate broader contextual shape information but also introduce greater morphological complexity. To investigate the influence of segment length on compression performance, four segment lengths (15 km, 25 km, 50 km, and 75 km) were evaluated under a fixed compression ratio of 3.33. Reconstruction accuracy was assessed using the mean positional deviation metric described in
Section 3.3. The experimental results are summarized in
Table 3. As shown in
Table 3, reconstruction accuracy improves as the segment length increases from 15 km to 25 km, indicating that very short segments may not contain sufficient geometric context for the autoencoder to effectively learn characteristic curve patterns. When the segment length is further increased to 50 km and 75 km, however, reconstruction accuracy deteriorates noticeably. This decline suggests that overly long segments introduce excessive geometric variability, making it more difficult for the model to encode diverse shape patterns into a fixed-length latent representation without loss of detail.
These results indicate the existence of an intermediate segment length that provides an effective trade-off between geometric representativeness and model learnability. In the present experimental configuration, a segment length of approximately 25 km yields the lowest mean positional deviation. This distance likely represents the optimal spatial scale that encapsulates a complete and representative geometric feature without introducing excessive structural complexity that would overwhelm the autoencoder’s capacity. It should be noted that this value is not intended to serve as a universal optimal length. Instead, it reflects a dataset- and model-dependent balance between local geometric continuity and global shape complexity. For other types of vector curves or different sampling densities, the optimal segment length may vary accordingly. Nevertheless, the observed trend highlights the importance of aligning segment length with the representational capacity of the autoencoder when designing compression frameworks for vector curve data.
5.3. Compression Ratio Recommendations for Multi-Scale Island Visualization
Reconstruction of vector curves using the decoder inevitably introduces geometric deviations from the original data, and these deviations generally increase as the compression ratio rises. As demonstrated in
Table 1, higher compression ratios lead to larger mean positional deviations. In practical cartographic applications, however, the acceptability of such deviations depends not only on their absolute magnitude in ground units but also on the target map scale and the limits of human visual perception.
In cartography and map visualization, it is widely recognized that positional deviations smaller than approximately 0.2 mm on a printed or displayed map are difficult for human observers to reliably perceive under typical viewing conditions. This empirical threshold is commonly adopted as a practical criterion for evaluating geometric accuracy in multi-scale mapping. By converting this map-space tolerance into ground-space distances for different map scales, acceptable deviation thresholds can be derived and used to guide compression ratio selection.
Table 4 summarizes the recommended compression ratios for vector curve data at different map scales based on this visual discernibility criterion. For large-scale maps (1:100 K and finer), even relatively small positional deviations in ground space may exceed visual tolerance, indicating that aggressive compression is not suitable in such contexts. At medium-to-small scales, however, the allowable ground displacement increases substantially, making higher compression ratios feasible without introducing perceptible visual distortion. Specifically, for map scales around 1:250 K, a compression ratio of approximately 3.33 satisfies the visual accuracy requirement, while at 1:500 K the compression ratio can be further increased to about 5. For small-scale representations such as 1:1 M, compression ratios as high as 10 remain visually acceptable. These results demonstrate that compression ratio selection should be scale-dependent and that substantial storage savings can be achieved at smaller map scales without compromising visual fidelity.
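The scale-dependent selection rule can be written as a small calculation: convert the 0.2 mm map tolerance to a ground distance and keep the highest compression ratio whose mean positional deviation stays within it. The sketch below uses only the deviations quoted in the text (42.41 m at CR = 3.33, 62.107 m at CR = 5, and 117.831 m at CR = 10). The entries for CR = 2 and 2.5 are omitted because only their approximate range (40–50 m) is given. The function names are hypothetical.

```python
def ground_tolerance_m(scale_denominator, map_tolerance_mm=0.2):
    """Ground distance (m) corresponding to a map-space tolerance (mm)."""
    return scale_denominator * map_tolerance_mm / 1000.0


# Mean positional deviations (m) quoted in the text, keyed by compression ratio.
REPORTED_DEVIATION_M = {3.33: 42.41, 5.0: 62.107, 10.0: 117.831}


def highest_acceptable_ratio(scale_denominator, deviations=REPORTED_DEVIATION_M):
    """Largest listed ratio whose deviation is within the visual tolerance, else None."""
    limit = ground_tolerance_m(scale_denominator)
    acceptable = [cr for cr, dev in deviations.items() if dev <= limit]
    return max(acceptable) if acceptable else None


for denom in (100_000, 250_000, 500_000, 1_000_000):
    print(f"1:{denom:,}", ground_tolerance_m(denom), highest_acceptable_ratio(denom))
```

At 1:100,000 the tolerance is 20 m, which is below every reported deviation, so no listed ratio qualifies. The remaining scales reproduce the recommendations of 3.33, 5, and 10 stated above.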
It should be emphasized that the recommended compression ratios reported here are derived from empirical experiments under the current model configuration and dataset characteristics. While the specific numerical values may vary for different vector curve types or modeling strategies, the underlying principle—that compression decisions should be jointly guided by reconstruction accuracy and target visualization scale—remains generally applicable. This scale-aware compression strategy provides a practical framework for integrating deep learning-based vector data compression into multi-scale geographic information systems.
5.4. Performance Comparison of Different Models
To comprehensively evaluate the effectiveness of the proposed 1D CAE-based compression framework, the model is compared against three baseline methods: a fully connected autoencoder (FCA), a traditional curve-fitting method based on Fourier Series (FS), and the classical Douglas–Peucker (DP) algorithm. To ensure a fair and representative comparison, the experimental results are evaluated at a specific compression ratio of 3.33 and summarized in
Table 5.
Positional and spatial accuracy: The proposed CAE model demonstrates superior performance in spatial fidelity, achieving the lowest positional deviation (PD = 42.41 m) and the highest intersection over union (IoU = 0.9991). In contrast, the classical DP algorithm exhibits the highest positional deviation (PD = 81.98 m), as discarding a large number of vertices introduces significant point-to-line projection errors in non-feature areas. Furthermore, while the DP algorithm achieves the lowest relative area error (RAE = 0.0014%) by strictly retaining critical structural vertices that define the macro-boundary of the closed islands, the CAE remains highly competitive (RAE = 0.0067%) and significantly outperforms the other learning-based methods. This confirms that 1D convolutions effectively exploit the local spatial correlations of adjacent vertices, thereby preserving the overall spatial extent much better than the FCA, which lacks local spatial inductive bias, and the FS, which tends to over-smooth local details.
Morphological and perimeter features: Regarding morphological preservation, the traditional DP algorithm yields the lowest curvature change (CC = 4.60 × 10⁻⁵) and relative perimeter error (RPE = 0.7576%). This is because DP simplifies boundaries into straight-line segments between critical nodes, naturally minimizing complex local curvature variations and preserving the discrete polygonal perimeter. Among the remaining spectral and learning-based approaches, the traditional FS method performs well in preserving curve smoothness due to its inherent characteristic of fitting global outlines in the frequency domain, yielding a low curvature change (CC = 2.06 × 10⁻⁴) and relative perimeter error (RPE = 0.9538%). However, the proposed CAE remains highly competitive in these metrics (RPE = 0.9919%, CC = 3.22 × 10⁻⁴) and is significantly superior to the FCA. The FCA performs the worst in preserving geometric shape (RPE = 1.8345%, CC = 3.52 × 10⁻⁴), primarily because flattening the data for fully connected layers disrupts the spatial sequential continuity of the 1D coordinate sequence, leading to unnatural geometric jitters during reconstruction.
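For readers who want to reproduce area- and perimeter-based comparisons of this kind, the sketch below computes a relative area error and a relative perimeter error for closed rings. It assumes the common definitions (absolute difference divided by the original value). The exact formulas used for RAE and RPE in Table 5 may differ, and the two toy rings are purely illustrative.

```python
import math


def polygon_area(ring):
    """Shoelace area of a closed ring [(x, y), ...] without a repeated last vertex."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def polygon_perimeter(ring):
    """Sum of edge lengths, including the closing edge back to the first vertex."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))


def relative_error(reference, estimate):
    """Absolute difference relative to the reference value (assumed definition)."""
    return abs(estimate - reference) / reference


original = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
reconstructed = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.9), (0.0, 1.9)]

rae = relative_error(polygon_area(original), polygon_area(reconstructed))
rpe = relative_error(polygon_perimeter(original), polygon_perimeter(reconstructed))
print(f"RAE = {100 * rae:.2f}%  RPE = {100 * rpe:.2f}%")
```

An intersection-over-union score additionally requires polygon overlay operations, which are usually delegated to a geometry library and are therefore left out of this sketch.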
Computational efficiency: Regarding computational cost, traditional algorithmic and spectral methods such as DP and FS do not rely on a trained neural network and therefore incur no training time, though they may incur different processing overheads during execution (e.g., complex frequency-domain transformations for FS). Among the deep learning models, the FCA trains slightly faster (51.94 s) because it relies on simple matrix multiplications. However, the CAE (57.32 s) achieves a substantial improvement in reconstruction accuracy at a modest additional training cost. Overall, the results indicate that the proposed CAE achieves the best overall trade-off between geometric fidelity, spatial accuracy, and computational efficiency for vector curve compression.
6. Conclusions and Future Work
This study proposes a convolutional autoencoder-based framework for the compression of vector curve data and demonstrates its effectiveness through systematic experiments on complex vector line data. By transforming coordinate sequences into coordinate-increment representations, the proposed encoder learns compact latent vectors that preserve essential geometric characteristics, while the decoder reconstructs curves with high positional fidelity. Experimental results confirm that the proposed approach achieves a favorable balance between compression efficiency and reconstruction accuracy. The trained encoder–decoder parameters are shared and distributed across a large number of curve segments and therefore do not scale with dataset size, making the proposed approach suitable for storage-oriented compression scenarios.
Comparative analyses with the classical Douglas–Peucker (DP) algorithm, Fourier series-based (FS) compression, and fully connected autoencoders (FCA) indicate that the CAE provides the most robust performance across a wide range of compression ratios. While traditional heuristic methods like the DP algorithm achieve high fidelity at low compression levels by directly retaining critical vertices, their strict reliance on point decimation leads to severe geometric distortion and structural degradation under aggressive compression. On the other hand, while frequency-domain methods may achieve a slightly higher accuracy at relatively low compression levels, the proposed CAE exhibits clear advantages under moderate-to-high compression, where nonlinear local geometric features become more difficult to preserve using traditional parametric representations. Furthermore, by explicitly exploiting the sequential nature of boundary coordinates, the proposed architecture avoids the substantial computational overhead and structural complexity associated with Graph Neural Networks (GNNs) or Transformer-based models. The results further demonstrate that convolutional structures are more effective than fully connected architectures in capturing local geometric continuity in vector curve data.
Through a series of sensitivity analyses, this study highlights the importance of scale alignment between model design, data representation, and application requirements. The experiments reveal that both the receptive field of convolutional kernels and the geometric extent of curve segments significantly influence reconstruction performance, and that intermediate scales yield the most effective trade-offs between representational capacity and model stability. In addition, by linking reconstruction accuracy to cartographic visual tolerance, a scale-aware compression strategy is established, providing practical guidance for selecting compression ratios in multi-scale map visualization scenarios. The proposed method is particularly suitable for small- and medium-scale mapping applications, where substantial data reduction can be achieved without perceptible visual degradation.
Despite these promising results, several limitations remain. The experimental evaluation primarily focuses on a single category of vector curves (i.e., closed island boundaries), and the current framework emphasizes geometric reconstruction without explicitly modeling topological consistency or semantic attributes. Future work will extend the proposed approach to a broader range of vector datasets, including transportation networks and hydrographic features, and will investigate topology-aware constraints and multi-task learning strategies to further improve reconstruction robustness and generalization. Furthermore, integrating adaptive segmentation and scale-aware model configurations represents a promising direction for enhancing the flexibility of deep learning-based vector data compression in real-world geographic information systems.