A Convolutional Autoencoder-Based Method for Vector Curve Data Compression

Zhang, Shuo; Liu, Pengcheng; Ma, Hongran; Guo, Mingwu

doi:10.3390/ijgi15040164

¹ Hubei Province Key Laboratory for Geographical Process Analysis & Simulation, Central China Normal University, Wuhan 430079, China

² College of Urban and Environmental Sciences, Central China Normal University, Wuhan 430079, China

³ School of Resource and Environmental Sciences, Wuhan University, Wuhan 430079, China

⁴ Geomatics Institute, Wuhan 430022, China

* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2026, 15(4), 164; https://doi.org/10.3390/ijgi15040164

Submission received: 18 December 2025 / Revised: 5 April 2026 / Accepted: 7 April 2026 / Published: 11 April 2026

Abstract

(1) Background: Curve data compression plays a critical role in efficient storage, transmission, and multi-scale visualization of vector spatial data, especially for complex geographic boundaries. Achieving high compression efficiency while preserving geometric fidelity remains a challenging task. (2) Methods: This study proposes a vector curve compression framework based on a convolutional autoencoder. Curve data are segmented and resampled to unify network input, after which coordinate-difference sequences are encoded into low-dimensional latent vectors through convolutional layers and reconstructed via a symmetric decoder. (3) Results: Experiments conducted on a global island boundary dataset demonstrate that the proposed method achieves effective data reduction with stable reconstruction accuracy. Specifically, compared with the classical Douglas–Peucker (DP) algorithm, Fourier series (FS) methods, and fully connected autoencoders (FCAs), the 1D CAE exhibits superior and more robust reconstruction performance, especially under high compression ratios. It achieves the lowest positional deviation (PD = 42.41) and the highest spatial fidelity (IoU = 0.9991, with a relative area error of only 0.0067%), while maintaining high computational efficiency (57.32 s). Sensitivity analyses reveal that a convolution kernel size of 1 × 7 and a segment length of 25 km yield the optimal trade-off between representational capacity and model stability. (4) Conclusions: The proposed method enables efficient vector curve compression and reliable coastline reconstruction, and is particularly suitable for small- and medium-scale cartographic applications up to a map scale of 1:250 K.

Keywords:

curve data compression; convolutional autoencoder; vector curve reconstruction; geometric feature extraction; vector spatial data

1. Introduction

Vector spatial data constitute a fundamental component of geographic information systems (GISs), typically represented by point, line, and polygon features. From a geometric perspective, both line and polygon features can be expressed as ordered coordinate sequences, with polygons essentially forming closed curves. Together, line and polygon data constitute a substantial proportion of spatial datasets and are often characterized by high vertex density and complex geometric structures. With the continuous growth of large-scale geographic databases, the storage, management, and transmission of vector line data have become increasingly challenging under constraints of limited storage capacity and network bandwidth. Therefore, efficient compression of vector curve data, while preserving essential geometric characteristics, has become a critical issue in geographic information science. Traditionally, the reduction of vector data volume has been closely associated with multi-scale cartographic representation. In map production, the same geographic feature must be represented at varying levels of detail depending on map scale, and geometric simplification is commonly applied when transitioning from large-scale to small-scale representations. The Douglas–Peucker (DP) algorithm is a classical approach that iteratively removes vertices whose perpendicular distance from the baseline connecting the start and end vertices falls below a predefined threshold, and it has been widely used due to its simplicity and effectiveness [1]. Numerous studies have further investigated and extended the DP algorithm. Yang et al. proposed a recursive implementation and introduced radial distance as a supplementary constraint to control area error while preserving curve smoothness [2]. 
They also pointed out that the lack of topological constraints in the classical DP algorithm may lead to self-intersections and that uncertainty in threshold selection can significantly affect simplification results. Subsequently, Li et al. refined the threshold search range and improved simplification performance by preserving key bends in polyline geometry [3]. Yu et al. enhanced DP-based simplification by selecting convex vertices of natural shorelines as candidate vertices and identifying segmentation vertices using angular and distance criteria, thereby improving both compression ratio and curve accuracy [4]. Beyond DP-based methods, other strategies have been proposed to achieve data reduction. Hybrid approaches combining genetic algorithms with local search strategies have been developed to balance shape preservation and compression efficiency [5]. However, the use of fixed coding lengths in such methods may result in increased storage and computational costs when applied to dense curves. Resampling-based techniques represent another class of methods, where curves are compressed by introducing new vertices located on or near the original geometry. The Li–Openshaw algorithm, for example, simulates human visual perception by retaining fine details at close observation distances while preserving only macroscopic shapes at smaller scales [6]. Function-fitting approaches provide a different perspective by representing line features with compact parametric forms. Fourier series-based methods [7,8,9,10] and wavelet-based fitting techniques [11,12] decompose curve geometry across multiple scales, preserving primary structural information while discarding secondary details.

Although these traditional methods are effective for scale-dependent representation, they are primarily designed for cartographic generalization, where data reduction serves visualization and map readability purposes. In such methods, compression ratios are usually determined indirectly by scale-related thresholds or perceptual criteria, rather than being explicitly optimized for storage efficiency. Moreover, most of these approaches are inherently lossy and irreversible, making it difficult to accurately reconstruct original geometric features from compressed representations. These limitations restrict their applicability in storage-oriented compression scenarios, where compact encoding and faithful geometric restoration are equally required for data archiving, transmission, and reuse in GIS databases. With the rapid development of deep learning, data-driven approaches have attracted increasing attention in spatial data processing and offer new possibilities for learning geometric representations directly from data. Neural network-based methods have been applied to polyline simplification and geometric feature extraction, demonstrating the feasibility of learning-based strategies for curve processing [13]. Convolutional neural networks have been employed to identify curve features after rasterization, generating candidate regions and removing redundant segments [14]. Other studies have explored supervised and unsupervised learning frameworks for vector data compression and reconstruction. Encoder–decoder architectures have been used for rasterized road integration [15], while generative adversarial networks have been applied to reconstruct simplified vector roads from raster images [16]. Autoencoder-based models have further been investigated for multi-level curve simplification and vertex pooling, enabling effective simplification of polylines and building outlines [17,18]. 
Fully connected autoencoders have been applied to curve data compression and reconstruction, confirming the capability of neural networks to capture morphological features for efficient data reduction [19].

While early deep learning methods have laid a solid foundation for spatial data processing, the past few years have witnessed a significant paradigm shift toward more complex architectures, particularly Graph Neural Networks (GNNs) and Transformers [20]. Transformers, driven by their robust self-attention mechanisms, have been increasingly adapted to extract global geometric features directly from vector lines and polygons [21]. Furthermore, hybrid architectures, such as Conv-Transformers, have been proposed to successfully combine local convolutional operations with global context modeling for vector polygon analysis and change detection [22]. Concurrently, GNNs have demonstrated exceptional capability in handling non-Euclidean spatial data, explicitly capturing complex topological relationships and matching patterns across multi-scale vector spatial lines [23].

However, despite the powerful global semantic extraction capabilities of GNNs and Transformers, these architectures often introduce substantial computational overhead and structural complexity. For the specific task of compressing island boundary vector data—which inherently consists of continuous, sequentially ordered coordinate points—the primary objectives are to minimize direct positional deviation (PD) and maintain high-fidelity macroscopic geometric shapes. In this context, a targeted convolutional autoencoder (CAE) remains highly advantageous. By leveraging one-dimensional convolutions, the proposed CAE efficiently exploits the inherent linear sequence and local spatial correlations of the boundary curves. This approach achieves precise sequence-to-sequence reconstruction and optimal compression fidelity without the redundant complexity associated with complete graph construction or full-attention modeling.

This study proposes a 1D CNN-based lossy compression framework specifically tailored to vector curve data. A key contribution of this work is the explicit demonstration that 1D convolutional architectures capture the local geometric structures essential for compression more effectively than fully connected autoencoders (FCAs) and Fourier-based approaches. Whereas Fourier methods prioritize global outlines and FCAs lack spatial locality mechanisms, our architecture leverages 1D convolutional layers to efficiently learn spatially localized geometric features, leading to more compact latent representations and higher reconstruction accuracy. Although the framework is validated using island boundaries as a representative case, it is inherently applicable to other 2D vector polylines, such as contours and hydrological networks, which share similar sequential characteristics. Experimental results confirm that, as measured by geometric metrics such as positional deviation (PD), relative area error (RAE), and relative perimeter error (RPE), the proposed method achieves superior compression performance, providing a robust and efficient solution for storage-oriented vector data management in GISs.

2. Model and Methods

To address storage-oriented compression of vector curve data with controlled geometric distortion, this study reformulates vector curve compression as a representation learning problem under explicit capacity constraints, rather than a scale-driven geometric simplification task. From a theoretical perspective, the core objective of vector curve compression is to construct a compact parametric representation that preserves essential geometric information while minimizing storage cost, subject to a bounded reconstruction error. In this sense, compression can be interpreted as an optimization problem that balances representational compactness and geometric fidelity.

Vector curves are inherently ordered one-dimensional geometric signals, whose information content is characterized by both local geometric variations and global structural properties. Traditional cartographic generalization and rule-based simplification methods reduce data size by applying explicit geometric heuristics or scale-dependent thresholds, which implicitly determine compression outcomes and offer limited control over the trade-off between storage efficiency and reconstruction accuracy. In contrast, a representation learning perspective enables this trade-off to be explicitly optimized by constraining the capacity of a learned latent space and minimizing reconstruction distortion in a data-driven manner. Within this formulation, vector curve compression is naturally cast into an encoding–decoding paradigm, where the encoder learns a compact latent representation that captures the intrinsic geometric structure of the curve, and the decoder reconstructs the curve from this representation. The dimensionality and information capacity of the latent space directly correspond to the achievable compression ratio, thereby providing an explicit and interpretable mechanism for storage-oriented compression. Based on the geometric characteristics of vector curves, a hierarchical representation strategy is adopted. Local geometric continuity and fine-scale variations along the curve are modeled through convolutional operations applied to the unified (segmented and resampled) curve sequences, while global shape information and long-range dependencies are captured through fully connected layers that perform holistic aggregation and compact encoding. By operating directly on vector representations without rasterization, the proposed framework preserves geometric precision and is particularly suited for high-fidelity compression of complex geographic curves.

2.1. Problem Formulation and Curve Representation

Vector curve data in GISs are typically represented as ordered sequences of two-dimensional coordinates that describe the geometric shape of linear or boundary features. Given a vector curve $C = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{n}, y_{n})\}$, where the vertices are ordered along the curve, the objective of vector curve compression is to reduce the amount of data required to store and transmit C while preserving its essential geometric characteristics. In storage-oriented compression scenarios, the goal differs fundamentally from scale-driven cartographic generalization. Rather than selectively removing vertices based on perceptual or scale-dependent thresholds, storage-oriented compression seeks to encode the original curve into a compact representation z such that the reconstructed curve $\bar{C}$ can be faithfully recovered from z under controlled geometric distortion. This process can be formulated as an encoding–decoding problem, where an encoder function $f(\cdot)$ maps the original curve C to a latent representation $z = f(C)$, and a decoder function $g(\cdot)$ reconstructs the curve as $\bar{C} = g(z)$.
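One compact way to write this encoding–decoding objective is sketched below. The distortion measure $d(\cdot, \cdot)$ and the latent dimension $n$ are notational assumptions introduced here for illustration; the source specifies only that reconstruction error is minimized under a capacity-limited latent space.

```latex
\min_{f,\, g} \; \mathbb{E}_{C}\left[\, d\big(C,\; g(f(C))\big) \right]
\quad \text{subject to} \quad z = f(C) \in \mathbb{R}^{n}
```

Under this reading, a smaller $n$ tightens the capacity constraint and raises the achievable compression, at the cost of a larger attainable distortion.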

A key challenge in learning-based vector curve compression lies in the variable length and non-uniform sampling of vector curves. Geographic curves often exhibit irregular vertex spacing and highly variable geometric complexity, which makes it difficult to directly apply neural network models that require fixed-length inputs. To address this issue, the proposed framework adopts a unified curve representation through segmentation and resampling. Specifically, in the context of vector data, this point-level resampling utilizes an equidistant interpolation method to redistribute a fixed number of coordinate vertices evenly along the geometric path of each curve segment. This process transforms curves consisting of an arbitrary number of points into the consistent, fixed-dimensional vector representations required as the neural network inputs, while rigorously preserving their original geometric continuity and spatial characteristics. This enables the learning of compact encodings across curves with diverse shapes, point densities, and lengths.

To enhance the modeling of geometric structure, vector curves are represented using coordinate differences rather than absolute coordinates. Specifically, the curve is expressed as a sequence of incremental displacements between consecutive vertices, which emphasizes local geometric variation and reduces sensitivity to absolute position. This representation facilitates the learning of translation-invariant geometric patterns and supports efficient reconstruction by cumulatively restoring vertex positions from the decoded displacement sequence. Based on this formulation, vector curve compression is treated as a representation learning problem in which the encoder learns a compact latent representation that captures both local geometric details and global curve structure. The reconstruction quality is evaluated by comparing the original curve C and the reconstructed curve $\bar{C}$ using the geometric error measures defined in Section 3.3. This formulation provides a unified foundation for the convolutional autoencoder-based compression framework described in the following sections.

2.2. Data Preprocessing

Vector spatial line data are commonly represented as ordered sequences of Cartesian coordinates, expressed as $\{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots\}$, where each point records its spatial location along the curve. While absolute coordinates describe positional information, the geometric characteristics of a curve, such as direction changes and local shape variations, are more effectively captured by the differences between adjacent vertices. Therefore, the present study represents curve geometry using coordinate increments, i.e., incremental changes in Cartesian coordinates between consecutive vertices. Specifically, for each curve segment, the coordinate differences $(dx_{i}, dy_{i})$ are computed between adjacent vertices. By recording the absolute coordinates of the first point and cumulatively summing the coordinate increments, the original curve geometry can be fully reconstructed. This representation emphasizes local geometric variation while reducing sensitivity to absolute position, making it well suited for learning-based compression. Based on this principle, coordinate increments are employed as feature vectors and input into the convolutional autoencoder model for training. Neural network models typically require inputs of fixed length and consistent structure. However, island boundary data exhibit substantial variability in curve length, shape complexity, and point distribution. To address these challenges and ensure data consistency, a preprocessing pipeline consisting of curve segmentation, curve resampling, and data normalization is applied, as described below.

From a representation learning perspective, this differential encoding transforms vector curves into locally structured geometric signals, enabling the model to focus on intrinsic shape characteristics rather than absolute spatial placement. This formulation establishes a geometric foundation for subsequent hierarchical feature learning, where local variations are first modeled and later integrated into higher-level representations.
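The differential encoding and its inverse can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function names `to_increments` and `from_increments` and the sample curve are hypothetical.

```python
from itertools import accumulate


def to_increments(points):
    """Return the first vertex and the (dx, dy) increments between consecutive vertices."""
    start = points[0]
    deltas = [(x1 - x0, y1 - y0)
              for (x0, y0), (x1, y1) in zip(points, points[1:])]
    return start, deltas


def from_increments(start, deltas):
    """Rebuild the vertex list by cumulatively summing increments from the start vertex."""
    xs = accumulate((dx for dx, _ in deltas), initial=start[0])
    ys = accumulate((dy for _, dy in deltas), initial=start[1])
    return list(zip(xs, ys))


# Hypothetical four-vertex curve used only for illustration.
curve = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0), (6.0, 1.0)]
start, deltas = to_increments(curve)
restored = from_increments(start, deltas)
```

Only the start vertex carries absolute position; translating the whole curve changes `start` but leaves `deltas` untouched, which is the translation invariance the text refers to.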

2.2.1. Curve Segmentation

In the context of geometric representation learning, segmentation serves not only as a means of input unification, but also as a mechanism to control the spatial scope over which local geometric patterns are learned. By decomposing long and heterogeneous curves into segments of comparable length, the model is encouraged to capture consistent local geometric variations within a bounded context, which facilitates stable convolutional feature extraction. Island boundary curves vary significantly in total length, and excessive variation in segment length can introduce large discrepancies in geometric complexity, thereby increasing the difficulty of neural network training. To reduce this effect, each boundary curve with total length $L$ is divided into multiple sub-segments with a target length $l$. In general, if $L > l$, the curve is initially divided into $k = \lfloor L / l \rfloor$ sub-segments of length $l$, leaving a remaining portion of length $e = L - k l$. In this study, vector curves are represented as polylines composed of ordered vertices connected by straight line segments, which is the standard format of most GIS vector line datasets. Consequently, curve length is computed as the cumulative Euclidean distance between consecutive vertices. Since the final sub-segment may not meet the target length, two cases are considered to handle the remainder:

If $e < \frac{2}{3} l$ , the remaining portion is sufficiently short and is merged with the k-th segment. In this case, the adjusted length of the k-th segment becomes $l_{k} = l + e$ .
If $e \geq \frac{2}{3} l$ , the remaining portion is retained as an additional segment, denoted as the $(k + 1)$ -th segment, with length $l_{k + 1} = e$ .

This segmentation strategy ensures that segment lengths remain relatively consistent while avoiding excessively short segments, thereby preserving the geometric characteristics of the original boundary curve.
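The remainder rule above can be expressed as a short Python sketch that returns the list of sub-segment lengths. The function name `segment_lengths` is hypothetical, and the handling of curves no longer than the target length (kept as a single segment) is an assumption, since the text only describes the case where the curve is longer than the target.

```python
import math


def segment_lengths(total_length, target_length):
    """Split a curve of length L into sub-segment lengths using the 2/3 remainder rule."""
    # Assumption: a curve no longer than the target length stays as one segment.
    if total_length <= target_length:
        return [total_length]

    k = math.floor(total_length / target_length)   # number of full-length segments
    e = total_length - k * target_length           # remaining portion
    lengths = [target_length] * k

    if e < (2.0 / 3.0) * target_length:
        # Short remainder: merge it into the k-th segment.
        lengths[-1] += e
    else:
        # Long remainder: keep it as the (k + 1)-th segment.
        lengths.append(e)
    return lengths
```

For example, with a target length of 25, a curve of length 110 yields `[25, 25, 25, 35]` (remainder 10 is merged), whereas a curve of length 120 yields `[25, 25, 25, 25, 20]` (remainder 20 is kept).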

It is acknowledged that fixed-length segmentation may potentially weaken the explicit modeling of long-range dependencies and global outline continuity across entire boundary curves. However, in the proposed framework, this limitation is mitigated at subsequent representation stages, where segment-level features are aggregated through global encoding layers to reconstruct holistic curve structures. As a result, segmentation functions as a controlled decomposition for local geometric learning, rather than a loss of global shape information. Regarding the segmentation strategy, it is important to note that the chosen segment length (and the subsequent equidistant resampling to a fixed number of vertices) is designed to be scale-invariant. Rather than relying on absolute geographic distances—which would be highly dependent on the unknown cartographic scale of the original data—the segmentation size is determined by the necessity to capture sufficient local geometric context. Specifically, the segment length is calibrated to match the receptive field of the CAE, ensuring that each input tensor contains meaningful topographical variations while avoiding excessive structural complexity. This allows the network to learn pure shape features and localized curvatures effectively, independent of the absolute spatial scale.

2.2.2. Curve Resampling

After segmentation, individual curve segments may still differ in the number of vertices they contain. Since neural networks require inputs of uniform size, resampling is applied to unify point distributions across segments. In addition, uneven vertex spacing within a segment can adversely affect model training. Resampling serves to transform irregularly sampled vector curves into uniformly parameterized geometric signals. By enforcing a consistent sampling scheme, the model is guided to learn shape characteristics based on intrinsic curve geometry rather than artifacts introduced by uneven vertex distributions. For a curve segment with length $l_{i}$, the segment is resampled into $N$ equal intervals with a sampling distance of $l_{i} / N$, producing $N + 1$ uniformly distributed vertices. The coordinate differences between the consecutive resampled vertices are then computed, resulting in a sequence of $N$ incremental vectors $(dx_{i}, dy_{i})$, where $i = 1, 2, \dots, N$. This procedure yields a fixed-length representation that preserves the overall geometric structure of each segment while providing a consistent parameterization suitable for convolutional feature extraction. The segmentation and resampling process is illustrated in Figure 1.
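A minimal sketch of equidistant resampling along a polyline is given below. It assumes linear interpolation along the original straight edges and a segment with at least two vertices and non-zero length; the function name `resample_equidistant` is hypothetical and the source does not specify implementation details.

```python
import math


def resample_equidistant(points, n_intervals):
    """Place n_intervals + 1 vertices at equal arc-length spacing along a polyline."""
    # Cumulative arc length at each original vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    step = total / n_intervals

    out = []
    j = 0  # index of the original edge currently being walked
    for i in range(n_intervals + 1):
        target = min(i * step, total)
        # Advance to the edge whose arc-length range contains the target.
        while j < len(points) - 2 and cum[j + 1] < target:
            j += 1
        edge = cum[j + 1] - cum[j]
        t = 0.0 if edge == 0 else (target - cum[j]) / edge
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


# Hypothetical L-shaped segment of length 8, resampled into N = 4 intervals.
segment = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]
resampled = resample_equidistant(segment, 4)
increments = [(x1 - x0, y1 - y0)
              for (x0, y0), (x1, y1) in zip(resampled, resampled[1:])]
```

The result has N + 1 = 5 vertices and N = 4 increments, regardless of how many vertices the original segment contained.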

2.2.3. Data Normalization

In addition to improving numerical stability during training, normalization plays an important role in reducing scale-related bias and enhancing the model’s ability to learn geometry-driven patterns rather than magnitude-dependent variations. To improve the efficiency and stability of neural network training, the coordinate increment features are normalized prior to model input. Specifically, the coordinate differences are scaled to the range [0, 1] using min–max normalization, as expressed in Equation (1):

$$\Delta x_{i} = \frac{dx_{i} - dx_{\min}}{dx_{\max} - dx_{\min}}, \qquad \Delta y_{i} = \frac{dy_{i} - dy_{\min}}{dy_{\max} - dy_{\min}}$$

(1)

where $dx_{\min}$ and $dx_{\max}$ denote the minimum and maximum values of the x-coordinate increments within a segment, and $dy_{\min}$ and $dy_{\max}$ denote the corresponding values for the y-coordinate increments. By ensuring consistent feature scaling across curve segments, normalization facilitates more balanced representation learning and improves the comparability of geometric patterns across different segments.
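Equation (1) and its inverse can be sketched per coordinate axis as follows. The function names are hypothetical, and the treatment of a constant sequence (all increments equal, so the denominator is zero) is an assumption, as the source does not discuss that case.

```python
def minmax_normalize(values):
    """Scale a sequence of increments to [0, 1]; also return the (min, max) needed to invert."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant sequence maps to all zeros to avoid division by zero.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / span for v in values], lo, hi


def minmax_denormalize(scaled, lo, hi):
    """Invert the min-max scaling using the stored minimum and maximum."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical x-increments of one segment.
dx = [-2.0, 0.0, 2.0, 6.0]
dx_scaled, dx_min, dx_max = minmax_normalize(dx)
dx_back = minmax_denormalize(dx_scaled, dx_min, dx_max)
```

Because the minimum and maximum are computed per segment, the two values per axis must be stored alongside each segment so the decoder output can be mapped back to real increments.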

2.3. Convolutional Autoencoder-Based Compression Model

2.3.1. Convolutional Autoencoder

For vector curve data represented as ordered one-dimensional geometric sequences, effective compression requires modeling both local geometric variations and their spatial continuity along the curve. Convolutional neural networks [24] provide a natural mechanism for capturing such localized geometric patterns through shared-weight filtering over bounded neighborhoods. The general computational model of a CNN is expressed in Equation (2). In the context of coordinate-difference sequences, convolution operates as a localized geometric filter that aggregates directional and magnitude changes within a sliding window, enabling the network to learn patterns related to curvature, smoothness, and directional continuity.

$$y = h(X \otimes W + b)$$

(2)

where $X$ denotes the input, $W$ denotes the convolution kernel, $b$ denotes the bias, $h(\cdot)$ denotes the activation function, and $y$ denotes the output. The symbol $\otimes$ denotes the convolution operation.
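To make Equation (2) concrete, the sketch below applies a single-channel, strided 1D convolution without padding, followed by an activation. It uses the sliding dot product (cross-correlation) convention common in deep learning libraries; the function names, the choice of ReLU as the default activation, and the sample values are illustrative assumptions.

```python
def relu(v):
    """Rectified linear unit applied to a scalar."""
    return v if v > 0.0 else 0.0


def conv1d(x, w, b, stride=1, activation=relu):
    """Slide kernel w over sequence x (no padding), add bias b, then apply the activation."""
    k = len(w)
    out = []
    for start in range(0, len(x) - k + 1, stride):
        s = sum(x[start + i] * w[i] for i in range(k)) + b
        out.append(activation(s))
    return out


# Hypothetical increment sequence and a 3-tap difference kernel.
x = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 8.0]
y = conv1d(x, w=[-1.0, 0.0, 1.0], b=0.5, stride=2)
```

Each output value summarizes one local window of the sequence, and a stride larger than 1 shortens the output, which is how successive layers reduce the representation length.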

Through successive convolutional layers with appropriate strides, the encoder progressively aggregates local geometric information into more compact representations, effectively reducing representational dimensionality while preserving salient geometric structures.

Autoencoders provide an unsupervised learning framework for dimensionality reduction and data reconstruction and have been widely adopted in representation learning [25]. An autoencoder consists of two main components: an encoder and a decoder. The encoder maps high-dimensional input data, which in this study are the coordinate-difference sequences, into a compact latent representation z, while the decoder reconstructs the input data from z. By minimizing the reconstruction error between the input and output, the autoencoder learns a low-dimensional representation that preserves the essential geometric characteristics of the original curve. In this formulation, the reconstruction error can be interpreted as a measure of geometric distortion, enabling the compression process to be optimized under an explicit fidelity constraint.

A convolutional autoencoder (CAE) extends this framework by employing convolutional layers in the encoder to exploit local spatial dependencies in the coordinate-difference sequences and transposed convolutional layers in the decoder to recover the original resolution. Compared with purely fully connected autoencoders, CAEs require fewer parameters and are more effective at preserving local geometric structures along the curve. However, convolutional layers alone are insufficient to perform global dimensionality reduction into a compact latent vector. Therefore, fully connected (FC) layers are introduced between the convolutional encoder and decoder to bridge this gap.

Specifically, the encoder applies one-dimensional convolutional layers to extract local geometric features from the input $(dx, dy)$ sequence. The resulting feature maps are then flattened and passed through one or more fully connected layers, which perform global feature aggregation and dimensionality compression, producing the latent vector z. This FC-based bottleneck explicitly controls the dimensionality of the latent space and serves as the core mechanism for curve compression. In the decoder, a symmetric set of fully connected layers first expands z back into a high-dimensional feature representation, which is subsequently reshaped and fed into transposed convolutional layers to reconstruct the coordinate-difference sequence $(d\bar{x}, d\bar{y})$.

The overall architecture of the proposed convolutional autoencoder is illustrated in Figure 2. Coordinate differences $(dx, dy)$ are first computed from the original curve and used as the network input. Convolutional layers extract local features, followed by fully connected layers that compress the representation into a low-dimensional latent vector z. The decoder mirrors this process by using fully connected layers to restore global structure and transposed convolutions to recover local details, yielding reconstructed differences $(d\bar{x}, d\bar{y})$ with the same dimensionality as the input. Finally, the original curve coordinates are recovered by cumulatively summing the reconstructed differences. If the discrepancy between $(d\bar{x}, d\bar{y})$ and $(dx, dy)$ remains within an acceptable threshold, the latent vector z can be regarded as a valid compressed representation of the original curve segment.

Overall, the proposed convolutional autoencoder establishes a hierarchical geometric representation framework, where local geometric variations are first captured through convolutional encoding and subsequently consolidated into a compact global representation, enabling efficient and high-fidelity vector curve compression.

2.3.2. Data Compression with a Convolutional Autoencoder

Convolutional neural networks (CNNs) are particularly effective in capturing local structural patterns while maintaining parameter efficiency, making them well suited for modeling sequential geometric data. Depending on the dimensionality of the input, convolutional operations can be implemented in different forms. In the present study, the training samples consist of ordered sequences of coordinate increments derived from segmented and resampled curves, which naturally correspond to one-dimensional signals. Accordingly, a one-dimensional convolutional autoencoder (CAE) is adopted to perform vector curve compression. A key challenge in CAE-based compression lies in controlling the dimensionality of the latent representation using convolutional layers alone. To explicitly regulate the compression ratio, a fully connected layer is introduced after the final convolutional layer in the encoder to map high-dimensional feature maps into a compact latent vector. To maintain architectural symmetry, a corresponding fully connected layer is placed at the beginning of the decoder. The overall architecture of the proposed model is illustrated in Figure 3. The encoder consists of two one-dimensional convolutional layers followed by a fully connected layer, whereas the decoder mirrors this structure using a fully connected layer and two transposed convolutional layers. The encoder outputs a low-dimensional latent vector that serves as the compressed representation of the curve segment, while the decoder reconstructs the coordinate increment sequence from this latent encoding. Since the input features are normalized to the range [0, 1], the Rectified Linear Unit (ReLU) activation function is employed in all hidden layers. Reconstruction accuracy is measured using the root mean square error (RMSE), which is adopted as the loss function during training. 
The proposed CAE operates on generic curve representations derived from coordinate sequences and does not depend on semantic attributes of specific geographic objects, making it applicable to a wide range of vector line data.

To ensure sufficient overlap between convolutional receptive fields while preserving feature continuity, the stride of both the convolutional and transposed convolutional layers is set to half of the kernel size. Empirical experiments demonstrate that a convolution kernel size of 1 × 7 yields optimal performance (Section 5.1), while a segment length of 25 km provides the best trade-off between geometric consistency and compression efficiency (Section 5.2). The first convolutional layer contains 16 channels and the second 32 channels, with the decoder adopting a symmetric configuration. The input feature vector has a fixed length of 200, which is compressed into a latent vector of dimension n, where n can be adjusted to achieve different compression ratios.
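As a rough check on the layer sizes implied by this configuration, the sketch below applies the standard output-length formula for a one-dimensional convolution. It assumes a single-channel input of length 200, no padding, and a stride of 3 (half of the kernel size 7, rounded down). The padding and the rounding of the stride are not stated in the text, so the function names and the resulting sizes are illustrative assumptions only.

```python
def conv1d_out_len(length: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length of a 1D convolution (standard formula, dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


def encoder_shapes(input_len=200, kernel=7, channels=(16, 32)):
    """Trace (channels, length) through the two encoder convolutions."""
    stride = kernel // 2  # assumption: "half of the kernel size", rounded down
    shapes = []
    length = input_len
    for ch in channels:
        length = conv1d_out_len(length, kernel, stride)
        shapes.append((ch, length))
    return shapes


shapes = encoder_shapes()
# Number of values passed to the fully connected layer under these assumptions.
flattened = shapes[-1][0] * shapes[-1][1]
print(shapes, flattened)
```

Under these assumptions the fully connected layer maps the flattened feature maps to the latent vector of dimension n; a different padding or stride convention would change the intermediate lengths but not the role of that layer.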

After training, the compressed representation of a curve dataset consists of four components: (1) the latent vectors produced by the encoder, (2) the starting coordinates of each curve segment, (3) the four normalization parameters

{d x}_{m i n}, {d x}_{m a x}

,

{d y}_{m i n}

,

{d y}_{m a x}

for inverse transformation, and (4) the trained model parameters (weights and biases). The model parameters are shared across all curve segments and are therefore stored only once. Although they formally belong to the compressed dataset, their contribution to the overall storage cost becomes negligible when the number of segments is sufficiently large. In practical experiments, the model size ranges from several tens to a few hundred kilobytes, which is minor compared with the size of the original vector data. The compression ratio (CR) is defined as the ratio between the size of the original curve data and that of the compressed data, as shown in Equation (3):

C R = \frac{2 P M}{M (N + 6) + W} \approx \frac{2 P}{N}

(3)

where

C R

denotes the compression ratio,

M

is the number of curve segments,

P

is the average number of sampled vertices per segment,

N

is the dimensionality of the latent vector; the constant 6 in the denominator represents the overhead of storing starting coordinates (2 values) and normalization metadata (4 values) for each segment; and

W

represents the total number of model parameters. When M is sufficiently large, the influence of W can be ignored, leading to the simplified expression shown in the second part of Equation (3).
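The storage accounting described above can be sketched as follows, counting the six overhead values once per segment as the text describes. The function names and the parameter count `w=100_000` are hypothetical and chosen only for illustration; the values P = 100 (input length 200) and M = 16,123 are taken from the experimental setup reported in this paper.

```python
def compression_ratio(p: float, n_latent: int, m: int, w: int = 0, overhead: int = 6) -> float:
    """Original values (2 per vertex) divided by stored values."""
    original = 2 * p * m
    # Latent vector plus per-segment overhead, plus the shared model parameters.
    compressed = m * (n_latent + overhead) + w
    return original / compressed


def approx_ratio(p: float, n_latent: int) -> float:
    """Simplified form that ignores the overhead and the model parameters."""
    return 2 * p / n_latent


for n_latent in (100, 80, 60, 40, 20):
    exact = compression_ratio(p=100, n_latent=n_latent, m=16123, w=100_000)
    print(n_latent, round(approx_ratio(100, n_latent), 2), round(exact, 2))
```

The simplified ratio depends only on the latent dimension, whereas the fuller accounting is always somewhat lower because the overhead and the model parameters add to the stored size.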

2.3.3. Restoration of Island Boundary Data

Curve reconstruction begins by reversing the normalization applied during preprocessing, thereby restoring the coordinate increments to their original numerical ranges. The inverse normalization is defined in Equation (4):

\begin{matrix} d x_{i} = △ x_{i} ({△ x}_{m a x} - {△ x}_{m i n}) + {△ x}_{m i n} \\ d y_{i} = △ y_{i} ({△ y}_{m a x} - {△ y}_{m i n}) + {△ y}_{m i n} \end{matrix}

(4)

where

{△ x}_{m i n}

and

{△ x}_{m a x}

denote the minimum and maximum x-coordinate increments within a curve segment, and

{△ y}_{m i n}

and

{△ y}_{m a x}

represent the corresponding values for the y-coordinate increments.
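A minimal sketch of this step, assuming the preprocessing uses standard min-max scaling of the increments to [0, 1], is given below. The function names are hypothetical, and the handling of a constant sequence is an added assumption.

```python
def minmax_normalize(values):
    """Scale values to [0, 1]; also return the min and max needed for the inverse."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant sequence maps to all zeros.
        return [0.0] * len(values), lo, hi
    return [(v - lo) / span for v in values], lo, hi


def minmax_denormalize(normalized, lo, hi):
    """Inverse transform: rescale by the range, then shift by the minimum."""
    return [v * (hi - lo) + lo for v in normalized]


dx = [-12.5, 3.0, 40.0, 7.5]
norm, dx_min, dx_max = minmax_normalize(dx)
restored = minmax_denormalize(norm, dx_min, dx_max)
```

Applying the inverse to the normalized values recovers the original increments up to floating-point rounding, which is why the minimum and maximum of each axis must be stored with every segment.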

Due to reconstruction errors introduced during encoding and decoding, the cumulative endpoint of a reconstructed segment may not coincide exactly with that of the original curve, leading to boundary non-closure. This issue is particularly critical for island boundaries, which must satisfy strict geometric closure constraints. To address this problem, a closure error correction procedure is applied. Let

d x_{i}, d y_{i}

denote the original coordinate increments and

d {\bar{x}}_{i}, d {\bar{y}}_{i}

the reconstructed increments. The cumulative closure errors in the x- and y-directions are computed as:

f_{x} = \sum_{i = 1}^{N} (d {\bar{x}}_{i} - d x_{i})

,

f_{y} = \sum_{i = 1}^{N} (d {\bar{y}}_{i} - d y_{i})

. To enforce closure, these cumulative errors are evenly distributed across all increments, yielding correction terms:

v_{x} = - \frac{f_{x}}{N}, v_{y} = - \frac{f_{y}}{N}

The corrected increments are then obtained:

d {\overset{̿}{x}}_{i} = d {\bar{x}}_{i} + v_{x}, d {\overset{̿}{y}}_{i} = d {\bar{y}}_{i} + v_{y}

. Finally, the corrected coordinate increments are scaled by the corresponding segment length

l_{i}

, and the absolute coordinates of each point are reconstructed sequentially starting from the stored initial coordinate of the segment.
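The closure correction and the sequential coordinate reconstruction can be sketched as below for a single axis. This is a minimal sketch that assumes the expected sum of the increments is known at decoding time (for a closed ring it is zero); how that quantity is stored is not specified in the text, and the function names are hypothetical.

```python
from itertools import accumulate


def correct_closure(recon_increments, target_sum=0.0):
    """Shift every reconstructed increment equally so that they sum to target_sum."""
    n = len(recon_increments)
    # Misclosure: reconstructed total minus expected total.
    misclosure = sum(recon_increments) - target_sum
    correction = -misclosure / n
    return [d + correction for d in recon_increments]


def rebuild_coords(start, increments):
    """Accumulate increments from the stored starting coordinate."""
    return list(accumulate(increments, initial=start))


recon_dx = [10.2, -4.9, -5.6]          # reconstructed increments of a closed ring
fixed_dx = correct_closure(recon_dx)   # now sums to (approximately) zero
xs = rebuild_coords(100.0, fixed_dx)   # last x returns to the starting x
```

Because the same correction is added to every increment, the shape of the curve is altered only slightly while the endpoint is forced back onto its expected position.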

Together, data preprocessing, convolutional autoencoder-based compression, and geometric restoration form a complete compression–decompression framework. Within this framework, the encoder functions as a compression module that transforms long coordinate sequences into compact latent vectors, which can be stored as independent compressed files. The decoder serves as a decompression module that reconstructs vector spatial data from these encodings. Although the encoder and decoder are jointly trained, they can be deployed independently in practical applications, enabling efficient storage of island boundary data and reliable reconstruction when needed.

3. Experimental Setup

3.1. Data Sources and Preprocessing

To evaluate the proposed framework, island boundaries from Petal Maps [26] were selected as the primary dataset. These boundaries are characterized by irregular shapes, high vertex density, and closed structures, making them a representative testbed for vector curve compression.

Dataset statistics: The dataset includes 98 global islands with a cumulative boundary length of 402.78 km (Figure 4).

Preprocessing: Following the procedure in Section 2.2, the raw vector lines were segmented, resampled, and normalized. This yielded 16,123 curve segments, each represented by a coordinate increment feature vector of length 200.

3.2. Implementation and Training Settings

The convolutional autoencoder (CAE) was implemented using the PyTorch (version 2.7.1) [27] framework. All models were trained and evaluated on a local workstation equipped with an Intel Core i7-10700 CPU (2.90 GHz) and an NVIDIA GeForce GTX 1650 GPU. When partitioning the data, the island boundaries are first divided into segments, which are then randomly split into a training set (70%) and a test set (30%). This segment-based sampling distributes fragments of the same island across both sets, so that the model is trained and evaluated on a comprehensive sample of local geometric patterns.
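The segment-level split can be sketched as follows. The function name and the fixed seed are hypothetical conveniences for reproducibility and are not taken from the experiments.

```python
import random


def split_segments(segments, train_fraction=0.7, seed=0):
    """Randomly split a list of segments into training and test subsets."""
    indices = list(range(len(segments)))
    random.Random(seed).shuffle(indices)  # seeded shuffle, independent of global state
    cut = int(round(train_fraction * len(indices)))
    train = [segments[i] for i in indices[:cut]]
    test = [segments[i] for i in indices[cut:]]
    return train, test


train, test = split_segments([f"segment_{i}" for i in range(10)])
```

Since the shuffle operates on segments rather than on whole islands, pieces of one island can fall on both sides of the split, as described above.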

3.3. Evaluation Metrics

Data compression inevitably introduces geometric distortion into spatial data. In this study, reconstruction accuracy is evaluated using the mean positional deviation (PD) between the original and reconstructed curve vertices, which directly quantifies the spatial displacement induced by the compression–decompression process. Larger deviation values indicate greater geometric distortion, whereas smaller values correspond to higher reconstruction fidelity. The mean positional deviation is defined as the average Euclidean distance between corresponding vertices on the original and reconstructed curves, as expressed in Equation (5); it serves as both the primary training objective and the core accuracy metric.

P D = \frac{\sum_{i = 1}^{n} \sqrt{{(x_{i} - {\bar{x}}_{i})}^{2} + {(y_{i} - {\bar{y}}_{i})}^{2}}}{n}

(5)

where

P D

denotes the mean positional deviation error,

n

is the total number of vertices,

x_{i}

,

y_{i}

denote the coordinates of the original vertices, and

{\bar{x}}_{i}

,

{\bar{y}}_{i}

denote the reconstructed coordinates.
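For reference, Equation (5) can be computed as in the short sketch below, which assumes that the original and reconstructed curves have the same number of vertices in corresponding order. The function name is hypothetical.

```python
import math


def mean_positional_deviation(original, reconstructed):
    """Average Euclidean distance between corresponding vertices."""
    if len(original) != len(reconstructed):
        raise ValueError("curves must have the same number of vertices")
    if not original:
        raise ValueError("curves must not be empty")
    total = sum(
        math.hypot(x - xr, y - yr)
        for (x, y), (xr, yr) in zip(original, reconstructed)
    )
    return total / len(original)


pd_value = mean_positional_deviation([(0, 0), (1, 1)], [(3, 4), (1, 1)])
```

In this toy case one vertex is displaced by 5 units and the other is unchanged, so the mean deviation is 2.5 units.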

In addition to the primary PD metric, we introduce a multifaceted evaluation framework to comprehensively assess the quality of reconstructed curves from different geometric perspectives:

Geometric errors (RAE, RPE): To evaluate the preservation of the curves’ holistic geometric properties, we employ relative area error (RAE) and relative perimeter error (RPE), as expressed in Equation (6). It should be noted that since individual curve segments cannot enclose an area, these metrics are calculated only after merging the reconstructed segments back into complete, closed island polygons.

\begin{matrix} R A E = \frac{1}{n} \sum_{i = 1}^{n} \frac{| A_{i} - {\bar{A}}_{i} |}{A_{i}} \times 100 \% \\ R P E = \frac{1}{n} \sum_{i = 1}^{n} \frac{| P_{i} - {\bar{P}}_{i} |}{P_{i}} \times 100 \% \end{matrix}

(6)

where n denotes the total number of closed islands;

A_{i}

and

P_{i}

denote the original area and perimeter of the i-th original closed island, respectively; and

{\bar{A}}_{i}

and

{\bar{P}}_{i}

represent the area and perimeter of the i-th reconstructed closed island.

Intersection over union (IoU): To quantify the spatial overlap and positional alignment between the original and reconstructed curves, the IoU metric is utilized. By calculating the ratio of the intersection area to the union area of the buffered curve regions, IoU provides a normalized score (ranging from 0 to 1) that represents the overall spatial fidelity, as expressed in Equation (7). A higher IoU signifies a more accurate spatial coincidence.

I o U = \frac{1}{n} \sum_{i = 1}^{n} \frac{A_{i} \cap {\bar{A}}_{i}}{A_{i} \cup {\bar{A}}_{i}}

(7)

Curvature change (CC): To assess the preservation of morphological and semantic features, we define the curvature change (CC) metric. It evaluates the fidelity of the curve’s “shape signature” by quantifying deviations in its bending patterns, as expressed in Equations (8) and (9). For a discrete curve consisting of N vertices, let

p_{i} = (x_{i}, y_{i})

be the i-th vertex. The discrete curvature

κ_{i}

at vertex

p_{i}

is approximated using its adjacent vertices

p_{i - 1}

and

p_{i + 1}

:

κ_{i} = \frac{2 | v_{1} \times v_{2} |}{‖ v_{1} ‖ ‖ v_{2} ‖ (‖ v_{1} ‖ + ‖ v_{2} ‖)}

(8)

where

v_{1} = p_{i} - p_{i - 1}

and

v_{2} = p_{i + 1} - p_{i}

are the incoming and outgoing vectors at

p_{i}

, respectively, and

| v_{1} \times v_{2} |

denotes the magnitude of their cross product. The overall curvature change metric is then computed as the mean absolute difference between the curvature sequences of the original and reconstructed curves:

C C = \frac{\sum_{i = 1}^{n} | κ_{i} - {\bar{κ}}_{i} |}{n}

(9)

where

κ_{i}

and

{\bar{κ}}_{i}

represent the discrete curvature at the i-th vertex of the original and reconstructed curves, respectively. Maintaining a low CC is essential for ensuring that the reconstructed curve remains visually and structurally consistent with the original, particularly for geographic features with complex sinuosity.
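The area, perimeter, and curvature metrics above can be sketched with the standard library alone. The sketch reads RAE and RPE as per-island relative errors averaged over islands, represents each island as a list of (x, y) vertices without repeating the first point, and assigns zero curvature to duplicate vertices. These choices and all function names are assumptions made for illustration. IoU is omitted because it requires a polygon overlay library.

```python
import math


def ring_area(ring):
    """Shoelace area of a closed ring."""
    n = len(ring)
    s = sum(ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
            for i in range(n))
    return abs(s) / 2


def ring_perimeter(ring):
    """Sum of edge lengths, including the closing edge."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))


def mean_relative_error(originals, reconstructions, measure):
    """Mean of |m - m_rec| / m over paired rings, in percent."""
    errs = [abs(measure(o) - measure(r)) / measure(o)
            for o, r in zip(originals, reconstructions)]
    return 100 * sum(errs) / len(errs)


def discrete_curvature(p_prev, p, p_next):
    """Curvature estimate of Equation (8) at the middle vertex."""
    v1 = (p[0] - p_prev[0], p[1] - p_prev[1])
    v2 = (p_next[0] - p[0], p_next[1] - p[1])
    cross = abs(v1[0] * v2[1] - v1[1] * v2[0])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # assumption: duplicate vertices contribute no curvature
    return 2 * cross / (n1 * n2 * (n1 + n2))


def curvature_change(original, reconstructed):
    """Mean absolute curvature difference over the interior vertices."""
    if len(original) != len(reconstructed) or len(original) < 3:
        raise ValueError("curves need equal length and at least 3 vertices")
    diffs = [abs(discrete_curvature(original[i - 1], original[i], original[i + 1])
                 - discrete_curvature(reconstructed[i - 1], reconstructed[i],
                                      reconstructed[i + 1]))
             for i in range(1, len(original) - 1)]
    return sum(diffs) / len(diffs)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
stretched = [(0, 0), (1.1, 0), (1.1, 1), (0, 1)]
rae = mean_relative_error([square], [stretched], ring_area)
rpe = mean_relative_error([square], [stretched], ring_perimeter)
```

For the unit square stretched by 10% along one axis, the relative area error is about 10% and the relative perimeter error about 5%, which illustrates that the two metrics respond differently to the same distortion.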

In practice, an appropriate balance must be struck between compression ratio and reconstruction accuracy. In cartographic design and map visualization, feature discriminability is constrained by both human visual acuity and practical mapping conditions. Technical reports in cartography indicate that, at a typical viewing distance of approximately 30 cm, map features smaller than about 0.2 mm are generally difficult to distinguish reliably, and this value has been widely adopted as an empirical lower bound for symbol size and feature spacing [28]. This study adopts 0.2 mm as the minimum discernible unit on the map and converts it into the corresponding ground distance through scale mapping to constrain the spatial accuracy of curve data reconstruction. This threshold represents an engineering-scale limitation derived from visual discriminability rather than a theoretical upper bound of measurement accuracy and is therefore more appropriate for ensuring visual consistency in multi-scale spatial data reconstruction processes.
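The conversion from the 0.2 mm map-space threshold to a ground distance is a simple scaling by the map scale denominator, sketched below with a hypothetical function name.

```python
def ground_tolerance_m(scale_denominator: int, map_tolerance_mm: float = 0.2) -> float:
    """Ground distance in metres that corresponds to map_tolerance_mm on the map."""
    return map_tolerance_mm * scale_denominator / 1000.0


for denominator in (100_000, 250_000, 500_000, 1_000_000):
    print(f"1:{denominator:,} -> {ground_tolerance_m(denominator):.0f} m")
```

At a scale of 1:250,000, for example, 0.2 mm on the map corresponds to 50 m on the ground, so reconstructions whose mean positional deviation stays below that value are visually acceptable under this criterion.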

3.4. Baseline Methods

In this study, a convolutional autoencoder (CAE) is proposed for the compression of island boundary vector data. The encoder extracts geometric features by transforming sequences of coordinate increments into compact latent vectors, thereby achieving effective data reduction, while the decoder reconstructs the original curves from these latent representations when needed.

To comprehensively evaluate the performance of the proposed convolutional autoencoder (CAE), three representative baseline methods were selected for comparison: the Douglas–Peucker (DP) algorithm, a Fourier series-based method, and a fully connected autoencoder (FCA). These methods represent traditional geometric simplification, classical function-fitting, and an alternative deep learning architecture.

To ensure a fair and objective comparison, all baseline methods and the proposed CAE were applied to identical preprocessed datasets under consistent experimental conditions. The reconstruction accuracy and geometric fidelity across different compression ratios were quantitatively assessed using the multifaceted evaluation framework defined in Section 3.3 (i.e., PD, RAE, RPE, IoU, and CC).

3.4.1. Douglas–Peucker Algorithm

As a classical baseline for vector spatial data simplification, the Douglas–Peucker (DP) algorithm is widely recognized for its balance between computational efficiency and geometric fidelity. The DP algorithm operates by recursively identifying the vertex with the maximum perpendicular distance to the line segment connecting the first and last points of a polyline. If this maximum distance exceeds a predefined tolerance threshold

ϵ

, the point is retained, and the polyline is split at this vertex to repeat the process; otherwise, all intermediate vertices are discarded. This mechanism effectively preserves the critical shape features of island boundaries, such as prominent capes and bays, while eliminating redundant collinear points.

To quantitatively evaluate the compression performance of the algorithms, the compression ratio (CR) is defined as the ratio of the data size before compression to that after compression. In the context of 2D polyline vector data, where each vertex occupies a fixed storage size, CR is formulated as

C R = \frac{N_{i n p u t}}{N_{o u t p u t}}

(10)

where

N_{i n p u t}

denotes the total number of vertices in the original vector spatial data, and

N_{o u t p u t}

represents the number of vertices retained in the simplified output. A higher

C R

indicates a greater reduction in data volume. In our comparative experiments, the tolerance

ϵ

of the DP algorithm is dynamically adjusted to align with specific target CRs, ensuring a fair comparison of geometric distortions across different methods under identical storage constraints.
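A compact recursive version of the Douglas–Peucker procedure and the vertex-count ratio of Equation (10) is sketched below. It is a textbook-style illustration with hypothetical function names and is not the implementation used in the experiments. Very long polylines may need an iterative variant to avoid deep recursion.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)  # a and b coincide
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def douglas_peucker(points, epsilon):
    """Keep the farthest vertex if it exceeds epsilon, then recurse on both halves."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        left = douglas_peucker(points[: index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split vertex
    return [first, last]


def vertex_compression_ratio(original, simplified):
    """Equation (10): input vertex count divided by output vertex count."""
    return len(original) / len(simplified)


line = [(0, 0), (1, 0), (2, 0), (3, 0)]
simplified = douglas_peucker(line, epsilon=0.1)
```

Because the number of retained vertices decreases as the tolerance grows, a target CR can be approached by searching over the tolerance, for example by bisection.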

3.4.2. Fourier Series-Based Curve Compression

Island boundary line data are typically represented in the spatial domain as ordered sets of vertices. Fourier series expansion provides a classical transformation from the spatial domain to the frequency domain by representing a periodic function as a weighted sum of sine and cosine components with different frequencies. Under this framework, the spatial coordinates of vector line data can be interpreted as one-dimensional signals along the curve length, and the original curve geometry can be approximated by a truncated Fourier series, as expressed in Equation (11). A prerequisite for applying Fourier series approximation is that the target curve must be periodic. However, in practical experiments, the segmented island boundary lines are open curves and therefore non-periodic. To satisfy the periodicity requirement, a mirroring strategy was adopted. Specifically, each segmented arc was symmetrically mirrored with respect to its end vertices, and the mirrored curve was connected to the original segment, forming a closed curve. This procedure preserves the local geometric characteristics of the original segment while enabling Fourier series representation.

\begin{matrix} X (s) \approx \frac{A_{0}^{X}}{2} + \sum_{n = 1}^{N} (A_{n}^{X} c o s \frac{2 n π s}{L} + B_{n}^{X} s i n \frac{2 n π s}{L}) \\ Y (s) \approx \frac{A_{0}^{Y}}{2} + \sum_{n = 1}^{N} (A_{n}^{Y} c o s \frac{2 n π s}{L} + B_{n}^{Y} s i n \frac{2 n π s}{L}) \end{matrix}}

(11)

where

A_{n}^{X}, B_{n}^{X}, A_{n}^{Y} and B_{n}^{Y}

denote the Fourier coefficients for the x- and y-coordinates, respectively;

L

denotes the period of the arc segment;

s

denotes the curve-length parameter; and

N

denotes the number of retained Fourier expansion terms. As

N \to + \infty

, the Fourier series can theoretically reconstruct the original curve without loss. In practice, a smaller

N

yields stronger compression at the cost of geometric fidelity.

In the Fourier series-based curve compression method, the compression ratio

(C R)

is controlled by the number of retained expansion terms

N

. A smaller

N

corresponds to a higher compression ratio and thus a higher degree of data reduction. Assuming that the original curve contains M sampled vertices, the compression ratio

(C R)

can be defined as shown in Equation (12).

C R = \frac{2 M}{4 N + 2} \approx \frac{M}{2 N}

(12)
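The truncated expansion for one coordinate and the corresponding compression ratio can be sketched as follows. The sketch assumes that the sequence is already periodic and uniformly sampled over one period, so the mirroring step is omitted, and it estimates the coefficients with plain discrete sums. The function names are hypothetical.

```python
import math


def fourier_coefficients(samples, n_terms):
    """Estimate A_0..A_N and B_0..B_N from uniform samples of one period."""
    k = len(samples)
    a, b = [], []
    for n in range(n_terms + 1):
        a.append(2 / k * sum(x * math.cos(2 * math.pi * n * j / k)
                             for j, x in enumerate(samples)))
        b.append(2 / k * sum(x * math.sin(2 * math.pi * n * j / k)
                             for j, x in enumerate(samples)))
    return a, b


def fourier_reconstruct(a, b, k):
    """Evaluate the truncated series at the k original sample positions."""
    out = []
    for j in range(k):
        value = a[0] / 2
        for n in range(1, len(a)):
            angle = 2 * math.pi * n * j / k
            value += a[n] * math.cos(angle) + b[n] * math.sin(angle)
        out.append(value)
    return out


def fourier_compression_ratio(m: int, n_terms: int) -> float:
    """Equation (12): 2M original values over 4N + 2 stored coefficients."""
    return 2 * m / (4 * n_terms + 2)


k = 16
signal = [3 + 2 * math.cos(2 * math.pi * j / k) + math.sin(4 * math.pi * j / k)
          for j in range(k)]
a, b = fourier_coefficients(signal, n_terms=2)
restored = fourier_reconstruct(a, b, k)
```

The test signal contains only the first two harmonics, so two retained terms reproduce it up to rounding; real boundary segments contain higher harmonics, which is where truncation introduces smoothing.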

3.4.3. Fully Connected Autoencoder-Based Compression

Like the convolutional autoencoder, the fully connected autoencoder (FCA) adopts coordinate increment sequences as input and performs curve compression by learning a nonlinear mapping between the input space and a low-dimensional latent space. In this model, the encoder and decoder are constructed using multilayer perceptrons (MLPs), in which all neurons in adjacent layers are fully connected. Owing to the global connectivity of fully connected layers, the FCA is capable of capturing overall structural patterns of the input sequence.

Neural networks are typically organized as multilayer architectures consisting of an input layer, multiple hidden layers, and an output layer. Each neuron receives weighted inputs from the preceding layer, applies a nonlinear activation function, and propagates the transformed signal forward. While the representational capacity of a single neuron is limited, a sufficiently deep and wide network can approximate complex nonlinear functions. In the fully connected autoencoder framework, the encoder progressively maps the coordinate increment sequence into a compact latent vector through a series of fully connected hidden layers, and the decoder reconstructs the original sequence from this latent representation in a symmetric manner.

Compared with convolutional architectures, the fully connected autoencoder does not employ local receptive fields or parameter sharing. As a result, it generally involves a larger number of trainable parameters and higher computational cost for inputs of the same dimensionality. Moreover, because spatial locality is not explicitly modeled, local geometric continuity along the curve may be less effectively preserved than in convolution-based models. Nevertheless, the fully connected autoencoder provides a meaningful baseline for evaluating the advantages of convolutional structures in vector curve compression. For a fair comparison with the convolutional autoencoder, the compression ratio of the fully connected autoencoder is defined in the same manner as in Equation (3), based on the dimensionality of the latent vector relative to the input sequence length.

4. Experimental Results

4.1. Model Accuracy

To illustrate model performance, a compression ratio of 3.33 was selected as an example. Figure 5 shows the variation in model loss and mean point displacement under this compression ratio. The X-axis represents the number of training epochs, where each epoch corresponds to one complete forward and backward propagation over the training dataset. The right Y-axis represents model loss. The results show that the loss decreases rapidly within the first 20 epochs, followed by oscillations around a stable value, indicating overall convergence. Training and testing losses remain close, suggesting that the model does not suffer from overfitting. The left Y-axis represents the mean point displacement, measured in meters, which reflects the positional accuracy of the model in practical applications. Similarly to the loss curve, the mean point displacement decreases sharply during the first 20 epochs and then gradually declines with minor fluctuations between epochs 20 and 200.

4.2. Impact of Compression Ratios

After training, the learned model parameters were saved and used to reconstruct island boundary segments for quantitative accuracy evaluation. To investigate the relationship between compression ratio and reconstruction accuracy, the dimensionality of the latent feature vector was set to 100, 80, 60, 40, and 20, corresponding to compression ratios of 2, 2.5, 3.33, 5, and 10, respectively. Here, the compression ratio is defined as the ratio between the original input vector length and the latent vector length. For each compression setting, the decoder was applied to reconstruct the compressed features, and the mean positional deviation was computed. The reconstruction results obtained using the best-performing model parameters are summarized in Table 1.

As shown in Table 1, when the compression ratio is set to 2, 2.5, or 3.33, the mean positional deviation remains within the range of approximately 40–50 m. In contrast, further increasing the compression level leads to a noticeable degradation in reconstruction accuracy. At a compression ratio of 5, the mean displacement increases to 62.107 m, and at a compression ratio of 10, it rises sharply to 117.831 m. These results indicate that excessive reduction of the latent feature dimension significantly weakens the model’s ability to preserve fine-scale geometric details, resulting in increased positional distortion.

To provide a qualitative assessment of reconstruction performance, Figure 6 presents the reconstructed island boundary curves at a map scale of 1:100,000 under different compression ratios. At a compression ratio of 2, the reconstructed curves exhibit only minor deviations from the original boundaries, which are visually negligible. At a compression ratio of 3.33, local deviations become more apparent, yet the overall curve shape remains visually consistent with the original. In contrast, at a compression ratio of 10, pronounced geometric distortions can be observed, further confirming the decline in reconstruction fidelity at high compression levels. These visual results are consistent with the quantitative accuracy analysis and demonstrate the trade-off between compression efficiency and geometric accuracy.

4.3. Comparative Results and Discussion

To evaluate the performance of different curve compression strategies, a subset of the island boundary dataset described in Section 3.1 was compressed using four methods: the proposed convolutional autoencoder (CAE), the Fourier series-based method (FS), the fully connected autoencoder (FCA), and the classical Douglas–Peucker (DP) algorithm. Reconstruction accuracy under varying compression ratios is quantitatively compared in Figure 7 using the mean positional deviation metric defined in Section 3.3.

The results indicate that at lower compression levels (CR = 2 and 2.5), both the Fourier series method and the DP algorithm achieve slightly lower mean positional deviation than the convolutional autoencoder. For the FS method, this behavior can be attributed to the strong representation capability of low-frequency Fourier components for smooth and slowly varying curves; for the DP algorithm, retaining a relatively large subset of original vertices naturally preserves the geometric outline with high fidelity. However, as the compression ratio increases (CR = 3.33 and above), the reconstruction accuracy of both the FS and DP methods deteriorates rapidly. The DP algorithm, in particular, exhibits a steep increase in positional deviation, reaching nearly 400 m at CR = 10, making it the least effective method under severe compression. This degradation occurs because DP relies on point decimation; at extreme compression ratios, discarding the vast majority of vertices forces the algorithm to bridge large gaps with long straight-line segments, leading to significant geometric distortion in non-feature areas. In contrast, the convolutional autoencoder maintains relatively stable performance. Under these higher-compression scenarios, the CAE consistently outperforms the other three approaches, demonstrating superior robustness and global shape awareness. Across all tested compression ratios, the CAE also achieves lower reconstruction errors than the FCA, confirming the effectiveness of local convolution operations in extracting spatial curve features.

Figure 8 provides a visual comparison of reconstructed island boundary curves at compression ratios of 2, 3.33, and 10, displayed at a map scale of 1:50,000. Each row compares the four methods under the same compression ratio, while each column illustrates the effect of different compression levels for the same geographic region. The original curve is shown in black, while reconstructions generated by the convolutional autoencoder (CAE), Fourier series method (FS), fully connected autoencoder (FCA), and the Douglas–Peucker (DP) algorithm are shown in red, green, blue, and yellow, respectively. To highlight structural differences, three representative local geometric features were selected for visualization: a prominent cape with rapid directional change (a), a smoothly varying coastline (b), and a deep concave bay (c).

At a lower compression ratio of 2, all four methods are able to reconstruct island boundaries with relatively small deviations from the original curves, and visual differences are generally subtle. The DP algorithm, in particular, tightly hugs the original boundary since a sufficient number of critical vertices are retained. However, as the compression ratio increases to 3.33 and further to 10, reconstruction errors become increasingly pronounced, revealing the distinct characteristics of each compression strategy.

Under high compression (CR = 10), the convolutional autoencoder (CAE) consistently produces reconstructed curves that remain closest to the original geometry across all curve types. This advantage is evident not only for flat segments but also for concave and convex structures, which contain more complex local geometric variations. In stark contrast, the DP algorithm exhibits severe geometric degradation. Because DP relies strictly on point decimation and linear interpolation, forcing it into a high compression ratio results in aggressive pruning of essential shape descriptors. This is most prominently illustrated in Figure 8c, where the DP algorithm completely loses the geometric outline of the deep bay by bridging it with a single straight line, and in Figure 8a, where it abruptly truncates the cape.

The fully connected autoencoder (FCA) also exhibits larger deviations, especially in regions with rapid directional changes, suggesting limited effectiveness in preserving local geometric continuity. Notably, at lower compression ratios (e.g., CR = 2), the Fourier series (FS) method remains highly competitive. This is likely because the global, low-frequency shape of many island segments can be effectively captured by the first few Fourier coefficients, whereas the CAE might be learning more complex and potentially redundant features at these lower compression levels. However, as the compression becomes more aggressive (CR = 10), the FS method shows substantial degradation. When reconstructing concave and convex shapes, the higher-frequency components required to represent local curvature are progressively discarded in the Fourier approach, leading to an over-smoothed, inward-shrinking effect seen in panels (a) and (c).

These visual observations consistently validate the quantitative metrics presented earlier, highlighting the superiority of the convolutional autoencoder in preserving complex, high-frequency geographic details through localized convolutional operations and shared parameters, making it highly robust for extreme compression of intricate vector curves.

5. Analysis and Discussion

5.1. Geometric Scale Sensitivity of the Convolutional Autoencoder

In convolutional neural networks (CNNs), overall performance is jointly determined by network architecture, layer configuration, and hyperparameter selection. Among these components, the convolutional kernel plays a central role in feature extraction by defining the size of the local receptive field. Through element-wise multiplication and summation with the input signal, convolutional kernels capture localized patterns and their sliding operation aggregates such local features into higher-level representations. For one-dimensional sequential data, such as the coordinate increment sequences used in this study, kernel size determines the geometric scale at which local variations along the curve are perceived by the model. From a geographic perspective, this process can be interpreted as analogous to the progressive scanning and abstraction of spatial features in cartographic generalization, where local geometric details are perceived within a limited contextual window. If the receptive field is too small, the kernel may fail to capture meaningful shape patterns beyond minor point-to-point fluctuations; if it is too large, fine-scale geometric variations may be overly smoothed or obscured. Therefore, an appropriate kernel size is essential for balancing sensitivity to local geometry and robustness to noise.

To examine the influence of kernel size on compression performance, a series of experiments were conducted using kernel sizes of {4, 7, 9, 11, 13} under a fixed compression ratio of 3.33. Reconstruction accuracy was evaluated using the mean positional deviation metric defined in Section 3.3. The results are summarized in Table 2. Among the tested configurations, a kernel size of 1 × 7 achieved the lowest mean positional deviation, indicating the most effective balance between local feature extraction and geometric continuity preservation under the current experimental setup.

It should be emphasized that this result does not imply a universally optimal kernel size for all types of vector curve data. Rather, the observed performance reflects an empirical alignment between the kernel receptive field and the characteristic geometric scale of the input curves. Specifically, a kernel size of 7 may correspond to capturing the typical length of a salient bend or geometric primitive in a resampled curve segment. For other vector line datasets with different geometric complexities, sampling densities, or segmentation strategies, the optimal kernel size may vary accordingly. Nonetheless, these results demonstrate that moderate kernel sizes, which capture short-range geometric dependencies without excessively expanding the receptive field, are generally well suited for learning compact representations of vector curve data.

5.2. Effect of Segment Length on Compression Stability and Accuracy

Segment length determines the geometric extent of curve information presented to the autoencoder at each training instance and therefore has a direct impact on representation learning and reconstruction accuracy. From a modeling perspective, segment length controls the balance between geometric completeness and structural variability: shorter segments emphasize fine-scale local details, whereas longer segments incorporate broader contextual shape information but also introduce greater morphological complexity. To investigate the influence of segment length on compression performance, four segment lengths (15 km, 25 km, 50 km, and 75 km) were evaluated under a fixed compression ratio of 3.33. Reconstruction accuracy was assessed using the mean positional deviation metric described in Section 3.3. The experimental results are summarized in Table 3. As shown in Table 3, reconstruction accuracy improves as the segment length increases from 15 km to 25 km, indicating that very short segments may not contain sufficient geometric context for the autoencoder to effectively learn characteristic curve patterns. When the segment length is further increased to 50 km and 75 km, however, reconstruction accuracy deteriorates noticeably. This decline suggests that overly long segments introduce excessive geometric variability, making it more difficult for the model to encode diverse shape patterns into a fixed-length latent representation without loss of detail.

These results indicate the existence of an intermediate segment length that provides an effective trade-off between geometric representativeness and model learnability. In the present experimental configuration, a segment length of approximately 25 km yields the lowest mean positional deviation. This distance likely represents the optimal spatial scale that encapsulates a complete and representative geometric feature without introducing excessive structural complexity that would overwhelm the autoencoder’s capacity. It should be noted that this value is not intended to serve as a universal optimal length. Instead, it reflects a dataset- and model-dependent balance between local geometric continuity and global shape complexity. For other types of vector curves or different sampling densities, the optimal segment length may vary accordingly. Nevertheless, the observed trend highlights the importance of aligning segment length with the representational capacity of the autoencoder when designing compression frameworks for vector curve data.
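To illustrate what the segment length parameter controls, the sketch below cuts a polyline into pieces of equal arc length, resamples each piece to a fixed number of vertices, and converts it to coordinate increments. It is a simplified illustration under stated assumptions (planar coordinates in metres, uniform arc-length resampling, a trailing piece shorter than the segment length is dropped); all function names are hypothetical and the authors' preprocessing may differ.

```python
import math
from bisect import bisect_right


def cumulative_lengths(points):
    """Arc length from the first vertex to every vertex of a polyline."""
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    return cum


def point_at(points, cum, d):
    """Linearly interpolate the position at arc length d along the polyline."""
    d = min(max(d, 0.0), cum[-1])
    i = min(bisect_right(cum, d) - 1, len(points) - 2)
    span = cum[i + 1] - cum[i]
    t = 0.0 if span == 0 else (d - cum[i]) / span
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


def split_and_resample(points, segment_length, n_points):
    """Cut a polyline into equal-length pieces, each resampled to n_points vertices.

    Assumption: a trailing piece shorter than segment_length is dropped.
    """
    cum = cumulative_lengths(points)
    n_segments = int(cum[-1] // segment_length)
    step = segment_length / (n_points - 1)
    return [
        [point_at(points, cum, k * segment_length + j * step) for j in range(n_points)]
        for k in range(n_segments)
    ]


def to_increments(segment):
    """Represent a segment as its start point plus successive (dx, dy) increments."""
    deltas = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(segment, segment[1:])]
    return segment[0], deltas


def from_increments(start, deltas):
    """Rebuild vertex coordinates by accumulating increments from the start point."""
    pts = [start]
    for dx, dy in deltas:
        x, y = pts[-1]
        pts.append((x + dx, y + dy))
    return pts
```

With a fixed number of vertices per segment, a longer segment length means coarser vertex spacing and more shape variety per training sample, which is the trade-off examined in Table 3.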

5.3. Compression Ratio Recommendations for Multi-Scale Island Visualization

Reconstruction of vector curves using the decoder inevitably introduces geometric deviations from the original data, and these deviations generally increase as the compression ratio rises. As demonstrated in Table 1, higher compression ratios lead to larger mean positional deviations. In practical cartographic applications, however, the acceptability of such deviations depends not only on their absolute magnitude in ground units but also on the target map scale and the limits of human visual perception.

In cartography and map visualization, it is widely recognized that positional deviations smaller than approximately 0.2 mm on a printed or displayed map are difficult for human observers to reliably perceive under typical viewing conditions. This empirical threshold is commonly adopted as a practical criterion for evaluating geometric accuracy in multi-scale mapping. By converting this map-space tolerance into ground-space distances for different map scales (for example, 0.2 mm at 1:250 K corresponds to 50 m on the ground), acceptable deviation thresholds can be derived and used to guide compression ratio selection. Table 4 summarizes the recommended compression ratios for vector curve data at different map scales based on this visual discernibility criterion. For large-scale maps (1:100 K and larger), even relatively small positional deviations in ground space may exceed visual tolerance, indicating that aggressive compression is not suitable in such contexts. At medium-to-small scales, however, the allowable ground displacement increases substantially, making higher compression ratios feasible without introducing perceptible visual distortion. Specifically, for map scales around 1:250 K, a compression ratio of approximately 3.33 (a compressed representation of about 30% of the original size, the form used in Table 4) satisfies the visual accuracy requirement, while at 1:500 K the compression ratio can be further increased to about 5 (20%). For small-scale representations such as 1:1 M, compression ratios as high as 10 (10%) remain visually acceptable. These results demonstrate that compression ratio selection should be scale-dependent and that substantial storage savings can be achieved at smaller map scales without compromising visual fidelity.
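The scale-dependent selection described above can be sketched in a few lines. The example below converts the 0.2 mm map-space tolerance into a ground distance and then picks the largest compression ratio from Table 1 whose mean point displacement stays within that distance. The function names are hypothetical, and using the mean displacement as the acceptance criterion is an assumption made for illustration.

```python
# Mean point displacement (m) for each compression ratio, copied from Table 1.
MEAN_DISPLACEMENT_M = {2: 45.285, 2.5: 44.59, 3.33: 42.405, 5: 62.107, 10: 117.831}


def ground_tolerance_m(scale_denominator, map_tolerance_mm=0.2):
    """Convert a map-space tolerance in millimetres into a ground distance in metres."""
    return map_tolerance_mm * scale_denominator / 1000.0


def max_acceptable_ratio(scale_denominator, displacement=MEAN_DISPLACEMENT_M):
    """Largest tabulated compression ratio whose mean displacement is within tolerance."""
    limit = ground_tolerance_m(scale_denominator)
    acceptable = [cr for cr, d in displacement.items() if d <= limit]
    return max(acceptable) if acceptable else None


for denominator in (100_000, 250_000, 500_000, 1_000_000):
    print(denominator, ground_tolerance_m(denominator), max_acceptable_ratio(denominator))
```

Under this criterion no tabulated ratio qualifies at 1:100 K, while 1:250 K, 1:500 K, and 1:1 M admit ratios of 3.33, 5, and 10, respectively.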

It should be emphasized that the recommended compression ratios reported here are derived from empirical experiments under the current model configuration and dataset characteristics. While the specific numerical values may vary for different vector curve types or modeling strategies, the underlying principle—that compression decisions should be jointly guided by reconstruction accuracy and target visualization scale—remains generally applicable. This scale-aware compression strategy provides a practical framework for integrating deep learning-based vector data compression into multi-scale geographic information systems.

5.4. Performance Comparison of Different Models

To comprehensively evaluate the effectiveness of the proposed 1D CAE-based compression framework, the model is compared against three baseline methods: a fully connected autoencoder (FCA), a traditional curve-fitting method based on Fourier Series (FS), and the classical Douglas–Peucker (DP) algorithm. To ensure a fair and representative comparison, the experimental results are evaluated at a specific compression ratio of 3.33 and summarized in Table 5.

Positional and spatial accuracy: The proposed CAE model achieves the lowest positional deviation (PD = 42.41) and the highest intersection over union (IoU = 0.9991). In contrast, the classical DP algorithm exhibits the highest positional deviation (PD = 81.98), as discarding a large number of vertices introduces significant point-to-line projection errors in non-feature areas. DP does achieve the lowest relative area error (RAE = 0.0014%) because it strictly retains the critical structural vertices that define the macro-boundary of the closed islands; the CAE nevertheless remains competitive (RAE = 0.0067%) and outperforms both the FCA (RAE = 0.0117%) and the FS method (RAE = 0.1483%). This suggests that 1D convolutions effectively exploit the local spatial correlations of adjacent vertices, thereby preserving the overall spatial extent better than the FCA, which lacks a local spatial inductive bias, and the FS, which tends to over-smooth local details.
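For readers unfamiliar with the DP baseline, the following is a minimal sketch of the textbook tolerance-based Douglas–Peucker recursion. It shows why the method keeps only vertices that deviate strongly from the chord between retained endpoints and discards everything else. How the baseline in this study was tuned to reach a target compression ratio is not reproduced here, so the tolerance parameter is an illustrative assumption.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        # Degenerate chord (a == b), e.g. a closed ring: fall back to point distance.
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def douglas_peucker(points, tolerance):
    """Keep the endpoints and, recursively, the farthest vertex if it exceeds the tolerance."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], a, b)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [a, b]
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex


curve = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
print(douglas_peucker(curve, 0.1))
```

In the small example, the near-collinear vertex at (1, 0.05) is removed while the sharp peak at (3, 2) is retained, which mirrors the behaviour discussed above: structural vertices survive, but the discarded vertices can lie some distance from the simplified line.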

Morphological and perimeter features: Regarding morphological preservation, the traditional DP algorithm yields the lowest curvature change (CC = 4.60 × 10⁻⁵) and relative perimeter error (RPE = 0.7576%). This is because DP simplifies boundaries into straight-line segments between critical nodes, naturally minimizing complex local curvature variations and preserving the discrete polygonal perimeter. Among the remaining spectral and learning-based approaches, the traditional FS method performs well in preserving curve smoothness due to its inherent characteristic of fitting global outlines in the frequency domain, yielding a low curvature change (CC = 2.06 × 10⁻⁴) and relative perimeter error (RPE = 0.9538%). However, the proposed CAE remains highly competitive in these metrics (RPE = 0.9919%, CC = 3.22 × 10⁻⁴) and is significantly superior to the FCA. The FCA performs the worst in preserving geometric shape (RPE = 1.8345%, CC = 3.52 × 10⁻⁴), primarily because flattening the data for fully connected layers disrupts the spatial sequential continuity of the 1D coordinate sequence, leading to unnatural geometric jitters during reconstruction.
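The area and perimeter metrics compared above can be illustrated with a short sketch. It assumes that RAE and RPE are the absolute differences in polygon area and perimeter between the reconstructed and original rings, divided by the original value and expressed in percent; the formal definitions are those given in Section 3.3, and the function names here are hypothetical.

```python
import math


def polygon_area(ring):
    """Unsigned area of a closed ring (list of (x, y) vertices) via the shoelace formula."""
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def polygon_perimeter(ring):
    """Total edge length of a closed ring, including the closing edge."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1])
    )


def relative_error_percent(original, reconstructed):
    """Absolute relative difference between two scalar values, in percent."""
    return abs(reconstructed - original) / original * 100.0


original = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
reconstructed = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.1), (0.0, 10.0)]

rae = relative_error_percent(polygon_area(original), polygon_area(reconstructed))
rpe = relative_error_percent(polygon_perimeter(original), polygon_perimeter(reconstructed))
print(rae, rpe)
```

Because area is dominated by the overall outline while perimeter accumulates every small wiggle, a method can score well on one metric and less well on the other, as the DP and FCA rows of Table 5 show.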

Computational efficiency: Regarding computational cost, DP and FS involve no neural network and therefore incur no training time, although each has its own processing overhead at execution (e.g., frequency-domain transformations for FS). Among the deep learning models, the FCA trains slightly faster (51.94 s) because it relies on simple matrix multiplications. The CAE takes 57.32 s, about 10% longer, and in return reduces the positional deviation from 59.96 to 42.41. Overall, the results indicate that the proposed CAE offers the best overall trade-off among geometric fidelity, spatial accuracy, and computational efficiency for vector curve compression.

6. Conclusions and Future Work

This study proposes a convolutional autoencoder-based framework for the compression of vector curve data and demonstrates its effectiveness through systematic experiments on complex vector line data. By transforming coordinate sequences into coordinate-increment representations, the proposed encoder learns compact latent vectors that preserve essential geometric characteristics, while the decoder reconstructs curves with high positional fidelity. Experimental results confirm that the proposed approach achieves a favorable balance between compression efficiency and reconstruction accuracy. The trained encoder–decoder parameters are shared and distributed across a large number of curve segments and therefore do not scale with dataset size, making the proposed approach suitable for storage-oriented compression scenarios.

Comparative analyses with the classical Douglas–Peucker (DP) algorithm, Fourier series-based (FS) compression, and fully connected autoencoders (FCA) indicate that the CAE provides consistently superior performance across a wide range of compression ratios. While traditional heuristic methods like the DP algorithm achieve high fidelity at low compression levels by directly retaining critical vertices, their strict reliance on point decimation leads to severe geometric distortion and structural degradation under aggressive compression. On the other hand, while frequency-domain methods may achieve a slightly higher accuracy at relatively low compression levels, the proposed CAE exhibits clear advantages under moderate-to-high compression, where nonlinear local geometric features become more difficult to preserve using traditional parametric representations. Furthermore, by explicitly exploiting the sequential nature of boundary coordinates, the proposed architecture avoids the substantial computational overhead and structural complexity associated with Graph Neural Networks (GNNs) or Transformer-based models. The results further demonstrate that convolutional structures are more effective than fully connected architectures in capturing local geometric continuity in vector curve data.

Through a series of sensitivity analyses, this study highlights the importance of scale alignment between model design, data representation, and application requirements. The experiments reveal that both the receptive field of convolutional kernels and the geometric extent of curve segments significantly influence reconstruction performance, and that intermediate scales yield the most effective trade-offs between representational capacity and model stability. In addition, by linking reconstruction accuracy to cartographic visual tolerance, a scale-aware compression strategy is established, providing practical guidance for selecting compression ratios in multi-scale map visualization scenarios. The proposed method is particularly suitable for small- and medium-scale mapping applications, where substantial data reduction can be achieved without perceptible visual degradation.

Despite these promising results, several limitations remain. The experimental evaluation primarily focuses on a single category of vector curves (i.e., closed island boundaries), and the current framework emphasizes geometric reconstruction without explicitly modeling topological consistency or semantic attributes. Future work will extend the proposed approach to a broader range of vector datasets, including transportation networks and hydrographic features, and will investigate topology-aware constraints and multi-task learning strategies to further improve reconstruction robustness and generalization. Furthermore, integrating adaptive segmentation and scale-aware model configurations represents a promising direction for enhancing the flexibility of deep learning-based vector data compression in real-world geographic information systems.

Supplementary Materials

The dataset (preprocessed vector spatial data for islands and coastlines) and source code (PyTorch implementation of the CAE, including training scripts and evaluation metrics) are open-source and available at https://github.com/20zsnormal/Island-Curve-Compression-CAE/tree/main (accessed on 6 April 2026).

Author Contributions

Conceptualization, Pengcheng Liu and Shuo Zhang; methodology, Shuo Zhang; software, Shuo Zhang and Hongran Ma; validation, Shuo Zhang and Hongran Ma; formal analysis, Pengcheng Liu; investigation, Shuo Zhang, Pengcheng Liu and Hongran Ma; resources, Pengcheng Liu and Hongran Ma; data curation, Mingwu Guo; writing—original draft preparation, Shuo Zhang; writing—review and editing, Pengcheng Liu; visualization, Hongran Ma and Mingwu Guo; supervision, Pengcheng Liu; project administration, Pengcheng Liu; funding acquisition, Pengcheng Liu. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 42471486, 42071455) and the Fundamental Research Funds for the Central Universities (grant numbers CCNU25JC043, CCNU25KYZHSY22).

Data Availability Statement

The experiments utilized global vector map data from Huawei Petal Maps for model training. The experimental datasets and source code are available via the link provided in the Supplementary Materials.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Douglas, D.H.; Peucker, T.K. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica 1973, 10, 112–122.
Yang, D.; Wang, J.; Lv, G. Study of realization method and improvement of Douglas–Peucker algorithm of vector data compressing. Bull. Surv. Mapp. 2002, 7, 18–19+22.
Li, C.; Luo, W.; Chen, G.; Yan, W. Discussion on the progressive improved algorithm for cartographic generalization of line features. Sci. Surv. Mapp. 2015, 40, 123–126.
Yu, J.; Chen, G.; Zhang, X. An improved Douglas–Peucker algorithm aimed at simplifying natural shoreline into direction-line. In Proceedings of the 21st International Conference on Geoinformatics; IEEE: Kaifeng, China, 2013; pp. 1–5.
Wu, F.; Deng, H. Using genetic algorithms for solving problems in automated line simplification. Acta Geod. Cartogr. Sin. 2003, 32, 349–355.
Li, Z.; Openshaw, S. Algorithms for automated line generalization based on a natural principle of objective generalization. Int. J. Geogr. Inf. Syst. 1992, 6, 373–389.
Liu, P.; Ai, T.; Bi, X. Multi-scale representation model for contour based on Fourier series. Geom. Inf. Sci. Wuhan Univ. 2013, 38, 221–224.
Liu, P.; Li, X.; Liu, W.; Ai, T. Fourier-based multi-scale representation and progressive transmission of cartographic curves on the Internet. Cartogr. Geogr. Inf. Sci. 2015, 43, 454–468.
Liu, P.; Xiao, T.; Xiao, J.; Ai, T. A head–tail information break method oriented to multi-scale representation of polyline. Acta Geod. Cartogr. Sin. 2020, 49, 921–933.
Liu, P.; Xiao, T.; Xiao, J.; Ai, T. A multi-scale representation model of polyline based on head/tail breaks. Int. J. Geogr. Inf. Sci. 2020, 34, 2275–2295.
Wu, F.; Zhu, G. Multi-scale representation and automatic generalization of relief based on wavelet analysis. Geom. Inf. Sci. Wuhan Univ. 2001, 26, 170–176.
Wu, F. Scaleless representations for polyline spatial data based on wavelet analysis. Geom. Inf. Sci. Wuhan Univ. 2004, 29, 488–491.
Du, J.; Wu, F.; Yin, J.; Liu, C.; Gong, X. Polyline simplification based on the artificial neural network with constraints of generalization knowledge. Cartogr. Geogr. Inf. Sci. 2022, 49, 313–337.
Jiang, B.; Xu, S.; Wu, Y.; Wang, M. Automatic vector polyline simplification based on region proposal network. Acta Geod. Cartogr. Sin. 2023, 52, 2209–2222.
Courtial, A.; El Ayedi, A.; Touya, G.; Zhang, X. Exploring the potential of deep learning segmentation for mountain roads generalisation. ISPRS Int. J. Geo-Inf. 2020, 9, 338.
Du, J.; Wu, F.; Xing, R.; Gong, X.; Yu, L. Segmentation and sampling method for complex polyline generalization based on a generative adversarial network. Geocarto Int. 2021, 37, 4158–4180.
Yu, W.; Chen, Y. Data-driven polyline simplification using a stacked autoencoder-based deep neural network. Trans. GIS 2022, 26, 2302–2325.
Yan, X.; Yang, M. A deep learning approach for polyline and building simplification based on graph autoencoder with flexible constraints. Cartogr. Geogr. Inf. Sci. 2024, 51, 79–96.
Liu, P.; Ma, H.; Zhou, Y.; Shao, Z. Autoencoder neural network method for curve data compression. Acta Geod. Cartogr. Sin. 2024, 53, 1634–1643.
Yan, X.; Yang, M.; Ai, T. Deep learning in automatic map generalization: Achievements and challenges. Geo-Spat. Inf. Sci. 2025, 28, 2905–2926.
Cui, L.; Niu, X.; Qian, H.; Wang, X.; Xu, J. A Transformer-Based Approach for Efficient Geometric Feature Extraction from Vector Shape Data. Appl. Sci. 2025, 15, 2383.
Wang, S.; Zhu, Y.; Zheng, N.; Liu, W.; Zhang, H.; Zhao, X.; Liu, Y. Change Detection Based on Existing Vector Polygons and Up-to-Date Images Using an Attention-Based Multi-Scale ConvTransformer Network. Remote Sens. 2024, 16, 1736.
Huang, Z.; Qian, H.; Wang, X.; Lin, D.; Wang, J.; Xie, L. Graph neural network-based identification of ditch matching patterns across multi-scale geospatial data. Geocarto Int. 2023, 38, 2294900.
Lawrence, S.; Giles, C.L.; Tsoi, A.C.; Back, A.D. Face recognition: A convolutional neural network approach. IEEE Trans. Neural Netw. 1997, 8, 98–113.
Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507.
Huawei Technologies Co., Ltd. Petal Maps. Available online: https://petalmaps.dre.agconnect.link/ (accessed on 10 March 2025).
Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32, 8026–8037.
National Oceanic and Atmospheric Administration. Cartographic Generalization and Symbolization. In NOAA Technical Report NOS 127 CGS 12; NOAA Office of Coast Survey: Washington, DC, USA, 2009.

Figure 1. Segmentation and resampling process of curves.

Figure 2. Schematic diagram of convolution and transposed convolution computations. (The dashed bounding box indicates the convolutional operations already completed, while the solid bounding box highlights the operation currently in progress. Here, z¹ denotes the final latent vector derived from the convolutional encoding phase; z² represents the latent vector reconstructed via transposed convolution, specifically captured prior to the application of inverse normalization.)

Figure 3. Island boundary line data compression model based on convolutional autoencoder.

Figure 4. Island boundary line data. (The locator map in the top left corner illustrates their global distribution. A few islands are not listed.)

Figure 5. Variation of reconstruction loss and mean positional deviation during training. The positional deviation represents the average Euclidean distance between the reconstructed curves and the original curves.

Figure 6. Comparison of reconstructed line displacement and original line segment displacement at different compression ratios. (Convolutional autoencoder (CAE) indicates the curve reconstructed using a convolutional autoencoder.)

Figure 7. Comparison of offset value variations for the four methods under different compression ratios.

Figure 8. Visual comparison of island boundary reconstructions using four methods under different compression ratios. (a) A prominent cape with rapid directional change; (b) a smoothly varying coastline; (c) a deep concave bay. The rows correspond to these three regions, while the columns from left to right represent compression ratios (CR) of 2, 3.33, and 10, respectively.

Table 1. Mean point displacement under different compression ratios (corresponding distances at the 1:1 M map scale).

Input Vectors	Latent Vectors	Compression Ratio	Mean Point Displacement (m)	Mean Displacement at 1:1 M Scale (mm)
100	100	2	45.285	0.0453
100	80	2.5	44.59	0.0446
100	60	3.33	42.405	0.0424
100	40	5	62.107	0.0621
100	20	10	117.831	0.118

Table 2. Effect of convolution kernel size on mean positional deviation.

Input Vectors	Compression Ratio	Kernel Size	Mean Positional Deviation (m)
200	3.33	4	65.8034
200	3.33	7	42.405
200	3.33	9	49.6698
200	3.33	11	52.2515
200	3.33	13	51.2359

Table 3. Effect of segment length on model accuracy.

Input Vectors	Compression Ratio	Segment Length (m)	Mean Positional Deviation (m)
200	3.33	75,000	73.4166
200	3.33	50,000	52.2701
200	3.33	25,000	41.9819
200	3.33	15,000	46.0704

Table 4. Selection of compression ratios for island boundary lines at different map scales.

Map Scale	Allowable Deviation (m)	Within Threshold	Recommended Compression Ratio
1:5 K	1	No	None
1:10 K	2	No	None
1:25 K	5	No	None
1:50 K	10	No	None
1:100 K	20	No	None
1:250 K	50	Yes	30%
1:500 K	100	Yes	20%
1:1 M	200	Yes	10%

Table 5. Performance comparison of different models for vector curve compression.

Model	PD (m)	RAE	RPE	IoU	CC	Train Time (s)
CAE	42.41	0.0067%	0.9919%	0.9991	3.22 × 10⁻⁴	57.32
DP	81.98	0.0014%	0.7576%	0.9985	4.60 × 10⁻⁵	None
FCA	59.96	0.0117%	1.8345%	0.9988	3.52 × 10⁻⁴	51.94
FS	54.96	0.1483%	0.9538%	0.9921	2.06 × 10⁻⁴	None

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
