Article

Structure-Prior-Guided Point Cloud Completion for Industrial Mechanical Components

1 Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China
2 Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(6), 2713; https://doi.org/10.3390/app16062713
Submission received: 22 January 2026 / Revised: 6 March 2026 / Accepted: 10 March 2026 / Published: 12 March 2026

Abstract

Point cloud completion for industrial mechanical components remains challenging due to reflections, self-occlusions, sparse sampling, and strict takt-time constraints in production, which often lead to large missing regions and incomplete fine structures. Meanwhile, industrial parts exhibit strong geometric regularities and prominent sharp features, making purely global feature-driven completion prone to structural drift and blurred boundaries. To address these issues, we propose a structure-prior-guided point cloud completion framework for industrial workpieces. Our method follows an encoder–decoder design with a coarse-to-fine generation strategy to balance global consistency and local geometric details. It enhances feature representation via local graph enhancement and relative-position attention, and further injects a primitive decomposition prior from ParSeNet into progressive decoding to condition point generation and displacement refinement. Experiments on industrial CAD datasets such as CADNET demonstrate that our approach achieves higher geometric fidelity and structural integrity under varying occlusion conditions, and also yields superior performance in downstream surface reconstruction evaluation compared with existing methods.

1. Introduction

Point cloud completion refers to the process of inferring and recovering a complete shape from a partial and incomplete point cloud while maintaining the consistency of both the overall shape and local details [1,2]. In terms of local detail, the structure of the original incomplete input should be preserved, and the generated regions should be accurate and free of noise. In practical acquisition scenarios, due to limitations such as sensor viewpoints, surface reflections, and self-occlusions, the captured point clouds are often incomplete [3,4,5,6]. This significantly impacts the direct application of point clouds and subsequent analysis, making the recovery of a complete point cloud that reflects the real-world situation an important research direction [7,8,9].
In the industrial domain, point clouds are widely utilized for dimensional measurement, reverse engineering, and defect detection within manufacturing quality control. Consequently, point cloud completion serves as a vital preprocessing step to ensure the reliability of these downstream applications. It is important to emphasize that industrial mechanical components fundamentally differ from the generic objects commonly found in standard datasets such as ShapeNet [10,11]. While generic objects often exhibit rich semantic information and category-level shape priors, industrial parts are primarily characterized by precise geometric relations rather than high-level semantics. Their geometric structures are governed by strict engineering constraints, including surface continuity, strict orthogonality and parallelism, curvature consistency, as well as the prominent presence of sharp edges and well-defined boundaries that are ubiquitous in CAD models. In contrast, conventional graphical models and everyday objects typically exhibit smoother curvature distributions with far fewer sharp features.
In recent years, the rapid development of deep learning has led to the emergence of a large number of neural-network-based point cloud completion methods, significantly improving recovery quality under partial observations [12,13]. However, mainstream deep-learning-based point cloud completion methods excel at global feature extraction but often fail to capture fine object details: they recover the general shape of an object without guaranteeing its local features and surface details. For example, existing methods typically model each object category as a whole. Such global strategies are effective for free-form objects with strong semantic priors, but are often less suitable for industrial mechanical parts, whose geometry is dominated by planar faces, cylindrical surfaces, sharp edges, and well-defined boundaries. Therefore, semantic-driven completion strategies that perform well on generic objects may struggle to preserve critical geometric structures in industrial scenarios, thereby compromising the reliability of downstream tasks. To better reflect real-world industrial conditions, this work focuses on the CADNET dataset [11], which is specifically designed for mechanical components. CADNET is constructed from high-precision CAD models with explicit geometric primitives and sharp surface transitions, and thus emphasizes geometric accuracy and structural fidelity rather than semantic diversity, making it more representative of real industrial scenarios.
To address the aforementioned challenges, we propose a structure-prior-guided point cloud completion framework tailored for industrial workpieces. The method follows an encoder–decoder architecture with a coarse-to-fine progressive generation strategy. Specifically, on the encoding side, we propose an enhanced representation learning module, termed SLR, which enhances feature extraction for geometry, yielding a geometry-aware global shape representation. On the decoding side, we introduce a structure-conditioned progressive refinement module, SC-SPD. Unlike existing methods that rely solely on raw geometry and global features, SC-SPD further utilizes a primitive decomposition prior, which is propagated stage-by-stage to the evolving point sets. This enables primitive-aware refinement behavior that better preserves sharp features and regular surfaces while reducing structural ambiguities and errors in surface connections. To validate the effectiveness of our method, we conduct experiments on the CADNET dataset, providing evaluation metrics and visual analysis. The experimental results demonstrate that our method produces point clouds with higher geometric fidelity and stronger structural consistency, and enhanced robustness against input noise, achieving superior reconstructed surface quality compared to representative baseline methods.
Our main contributions are summarized as follows:
  • We propose the SLR module for feature extraction from input point clouds, which effectively captures local structural features and enhances detail preservation, tailored for industrial components with strict engineering constraints.
  • We introduce the SC-SPD module, which integrates the extracted primitive decomposition prior and propagates it through the point cloud refinement process, enabling structure-aware progressive refinement that better preserves sharp features and regular surfaces while reducing structural ambiguities and erroneous connections.
  • Extensive experiments and visualizations demonstrate the superior performance of our method, which effectively recovers the overall shape of industrial point clouds while ensuring local geometric features are preserved.

2. Related Work

2.1. Encoder–Decoder-Based Point Cloud Completion

Early learning-based point cloud completion methods commonly adopt an encoder–decoder pipeline that maps partial observations into a latent representation and decodes it into a complete point set. PCN [14] is a representative end-to-end framework that generates completions in a coarse-to-fine manner, producing a sparse coarse shape followed by a refinement stage to increase point density and local detail. Several works explore structured decoders to better model complex geometry. TopNet [15] proposes a rooted-tree decoder that recursively splits features to generate point sets with flexible topology, alleviating strong surface-manifold assumptions. PF-Net [16] introduces a multi-scale feature fusion strategy to progressively recover missing regions by leveraging fine-grained features at different resolutions. In addition, FoldingNet [17] proposes a folding-based decoder that deforms a canonical 2D grid into a 3D surface. Although originally designed for point cloud auto-encoding/representation learning, its folding operation has inspired later completion decoders to improve local continuity and surface-like regularity. Despite their effectiveness, approaches that rely heavily on a single global latent vector may still struggle to preserve sharp edges and intricate local topology, which is particularly challenging for industrial parts with complex structures.

2.2. Refinement-Based Point Cloud Completion

To alleviate over-smoothed details and improve fidelity, a line of research investigates progressive refinement strategies. GRNet [18] introduces 3D grids as intermediate representations to regularize unordered point clouds and uses volumetric features to enhance global structural consistency before converting results back to point sets. SA-Net [19] proposes a skip-attention mechanism to fuse informative local-region features from the encoder into the decoder at multiple resolutions, reducing information loss during feature compression. SnowflakeNet [20] models completion as snowflake-like point growth via Snowflake Point Deconvolution (SPD), where points are progressively split and refined with a skip-transformer to generate dense outputs. Instead of directly generating unordered point sets, PMP-Net formulates completion as a point-moving process and learns multi-step moving paths from partial to complete shapes, explicitly modeling deformation trajectories [21]. For topology-aware completion, LAKe-Net [22] localizes aligned keypoints and adopts a Keypoints–Skeleton–Shape prediction manner to recover missing topology with skeleton-assisted refinement. Moreover, VRCNet [23] proposes a variational framework with a dual-path architecture for probabilistic modeling, enhances results via relational modules, and also introduces the MVP dataset for multi-view partial-to-complete evaluation.

2.3. Transformer-Based Point Cloud Completion

Transformers [24] have recently shown strong performance in point cloud completion due to their ability to model long-range dependencies and global structure. PoinTr [25] reformulates completion as a set-to-set translation problem by representing point clouds as a sequence of point proxies and employing geometry-aware transformer blocks to capture geometric relations and symmetries. SeedFormer [26] introduces “Patch Seeds” as region-level shape representations and devises an upsampling transformer to propagate and refine patch cues throughout decoding, enabling high-fidelity detail recovery. AnchorFormer [27] further leverages pattern-aware discriminative nodes (anchors) to capture diverse regional patterns beyond a single global feature vector, improving the reconstruction of complex local structures. Beyond pure transformer pipelines, ODGNet [28] proposes an orthogonal dictionary guided completion framework with a seed generation U-Net and a dictionary guidance module to mitigate completion bottlenecks and improve structural consistency.

2.4. Dataset Introduction

The availability of large-scale 3D benchmarks is a prerequisite for the development of data-driven point cloud completion methods. Most existing approaches are initially evaluated on generic object datasets, among which ShapeNet, organized according to the WordNet [29] taxonomy, serves as the primary repository of semantically annotated CAD models and forms the foundation of many derived benchmarks. Built upon ShapeNet, PCN has become a standard dataset, comprising 30,974 partial–complete point cloud pairs across eight object categories. Instead of random point removal, PCN generates partial inputs by back-projecting 2.5D depth maps into 3D space, thereby simulating realistic self-occlusions and requiring models to upsample sparse inputs of 2048 points to dense outputs of 16,384 points. Complementary to PCN, Completion3D focuses on high-fidelity local geometric recovery and provides approximately 29,000 samples with fixed input–output resolutions. To further increase viewpoint diversity, the MVP dataset [23] renders each object from 26 uniformly distributed camera poses on a surrounding sphere, resulting in over 62,000 samples and enabling robustness evaluation under arbitrary rotations. In addition, recent evaluation protocols such as ShapeNet-55 and ShapeNet-34 extend completion benchmarks to unseen categories and diverse manifold topologies, facilitating zero-shot generalization analysis. Beyond synthetic CAD-based datasets, the KITTI dataset [30] offers sparse and noisy LiDAR scans of vehicles captured in real-world autonomous driving scenarios and is commonly used as a qualitative benchmark for sim-to-real evaluation in the absence of dense ground-truth point clouds.
While these generic object datasets have driven substantial progress in point cloud completion, their geometric characteristics differ markedly from those encountered in industrial applications. In contrast to everyday objects, industrial components are typically dominated by regular geometric primitives, such as planar and cylindrical surfaces, and are characterized by sharp edges and strict structural constraints. Such properties are only weakly represented in generic object datasets, where shapes are often smooth and semantically driven. As a result, performance on these benchmarks does not necessarily reflect a model’s effectiveness in industrial workpiece completion scenarios. Given that this work primarily targets point cloud completion for industrial parts, we therefore adopt the CADNET dataset, which is more representative of practical industrial geometry, as the benchmark for subsequent experiments.

3. Method

3.1. Overall Architecture

In practical industrial scenarios, point cloud acquisition for workpieces is constrained not only by sensing artifacts but also by strict throughput requirements. Besides common issues such as specular reflections, self-occlusions, and missing returns on glossy or dark materials, complete scanning is often infeasible within the production cycle time: multi-view capture may be limited by fixture/robot reachability, safety clearance, and line layout, while re-positioning or long dwell-time scanning increases takt time and interrupts on-line inspection. Consequently, the observed point sets are frequently sparse, single-/few-view, and partially missing in occluded cavities or fine structures, which makes industrial point cloud completion a structurally demanding task.
As illustrated in Figure 1, our method follows an encoder–decoder design. The network consists of three components: (i) an improved feature extractor, SLR, which combines Set Abstraction (SA), Local Graph Enhancement (LGE), and Relative-Position Self-Attention (RelPos-SA) to extract a global shape code f_g from the partial input; (ii) a coarse seed generator that produces an initial coarse point set Ŷ^(0) conditioned on f_g; and (iii) a structure-conditioned decoder composed of multiple Snowflake Point Deconvolution (SPD) blocks for progressive upsampling and displacement refinement. To enhance structural consistency for industrial parts, we inject a decomposition-based structural prior from ParSeNet [31] into the decoding stage to guide point generation and refinement.
In the encoder stage, the incomplete input point cloud P is fed into the proposed SLR module for representation learning. Built upon a PointNet++ [32]-style hierarchical set abstraction (SA) backbone with kNN grouping, SLR progressively downsamples the points into multi-scale anchor sets and aggregates local neighborhoods to capture geometric cues from local to global scales. To better characterize industrial workpieces that often contain sharp edges, rims, and other geometric discontinuities, SLR integrates a lightweight Local Graph Enhancement (LGE) component after SA outputs, where edge features are constructed and aggregated to improve boundary sensitivity. Meanwhile, SLR incorporates Relative-Position Self-Attention (RelPos-SA) to capture long-range dependencies such as symmetry and global part relationships by augmenting content-based attention with a learned relative-geometry bias, yielding a more geometry-aware and translation-robust global representation. The SLR module finally outputs a compact global shape code f_g, which conditions subsequent point generation and refinement.
Next, a coarse seed generator [16,17,20] expands f_g and regresses an initial coarse point set Ŷ^(0) that provides a rough shape scaffold. To better preserve observed regions and stabilize downstream refinement, we concatenate Ŷ^(0) with the partial input P and apply farthest point sampling (FPS) [32] to obtain an initialization set Ŷ_p0^(0), which serves as the starting point set for progressive decoding.
In the decoder stage, we perform structure-conditioned decoding and progressive refinement using the proposed structure-conditioned Snowflake Point Deconvolution modules (SC-SPD). Specifically, we leverage a pretrained and frozen ParSeNet to predict a soft primitive-membership distribution S on the partial input P, and propagate it to the current-stage anchor set via kNN inverse-distance interpolation, yielding a stage-wise structural condition S̃. SC-SPD follows the progressive upsampling philosophy of Snowflake Point Deconvolution (SPD), where each stage upsamples the anchor set and predicts displacement offsets for refinement. Different from the original SPD, SC-SPD injects S̃ through FiLM modulation on the feature stream, enabling primitive-aware feature scaling and shifting so that distinct structural regions exhibit different refinement behaviors. After several SC-SPD stages of progressive upsampling and displacement refinement, the network outputs the final high-resolution completed point set Ŷ.
Owing to this coarse-to-fine, multi-scale generation design, the high-level global context f_g provides holistic structural guidance for low-level point synthesis, while coarse-stage geometry effectively propagates local cues to high-resolution predictions. Moreover, the ParSeNet prior supplies consistent region-level structural constraints across stages, improving structural fidelity and preserving sharp industrial features in the completed shapes.

3.2. SLR Module

SLR takes a partial point cloud P ∈ ℝ^{3×N} as input and outputs a compact global shape code f_g ∈ ℝ^{C×1}.
We initialize coordinates and point-wise features as X^(0) = P and F^(0) = P. SLR then adopts a PointNet++-style SA backbone with kNN grouping to progressively downsample the point set into multi-scale anchor sets and aggregate local neighborhoods, producing intermediate representations (X^(t), F^(t)). This step mainly provides multi-scale geometric abstraction and information pooling, and prepares anchor features for subsequent structure-aware refinement.
As illustrated in Figure 2, a local graph enhancement (LGE) module is applied at each intermediate scale to strengthen boundary-sensitive representations by explicitly modeling local geometric discontinuities. For each intermediate anchor set (X^(t), F^(t)), we apply LGE to emphasize local discontinuities such as sharp edges and rims. We build a kNN graph in Euclidean space using anchor coordinates X^(t), obtaining neighbor indices N_k(i). For each anchor i and neighbor j ∈ N_k(i), we construct an edge feature that jointly encodes feature variation and relative geometry:
e_{ij} = [ f_j − f_i, f_i, x_j − x_i ].
A shared MLP ϕ maps e_{ij} to messages, and we aggregate them by max pooling over the neighborhood to obtain the anchor-wise message m_i = max_{j ∈ N_k(i)} ϕ(e_{ij}). The aggregated message is fused and added back to the original feature via a residual update with non-linearity, f_i′ = σ(ψ(m_i)) + f_i, yielding boundary-enhanced features F̃^(t) while preserving the anchor resolution.
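The LGE computation above can be sketched in a few lines of NumPy. This is an illustrative sketch only: `phi` and `psi` are placeholders for the learned shared MLPs ϕ and ψ, and the non-linearity σ is assumed to be ReLU.

```python
import numpy as np

def knn_indices(x, k):
    """Brute-force kNN over anchor coordinates x of shape (n, 3)."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude the point itself
    return np.argsort(d2, axis=1)[:, :k]      # (n, k) neighbor indices

def lge(x, f, k, phi, psi):
    """One LGE pass: edge features -> shared MLP phi -> max pool -> residual update."""
    idx = knn_indices(x, k)
    fj, xj = f[idx], x[idx]                   # neighbor features/coords: (n, k, C), (n, k, 3)
    fi = np.broadcast_to(f[:, None, :], fj.shape)
    # edge feature e_ij = [f_j - f_i, f_i, x_j - x_i]
    e = np.concatenate([fj - fi, fi, xj - x[:, None, :]], axis=-1)
    m = phi(e).max(axis=1)                    # max pooling over the neighborhood -> m_i
    return np.maximum(psi(m), 0.0) + f        # residual update sigma(psi(m_i)) + f_i
```

In the actual module, `phi` and `psi` would be learned point-wise layers operating on batched tensors; the sketch keeps the data flow identical.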
As illustrated in Figure 3, the proposed module applies a lightweight self-attention mechanism augmented with relative-position bias to capture long-range geometric dependencies among points. Given (X^(t), F̃^(t)), we apply a lightweight self-attention block with a learned relative-position bias to model long-range dependencies such as symmetry and global part relations. Specifically, we compute multi-head queries/keys/values from F̃^(t), denoted as Q, K, V, and form content logits Q Kᵀ / √d. In parallel, we compute pairwise relative offsets Δx_{ij} = x_i − x_j and map them to a per-head bias using an MLP, yielding B_{ij} = MLP(Δx_{ij}). The final attention weights are obtained by combining content similarity and relative geometry:
A = softmax( Q Kᵀ / √d + B ).
The attention output AV is linearly projected, added back through a residual connection, and further refined by a point-wise MLP, producing globally consistent and geometry-aware anchor features F_out^(t).
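A minimal single-head NumPy illustration of attention with a relative-position bias. The projection matrices and the `bias_mlp` argument (standing in for the learned per-head bias MLP) are assumptions made for the sketch; the real block is multi-head with residual and MLP refinement.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def relpos_attention(x, f, wq, wk, wv, bias_mlp):
    """Single-head attention: content logits Q K^T / sqrt(d) plus a bias on
    pairwise relative offsets, i.e. A = softmax(Q K^T / sqrt(d) + B)."""
    q, k, v = f @ wq, f @ wk, f @ wv                 # (n, d) each
    logits = q @ k.T / np.sqrt(q.shape[-1])          # content similarity
    dx = x[:, None, :] - x[None, :, :]               # relative offsets (n, n, 3)
    a = softmax(logits + bias_mlp(dx))               # combined attention weights
    return a @ v                                     # attention output A V
```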

3.3. Coarse Seed Generation

Given the global feature f_g, the seed generator expands f_g into per-seed latent codes via transposed convolution and refines them using residual MLP blocks. A coordinate regression head then outputs a coarse point set Ŷ^(0) ∈ ℝ^{N_c×3}. For better coverage of observed regions, we concatenate Ŷ^(0) with the partial input P and apply FPS to obtain an initialization set Ŷ_p0^(0) with N_p0 points, which serves as the input to subsequent refinement stages. To balance structural coverage and computational efficiency, the total number of coarse seeds N_c and the initial sampled points N_p0 are determined empirically based on the target dense resolution (8192 points) and the average missing ratio of the CADNET dataset. Specifically, we set N_c = 512 and N_p0 = 256, which provides a sufficient geometric skeleton for the subsequent dense upsampling stages without incurring excessive computational overhead during the early progressive generation phase.
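The FPS step used to build the initialization set can be sketched with a standard greedy implementation (illustrative only; in practice FPS runs on GPU alongside the network):

```python
import numpy as np

def farthest_point_sampling(points, m):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set.
    points: (n, d) array; returns m selected indices."""
    chosen = [0]                                 # fixed (arbitrary) start index
    d = ((points - points[0]) ** 2).sum(-1)      # squared distance to chosen set
    for _ in range(m - 1):
        nxt = int(d.argmax())                    # farthest remaining point
        chosen.append(nxt)
        d = np.minimum(d, ((points - points[nxt]) ** 2).sum(-1))
    return np.array(chosen)
```

In our pipeline, this sampling would be applied to the concatenation of the coarse seeds and the partial input to select the N_p0 initialization points.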

3.4. SC-SPD Module

Industrial workpieces typically follow strong engineering regularities and can often be interpreted as compositions of regular geometric primitives and their combinations, such as planes, cylinders, cones, and fillets/chamfers. When completion relies solely on raw geometry and a global feature code, severe missing regions and occlusions may lead to structural drift, for example rounded sharp boundaries, misaligned holes, or locally “free-form” surfaces that violate the CAD-like design intent. To explicitly inject such structural constraints into the generation process, we introduce a decomposition-based structural prior and use it to guide progressive decoding in SC-SPD, improving structural consistency and interpretability.
As illustrated in Figure 4, the decoder is guided by an explicit structure prior extracted from a pretrained primitive decomposition network. In our framework, the ParSeNet module is pre-trained on the ABCParts dataset, a subset of the ABC CAD dataset [33] consisting of 32,000 3D mechanical CAD models. For the training protocol, we followed the official three-stage procedure provided by the original authors: (i) pre-training the decomposition module using metric learning, (ii) pre-training the B-spline fitting network, and (iii) joint end-to-end training of all modules. We employ this pretrained and frozen ParSeNet [31] as a prior extractor. Freezing ParSeNet provides stable and consistent structural hints, preventing the primary completion network from diverging during the early stages of training. This strategy also prevents the prior from drifting during end-to-end training and stabilizes optimization, while providing a consistent notion of primitive-aware partitioning across all decoding stages. For each partial point p_i, ParSeNet predicts a soft primitive-membership vector
s_i ∈ ℝ^D, with Σ_{d=1}^{D} s_{i,d} = 1,
where D is the number of primitive categories. Collectively, we denote the prior on the partial input as
S = { s_i }_{i=1}^{N} ∈ ℝ^{N×D}.
Compared with clustering ParSeNet embeddings and converting cluster IDs to one-hot labels, soft probabilities naturally encode uncertainty near primitive boundaries and avoid additional clustering steps, resulting in more stable and efficient inference. Therefore, we adopt S as the structural condition by default.
Since the decoder progressively generates new point sets, the structural prior must be aligned with the evolving geometry. Let the current parent anchor set at a decoding stage be X = { x_m }_{m=1}^{N_x}, whose distribution may differ from that of the original partial input P. We propagate the prior from P to X via kNN interpolation in Euclidean space. For each x_m, we query its k nearest neighbors N_k(x_m) in P and compute inverse-distance weights based on squared distances,
w_{m,i} = ( 1 / ( ‖x_m − p_i‖₂² + ϵ ) ) / Σ_{r ∈ N_k(x_m)} ( 1 / ( ‖x_m − p_r‖₂² + ϵ ) ),
where ϵ is a small constant for numerical stability. The propagated structural prior for x_m is obtained as a weighted sum of its neighbors’ priors,
s̃_m = Σ_{i ∈ N_k(x_m)} w_{m,i} s_i,
yielding the stage-wise prior S̃ ∈ ℝ^{N_x×D}. This stage-wise prior serves as a structural condition in SC-SPD to modulate feature learning and displacement prediction, enabling primitive-aware generation and refinement across different structural regions.
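The kNN inverse-distance propagation described above reduces to a short routine. The sketch below assumes brute-force neighbor search for clarity; a KD-tree or GPU kNN would be used in practice.

```python
import numpy as np

def propagate_prior(anchors, partial, prior, k=3, eps=1e-8):
    """Propagate soft primitive memberships from the partial input to anchors.
    anchors: (Nx, 3); partial: (N, 3); prior S: (N, D) with rows summing to 1."""
    d2 = ((anchors[:, None, :] - partial[None, :, :]) ** 2).sum(-1)  # (Nx, N)
    idx = np.argsort(d2, axis=1)[:, :k]                              # k nearest in P
    nd2 = np.take_along_axis(d2, idx, axis=1)                        # (Nx, k) squared dists
    w = 1.0 / (nd2 + eps)                                            # inverse-distance weights
    w = w / w.sum(axis=1, keepdims=True)                             # normalize per anchor
    return (w[:, :, None] * prior[idx]).sum(axis=1)                  # stage-wise prior (Nx, D)
```

Because the weights are normalized and each s_i sums to one, every propagated row s̃_m remains a valid distribution over the D primitive categories.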
For simplicity, we denote the number of parent anchors at the current stage as N, i.e., N = N_x. Each SC-SPD block inherits the upsampling-and-refinement design of Snowflake Point Deconvolution. It takes a parent anchor set X ∈ ℝ^{3×N}, the global feature f_g ∈ ℝ^{C×1}, an optional displacement feature from the previous stage K_prev ∈ ℝ^{128×N}, and the propagated structural prior S̃ ∈ ℝ^{N×D}. The block outputs an upsampled child anchor set X′ ∈ ℝ^{3×(N·r)} and the current displacement feature K_curr ∈ ℝ^{128×(N·r)}, where r is the upsampling factor. Compared with the original SPD, SC-SPD injects S̃ through FiLM modulation at both the parent feature stage and the child refinement stage.
Given parent coordinates X, the block first extracts anchor-wise features from coordinates using a shared MLP, producing F_1 ∈ ℝ^{128×N}. When global conditioning is enabled, we concatenate F_1 with its channel-wise max-pooled statistics replicated over anchors and with the global code f_g replicated over anchors, and feed the concatenated feature into another MLP to obtain the parent displacement feature Q ∈ ℝ^{128×N}. We then apply FiLM conditioning using the propagated prior: for each anchor i, s̃_i is mapped by a lightweight MLP to FiLM parameters (γ_i, β_i), and Q is modulated as
Q_i ← Q_i ⊙ ( 1 + tanh(γ_i) ) + β_i.
This primitive-aware scaling and shifting differentiates refinement behaviors across structural regions under the same global shape code. Crucially, this mechanism also enables dynamic feature weighting to handle incorrect primitive assignments. When ParSeNet encounters highly ambiguous missing regions or makes an erroneous assignment, its output distribution typically exhibits higher entropy (i.e., uncertainty). During the training process, the network learns to dynamically adjust the influence of these prior features through the FiLM layers. If the prior signal is uncertain or contradicts the strong geometric context, the modulation mechanism naturally down-weights the prior’s influence.
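A toy NumPy sketch of this FiLM step, assuming a simple linear head in place of the lightweight MLP that maps s̃_i to (γ_i, β_i); weights `w`, `b` are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def film_from_prior(s_tilde, w, b):
    """Toy linear head mapping the propagated prior (N, D) to FiLM
    parameters gamma and beta, each returned with shape (C, N)."""
    gamma, beta = np.split(s_tilde @ w + b, 2, axis=1)   # (N, C) each
    return gamma.T, beta.T

def film_modulate(q, gamma, beta):
    """Primitive-aware modulation: Q_i <- Q_i * (1 + tanh(gamma_i)) + beta_i."""
    return q * (1.0 + np.tanh(gamma)) + beta
```

Note that when γ = β = 0 (e.g., an uninformative prior after training drives the head toward zero), the modulation reduces to the identity, which is consistent with the down-weighting behavior described above.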
Next, we use a SkipTransformer to fuse the modulated Q with the previous-stage displacement feature K_prev when available. In the first decoding stage, where K_prev is absent, we directly use Q as the skip feature. The SkipTransformer outputs an intermediate feature H ∈ ℝ^{128×N}.
To upsample anchors, we follow the splitting mechanism in SPD. We map H to a splitting code and expand it by a transposed convolution to produce child features F_child ∈ ℝ^{128×(N·r)}. Meanwhile, we upsample H by nearest-neighbor duplication with scale factor r to obtain Up_r(H) ∈ ℝ^{128×(N·r)}, where Up_r(·) denotes nearest-neighbor upsampling by duplication with factor r. Concatenating F_child and Up_r(H) and passing them through a residual MLP yields the child displacement feature K_curr ∈ ℝ^{128×(N·r)}.
After splitting, we align the parent prior to child anchors by repeating S̃ according to the upsampling factor r, obtaining S̃_child ∈ ℝ^{(N·r)×D}. We then apply a second FiLM modulation on K_curr using S̃_child, enforcing region-consistent refinement for newly generated child anchors.
Finally, a regression head predicts coordinate offsets ΔX from K_curr. For stability, we bound offsets using tanh and apply stage-wise scaling, implemented as ΔX ← tanh(ΔX) / radius^ℓ, where ℓ is the stage index. The child coordinates are obtained by first upsampling parent coordinates via nearest-neighbor duplication Up_r(X) and then adding the bounded offsets,
X′ = Up_r(X) + ΔX.
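The splitting-and-offset update can be illustrated as follows. Shapes follow the (3, N) coordinate convention used above; the learned offset predictor itself is omitted, so `delta_raw` stands in for its raw output.

```python
import numpy as np

def spd_step(parent_xyz, delta_raw, r, radius, stage):
    """Duplicate each parent anchor r times, then add tanh-bounded,
    stage-scaled offsets: X' = Up_r(X) + tanh(dX) / radius**stage.
    parent_xyz: (3, N); delta_raw: (3, N*r)."""
    up = np.repeat(parent_xyz, r, axis=1)            # nearest-neighbor duplication Up_r
    delta = np.tanh(delta_raw) / (radius ** stage)   # bounded offsets, shrinking per stage
    return up + delta
```

The tanh bound caps each child's displacement at 1/radius^ℓ, so later (higher-ℓ) stages make progressively smaller corrections around the coarse scaffold.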
By conditioning both feature formation and displacement refinement with ParSeNet priors, SC-SPD learns primitive-aware refinement strategies suited for CAD-like industrial geometry, reducing structural drift and better preserving sharp features.

4. Experiments

4.1. Dataset and Settings

Dataset Preprocessing
To rigorously evaluate the effectiveness of the proposed point cloud completion framework in real-world industrial settings, we conduct comprehensive experiments on the CADNET dataset. CADNET consists of high-precision CAD models of industrial products spanning multiple categories. These models exhibit highly regular and accurate geometric characteristics, including large planar regions, rotational symmetries, and well-defined sharp boundaries, which are representative of industrial workpieces and pose significant challenges for geometry-sensitive point cloud completion methods.
For data preprocessing, each CAD model is uniformly sampled into a fixed-size point cloud and normalized by centering at the origin and scaling to fit within a unit sphere. This normalization ensures consistent scale and alignment across samples, facilitating stable network training and fair comparison among different methods.
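The normalization described above amounts to centering and max-norm scaling; a minimal sketch:

```python
import numpy as np

def normalize_unit_sphere(pts):
    """Center a point cloud (n, 3) at the origin and scale it so the
    farthest point lies on the unit sphere."""
    centered = pts - pts.mean(axis=0, keepdims=True)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / max(scale, 1e-12)   # guard against degenerate clouds
```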
To simulate incomplete observations commonly encountered in practical industrial scanning processes, we adopt a multi-view occlusion-based missing data generation strategy instead of random point removal. Specifically, for each complete point cloud, we randomly sample 8 viewpoints uniformly distributed on a viewing sphere surrounding the object. Under each viewpoint, points that are invisible according to a depth-based visibility test are removed, effectively mimicking occlusions caused by limited sensor viewpoints and self-occlusion. This process produces partial point clouds with large contiguous missing regions, which are more representative of real industrial scanning artifacts than random point dropping. Multiple partial observations are generated for each CAD model, resulting in a total of 26,536 partial point clouds used for training and evaluation. For the dataset split, we randomly select 80% of the objects from each category to form the training set, reserving the remaining 20% for evaluation. Additionally, the partial point clouds are generated by randomly sampling 2048 points from the object surfaces, while the corresponding complete ground-truth point clouds consist of 8192 points.
Implementation Details
All point-cloud completion experiments are conducted on a workstation running Ubuntu 24.04, equipped with an NVIDIA RTX A5000 GPU (NVIDIA Corporation, Santa Clara, CA, USA) with 24 GB of RAM, an Intel Core i7-12700K CPU (Intel Corporation, Santa Clara, CA, USA) and 64 GB of system memory. The proposed method is implemented in Python 3.8.20 using the PyTorch 2.1.1 (Meta Platforms, Inc., Menlo Park, CA, USA) deep learning framework with CUDA 11.8 support. During training, the AdamW optimizer is employed with an initial learning rate of 5 × 10 4 and a weight decay of 5 × 10 4 . The learning rate is dynamically adjusted using a LambdaLR scheduler, which decays the learning rate by a factor of 0.9 every 21 epochs. The batch size is set to 32, and the network is trained for a total of 300 epochs. The subsequent mesh surface reconstruction using PPSurf is performed on a separate server equipped with 4× NVIDIA A100 (40GB) GPUs (NVIDIA Corporation, Santa Clara, CA, USA).
Evaluation Metrics
We follow prior point cloud completion works [25,34] and adopt the mean Chamfer Distance as our primary evaluation metric, which measures the set-to-set discrepancy between the predicted point cloud and the ground-truth shape. In this paper, we report the $\ell_1$ variant, denoted as $\mathrm{CD}_{\ell_1}$. For each prediction, let the completed point set be $P = \{p_i\}_{i=1}^{|P|}$ and the ground-truth point set be $G = \{g_j\}_{j=1}^{|G|}$. The Chamfer Distance is computed as

$$\mathrm{CD}_{\ell_1}(P, G) = \frac{1}{|P|} \sum_{p \in P} \min_{g \in G} \| p - g \|_1 + \frac{1}{|G|} \sum_{g \in G} \min_{p \in P} \| g - p \|_1 .$$

In addition, following [35], we also report the F-Score as a complementary metric, which evaluates the matching quality between two point sets under a fixed distance threshold $\tau$ and reflects both precision and recall. Specifically, the precision and recall are defined as

$$\mathrm{Precision}(P, G) = \frac{1}{|P|} \sum_{p \in P} \mathbb{I}\!\left[ \min_{g \in G} \| p - g \|_2 < \tau \right],$$

$$\mathrm{Recall}(P, G) = \frac{1}{|G|} \sum_{g \in G} \mathbb{I}\!\left[ \min_{p \in P} \| g - p \|_2 < \tau \right],$$

where $\mathbb{I}(\cdot)$ is the indicator function. The F-Score is then computed as the harmonic mean of precision and recall:

$$\text{F-Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$
In our experiments, since all point clouds are pre-normalized into a unit bounding box during data preprocessing, the distance threshold τ is set to 1% of the normalized bounding-box dimension (i.e., an absolute distance of 0.01). This strict threshold ensures that predicted points must align closely with the ground-truth surface to be counted as accurate.
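The two metrics above can be implemented directly from their definitions. The NumPy sketch below uses brute-force O(|P|·|G|) nearest-neighbour search, which is adequate for the point counts used here (2048/8192); practical pipelines typically swap in a KD-tree or GPU kernel, and this is not the authors' evaluation code.

```python
import numpy as np

def chamfer_l1(P, G):
    """CD_l1 as defined above: symmetric mean of l1 nearest-neighbour distances."""
    d = np.abs(P[:, None, :] - G[None, :, :]).sum(-1)   # (|P|, |G|) l1 distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def f_score(P, G, tau=0.01):
    """F-Score at threshold tau, using l2 nearest-neighbour distances."""
    d = np.linalg.norm(P[:, None, :] - G[None, :, :], axis=-1)
    precision = (d.min(axis=1) < tau).mean()            # predicted points near GT
    recall = (d.min(axis=0) < tau).mean()               # GT points covered by prediction
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With the unit-bounding-box normalization above, `tau=0.01` corresponds exactly to the 1% threshold used for F-Score@1%.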

4.2. Quantitative Evaluation

In this section, we conduct a comprehensive comparison between the proposed method and several representative point cloud completion approaches, including PoinTr, PMPNet, SeedFormer, and ODGNet. These methods span a wide range of completion paradigms, such as encoder–decoder architectures, hierarchical refinement strategies, and Transformer-based global modeling, thereby providing a solid benchmark for evaluating the effectiveness of our approach.
Since existing methods are originally trained and evaluated on different datasets with varying data distributions, we retrain all competing methods on the same CADNET dataset using identical training–validation splits to ensure a fair and unbiased quantitative comparison. For each baseline, we strictly follow the original network architectures and training protocols described in their respective papers, without introducing any additional geometric priors, surface annotations, or auxiliary supervision beyond the input incomplete point clouds. All methods operate solely on partial point clouds as input, ensuring that performance differences arise from model design rather than external cues.
For quantitative evaluation, we adopt two widely used metrics for point cloud completion and reconstruction: the ℓ1 Chamfer Distance (CD_ℓ1) and the F-Score. CD_ℓ1 measures the average point-wise distance and is relatively robust to outliers. The F-Score jointly considers precision and recall under a predefined distance threshold, providing an overall assessment of structural completeness and geometric consistency in the reconstructed point clouds.
The results in Table 1 show that, on the CADNET dataset, our method achieves the best overall performance among all compared approaches. Specifically, it obtains the lowest CD_ℓ1 and the highest F-Score, consistently outperforming the competing methods. This indicates that our framework can more effectively reduce the geometric discrepancy between the completed point clouds and the ground-truth shapes, leading to more complete and structurally consistent results.

4.3. Qualitative Evaluation

To provide a more intuitive understanding of the completion quality of different methods on industrial workpieces, we visualize the completed point clouds and present qualitative comparisons in Figure 5. Compared with relying solely on point-wise metrics such as Chamfer Distance and F-score, visual inspection can more directly reveal how well the completed point clouds recover both the global shape and local geometric details, making the differences among completion results easier to interpret and assess.
Figure 5 presents overall qualitative comparisons of point cloud completion results, providing an intuitive view of how different methods recover complete shapes across multiple categories of industrial workpieces. Compared with existing approaches, our method demonstrates superior completion quality, yielding more faithful geometric structures and cleaner boundaries across different industrial categories. To further analyze the capability of each method in recovering local critical structures, Figure 6 provides a finer-grained, zoomed-in visualization for detailed comparison. Specifically, we select a representative L-shaped block, whose bottom part contains a regular square base, while a corner region is missing at the lower-right side. This missing part corresponds to a typical configuration where sharp right-angle edges intersect with planar surfaces. Such structures follow explicit geometric constraints in industrial components and therefore serve as a sensitive region for diagnosing over-smoothing and structural drift.
As shown in Figure 6, the completion quality around the missing corner differs substantially among methods. Although some baseline approaches can fill the missing region and increase point coverage, they often exhibit excessive smoothing near the right-angle edge, causing the originally sharp corner to become “rounded”. This is manifested by blunted edges and a softened transition between intersecting planes, which compromises the design intent and structural integrity of the workpiece. In contrast, our method more accurately recovers the right-angle planar structure in the missing region: the completed points maintain better planarity and consistency on both adjacent faces, and the transition at the edge remains sharper and more clearly delineated, avoiding undesired local curvature or spurious connections. These observations suggest that the proposed structure guidance and progressive refinement mechanism provides more stable geometric constraints around key sharp-feature regions, better meeting the industrial requirement of faithful structure preservation and geometric consistency.
As a fundamental building block for many downstream applications, point cloud completion plays a critical role in 3D reconstruction and geometric analysis. While commonly used metrics such as Chamfer Distance and other point-wise measures provide convenient quantitative assessments, they are insufficient for fully characterizing the perceptual and structural quality of completed point clouds. In particular, these metrics are often insensitive to surface continuity, sharp feature preservation, and topological correctness, which are essential for CAD-like industrial workpieces and directly affect the usability of the completed geometry in subsequent inspection and reconstruction pipelines.
To obtain a more comprehensive and application-oriented evaluation, we therefore further assess the completed point clouds through downstream 3D surface reconstruction experiments. Specifically, we reconstruct explicit surface representations from the completed point sets and evaluate how well the recovered geometry supports surface reconstruction, providing a more geometry-aware assessment beyond point-wise distances. We adopt PPSurf [36] as a unified surface generator for all methods, and perform reconstruction under the same hyper-parameter settings to ensure a fair comparison and to eliminate confounding factors introduced by different reconstruction configurations.
As illustrated by the surface reconstruction visualizations in Figure 7, the meshes reconstructed from point clouds completed by our method exhibit the closest resemblance to the ground-truth surfaces. They better preserve sharp edges and hole structures while maintaining smoother and more consistent surfaces on large planar or regular regions, with fewer artifacts in occluded or severely missing areas. Overall, these results demonstrate superior surface smoothness, structural integrity, and geometric fidelity compared to competing approaches, validating the effectiveness of our structure-prior-guided completion in industrial scenarios.

4.4. Robustness to Noise

To simulate the measurement inaccuracies and sensor noise commonly encountered in real-world industrial 3D scanning, we systematically evaluate the robustness of our method against Gaussian noise. Specifically, we inject random Gaussian noise into the input partial point clouds. To ensure consistent noise intensity across different workpiece scales, the standard deviation σ of the noise is set to 1%, 2%, and 3% of the bounding-box diagonal length of each point cloud, and the noise is clipped to [−2σ, 2σ] to prevent physically implausible outliers.
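The noise protocol above can be sketched in a few lines of NumPy; the function name and signature are our own, but the scaling (σ as a fraction of the bounding-box diagonal) and the ±2σ clipping follow the description in the text.

```python
import numpy as np

def add_clipped_gaussian_noise(points, level=0.01, rng=None):
    """Perturb a point cloud with Gaussian noise scaled to its bounding box.

    sigma = level * (bounding-box diagonal); the noise is clipped to
    [-2*sigma, 2*sigma] so no point is displaced implausibly far.
    """
    rng = np.random.default_rng() if rng is None else rng
    diag = np.linalg.norm(points.max(axis=0) - points.min(axis=0))
    sigma = level * diag
    noise = np.clip(rng.normal(0.0, sigma, points.shape), -2 * sigma, 2 * sigma)
    return points + noise
```

Calling this with `level` set to 0.01, 0.02, and 0.03 reproduces the three noise settings evaluated in Table 2.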
As shown in Table 2, while the completion performance of all methods naturally degrades as the noise level increases, our proposed framework consistently outperforms the representative baseline models (SeedFormer and ODGNet) across all noise settings. Notably, under the severe noise level (3%), ODGNet and SeedFormer suffer from significant geometric degradation, resulting in CD_ℓ1 errors of 10.72 and 9.84, respectively, whereas our method maintains a comparatively lower error of 8.58. This robustness is further corroborated by the qualitative visualization in Figure 8. As illustrated by the generated results, our method demonstrates strong resilience against 1% and 2% noise levels, successfully recovering the complete shape without suffering from severe structural drift. When subjected to the more extreme 3% noise level, structural changes and local degradations become evident.

4.5. Ablation Study

To further validate the effectiveness of our proposed framework and quantify the contribution of each component to industrial point cloud completion, we conduct ablation studies as follows: (1) we establish a baseline model (denoted as Base) that replaces the proposed SLR encoder with a vanilla 3-layer Set Abstraction (SA) PointNet++ backbone and utilizes the original SPD module as the decoder instead of the SC-SPD; (2) we remove the proposed SLR encoder enhancements and use a vanilla PointNet++-style SA backbone for feature extraction, where both the Local Graph Enhancement (LGE) and the Relative-Position Self-Attention (RelPos-SA) are disabled (denoted as w/o SLR); (3) we remove the ParSeNet decomposition prior and disable structure conditioning in the decoder by turning off FiLM modulation in all SC-SPD blocks, reducing SC-SPD to the original SPD behavior (denoted as w/o SC-SPD).
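To make the "w/o SC-SPD" setting concrete, FiLM-style structure conditioning can be sketched as below. The shapes, the linear maps `w_gamma`/`w_beta`, and the `(1 + gamma)` parameterization are illustrative assumptions rather than the exact SC-SPD implementation; the key point is that with `enabled=False` the block degenerates to the identity, which is precisely how the ablation reduces SC-SPD to the original SPD behavior.

```python
import numpy as np

def film_modulate(features, prior, w_gamma, w_beta, enabled=True):
    """FiLM conditioning sketch: scale and shift features from a structure prior.

    features : (N, C) point-wise features
    prior    : (D,) structure-prior embedding (e.g., from primitive decomposition)
    w_gamma, w_beta : (D, C) linear maps predicting per-channel scale/shift
    enabled=False disables the modulation entirely (the "w/o SC-SPD" setting).
    """
    if not enabled:
        return features
    gamma = prior @ w_gamma                 # (C,) per-channel scale
    beta = prior @ w_beta                   # (C,) per-channel shift
    return features * (1.0 + gamma) + beta  # residual-style FiLM modulation
```

In the full model, such modulation is applied at both the parent and child stages of each SC-SPD block (Figure 4), conditioning feature splitting and offset prediction on the decomposition prior.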
According to the results in Table 3, we can confirm the effectiveness of the proposed SLR and SC-SPD modules. Each proposed component consistently reduces CD_ℓ1 and improves the F-Score. When the SLR module is removed from the encoder, CD_ℓ1 increases by 0.22 and the F-Score drops by 0.025. When the SC-SPD module is disabled so that completion is no longer guided by structural priors, CD_ℓ1 increases by 1.02 and the F-Score decreases by 0.049. The best performance is achieved when both modules are enabled. These ablation results demonstrate that SLR strengthens feature extraction for partial point clouds, while SC-SPD effectively preserves geometric structures during progressive refinement, leading to completed point clouds that better match real industrial workpieces.

4.6. Inference Time and Model Size

As shown in Table 4, under our unified experimental setting (input point clouds with 2048 sampled points and a fixed hardware platform with an NVIDIA RTX A5000 GPU and an Intel Core i7-12700K CPU), we analyze and compare the inference latency and model size of different methods, taking into account both their core architectures and the statistics reported in their original papers. Our model contains 21.2 M parameters, which offers no clear advantage in parameter count over the other approaches. Nevertheless, it achieves an inference time of 18.033 ms, outperforming PMPNet (5.89 M/21.105 ms), SeedFormer (3.20 M/36.736 ms), and ODGNet (11.5 M/48.315 ms) under the same input resolution and hardware configuration.
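Latency numbers like those in Table 4 are only comparable when measured the same way. A generic measurement harness is sketched below (our own helper, not the paper's benchmarking code): discard warm-up iterations, then average wall-clock time over repeated runs. For GPU inference, a device synchronization (e.g., `torch.cuda.synchronize()`) must wrap the timed call so asynchronous kernels are fully accounted for.

```python
import time

def mean_latency_ms(fn, warmup=10, runs=100):
    """Average wall-clock latency of `fn` in milliseconds.

    Warm-up runs absorb one-time costs (allocator, JIT, cache effects)
    before the timed loop; the result is the mean over `runs` calls.
    """
    for _ in range(warmup):
        fn()
    t0 = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - t0) / runs * 1e3
```

For example, `mean_latency_ms(lambda: model(batch))` with a fixed 2048-point batch yields a per-inference figure directly comparable across methods on the same hardware.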
Overall, these results indicate that parameter count alone is not a reliable proxy for runtime efficiency. Despite having a moderately larger network size, our method maintains lower latency by adopting an efficient coarse-to-fine decoding pipeline and lightweight feature modulation, avoiding expensive global operations during inference. This favorable trade-off suggests that our framework is able to balance geometric fidelity and practical deployment requirements, making it well suited for industrial scenarios where throughput and stable cycle time are critical.

5. Conclusions

Industrial point cloud completion differs substantially from generic object completion by prioritizing geometric engineering correctness, such as preserving regular surfaces, sharp edges, and structural boundaries for downstream tasks. Our observations on CADNET indicate that purely global feature-driven methods are prone to structural drift when handling large missing regions. Motivated by this, our proposed framework explicitly integrates structural constraints and region-aware guidance throughout the completion pipeline, effectively stabilizing industrial geometric consistency under severe incompleteness. Furthermore, by recovering high-fidelity geometric structures, our framework can serve as a robust upstream pre-processor for primitive-centric CAD reconstruction methods like Point2CAD [37], enabling a highly resilient partial-scan-to-CAD pipeline.
From a mechanistic perspective, the effectiveness of our method stems from two complementary designs: (i) strengthened geometric representation and (ii) structure-conditioned progressive refinement. For the former, the SLR encoder does not aim to merely summarize global semantics; instead, it amplifies responses around geometric discontinuities via local graph enhancement, enabling more discriminative local cues near boundaries, hole rims, and sharp edges. Meanwhile, relative-position modeling further improves cross-region relational reasoning, allowing the network to leverage long-range geometric correlations (e.g., symmetry correspondences, overall scale consistency, and relative layouts among key structures) under typical industrial acquisition conditions where only partial regions are visible. As a result, completion is less dependent on heuristic extrapolation from local fragments and becomes closer to a globally consistent inference guided by geometric relations. For the latter, SC-SPD injects structural priors into stage-wise upsampling and displacement correction, transforming refinement from a homogeneous regression process into region-specific geometric adjustment: in regular surface regions, the model tends to suppress unnecessary bending and maintain consistency, whereas in boundary transition regions it tends to preserve sharpness and clarity.
Evaluating industrial completion should go beyond point-wise metrics. Although Chamfer Distance and F-score measure overall discrepancy, they lack sensitivity to engineering requirements such as whether the completed point cloud facilitates high-quality mesh reconstruction. Therefore, we adopt a unified PPSurf configuration for downstream reconstruction visualization to qualitatively assess completion quality. Visual comparisons indicate that point clouds with stable distributions and sharp boundaries yield more structurally faithful surfaces, effectively reducing artifacts such as hole collapse, edge blunting, and local bulging.
From a deployment perspective, our experimental statistics further suggest that model size and inference latency do not follow a simple linear relation. The proposed method achieves competitive inference speed while maintaining high completion accuracy, indicating that structure guidance does not necessarily incur substantial computational overhead.
Despite the effectiveness demonstrated on CADNET, several limitations remain and motivate future research:
(1)
While freezing the prior extractor stabilizes training, its reliability may degrade under real-world conditions like severe noise, specular reflections, or domain shift. Future work will focus on developing a more universal, category-agnostic structural decomposition module. By exploring lightweight domain adaptation, uncertainty estimation, and self-supervised learning, we aim to improve the prior’s cross-domain transferability and robustness for broader industrial applications.
(2)
While we have validated the model’s robustness against synthetic Gaussian noise, CADNET mainly simulates occlusion-induced missingness, whereas real acquisitions additionally involve more complex factors such as anisotropic noise, outliers, missing returns, and registration errors. Future efforts will incorporate sensor-faithful noise simulation, targeted data augmentation, and evaluations on real scanned datasets and acquisition platforms to systematically assess robustness in practical environments.
(3)
Currently, the evaluation of whether the completed point clouds facilitate high-quality mesh reconstruction is limited to visual inspection. Subsequently, we plan to systematically evaluate our completion framework across diverse mesh reconstruction algorithms. This phase will focus on a comprehensive analysis utilizing quantitative mesh metrics—such as Hausdorff distance, normal consistency, and flatness residuals—to establish a holistic, application-level benchmark for industrial reverse engineering.

Author Contributions

Conceptualization, C.Y. and S.Y.; Methodology, C.Y.; Data curation, K.H.; Writing—original draft preparation, C.Y.; Writing—review and editing, J.Z., K.L. and S.Y.; Visualization, C.Y.; Supervision, S.Y.; Project administration, J.Z. and S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Ningbo’s “Science and Technology Yongjiang 2035” Innovation Ecosystem Development Project (Project Number: 2024Z056).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhuang, Z.; Zhi, Z.; Han, T.; Chen, Y.; Chen, J.; Wang, C.; Cheng, M.; Zhang, X.; Qin, N.; Ma, L. A Survey of Point Cloud Completion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 5691–5711. [Google Scholar] [CrossRef]
  2. Tesema, K.W.; Hill, L.; Jones, M.W.; Ahmad, M.I.; Tam, G.K.L. Point Cloud Completion: A Survey. IEEE Trans. Vis. Comput. Graph. 2023, 30, 6880–6899. [Google Scholar] [CrossRef] [PubMed]
  3. Han, Z.; Shang, M.; Liu, Z.; Vong, C.-M.; Liu, Y.-S.; Zwicker, M.; Han, J.; Chen, C.L.P. SeqViews2SeqLabels: Learning 3D Global Features via Aggregating Sequential Views by RNN with Attention. IEEE Trans. Image Process. 2018, 28, 658–672. [Google Scholar] [CrossRef] [PubMed]
  4. Liu, X.; Han, Z.; Hong, F.; Liu, Y.-S.; Zwicker, M. LRC-Net: Learning Discriminative Features on Point Clouds by Encoding Local Region Contexts. Comput. Aided Geom. Des. 2020, 79, 101859. [Google Scholar] [CrossRef]
  5. Wen, X.; Han, Z.; Youk, G.; Liu, Y.-S. CF-SIS: Semantic-Instance Segmentation of 3D Point Clouds by Context Fusion with Self-Attention. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 1661–1669. [Google Scholar]
  6. Yuan, J.; Chen, C.; Yang, W.; Liu, M.; Xia, J.; Liu, S. A Survey of Visual Analytics Techniques for Machine Learning. Comput. Vis. Media 2021, 7, 3–36. [Google Scholar] [CrossRef]
  7. Uddin, K.; Jeong, T.H.; Oh, B.T. Incomplete Region Estimation and Restoration of 3D Point Cloud Human Face Datasets. Sensors 2022, 22, 723. [Google Scholar] [CrossRef] [PubMed]
  8. Chen, H.; Wei, Z.; Xu, Y.; Wei, M.; Wang, J. IMLoveNet: Misaligned Image-Supported Registration Network for Low-Overlap Point Cloud Pairs. In Proceedings of the ACM SIGGRAPH 2022 Conference, Online, 8–12 August 2022; pp. 1–9. [Google Scholar]
  9. Wu, R.; Chen, X.; Zhuang, Y.; Chen, B. Multimodal Shape Completion via Conditional Generative Adversarial Networks. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 281–296. [Google Scholar]
  10. Chang, A.X.; Funkhouser, T.; Guibas, L.; Hanrahan, P.; Huang, Q.; Li, Z.; Savarese, S.; Savva, M.; Song, S.; Su, H.; et al. ShapeNet: An Information-Rich 3D Model Repository. arXiv 2015, arXiv:1512.03012. [Google Scholar]
  11. Manda, B.; Bhaskare, P.; Muthuganapathy, R. A Convolutional Neural Network Approach to the Classification of Engineering Models. IEEE Access 2021, 9, 22711–22723. [Google Scholar] [CrossRef]
  12. Li, S.; Gao, P.; Tan, X.; Wei, M. ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing-Part-Sensitive Transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 9466–9475. [Google Scholar]
  13. Wang, J.; Cui, Y.; Guo, D.; Li, J.; Liu, Q.; Shen, C. PointAttn: You Only Need Attention for Point Cloud Completion. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 5472–5480. [Google Scholar]
  14. Yuan, W.; Khot, T.; Held, D.; Mertz, C.; Hebert, M. PCN: Point Completion Network. In Proceedings of the 2018 International Conference on 3D Vision (3DV); IEEE: New York, NY, USA, 2018; pp. 728–737. [Google Scholar]
  15. Tchapmi, L.P.; Kosaraju, V.; Rezatofighi, H.; Reid, I.; Savarese, S. TopNet: Structural Point Cloud Decoder. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 383–392. [Google Scholar]
  16. Huang, Z.; Yu, Y.; Xu, J.; Ni, F.; Le, X. PF-Net: Point Fractal Network for 3D Point Cloud Completion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 7662–7670. [Google Scholar]
  17. Yang, Y.; Feng, C.; Shen, Y.; Tian, D. FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 206–215. [Google Scholar]
  18. Xie, H.; Yao, H.; Zhou, S.; Mao, J.; Zhang, S.; Sun, W. GRNet: Gridding Residual Network for Dense Point Cloud Completion. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 365–381. [Google Scholar]
  19. Zhang, Q.L.; Yang, Y.B. SA-Net: Shuffle Attention for Deep Convolutional Neural Networks. In Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 2235–2239. [Google Scholar]
  20. Xiang, P.; Wen, X.; Liu, Y.S.; Cao, Y.-P.; Wan, P.; Zheng, W.; Han, Z. SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 5499–5509. [Google Scholar]
  21. Wen, X.; Xiang, P.; Han, Z.; Cao, Y.-P.; Wan, P.; Zheng, W.; Liu, Y.-S. PMP-Net: Point Cloud Completion by Learning Multi-Step Point Moving Paths. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Event, 19–25 June 2021; pp. 7443–7452. [Google Scholar]
  22. Tang, J.; Gong, Z.; Yi, R.; Xie, Y.; Ma, L. LAKE-Net: Topology-Aware Point Cloud Completion by Localizing Aligned Keypoints. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 1726–1735. [Google Scholar]
  23. Pan, L.; Chen, X.; Cai, Z.; Zhang, J.; Zhao, H.; Yi, S.; Liu, Z. Variational Relational Point Completion Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Event, 19–25 June 2021; pp. 8524–8533. [Google Scholar]
  24. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
  25. Yu, X.; Rao, Y.; Wang, Z.; Liu, Z.; Lu, J.; Zhou, J. PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 12498–12507. [Google Scholar]
  26. Zhou, H.; Cao, Y.; Chu, W.; Zhu, J.; Lu, T.; Tai, Y.; Wang, C. SeedFormer: Patch Seeds Based Point Cloud Completion with Upsample Transformer. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 416–432. [Google Scholar]
  27. Chen, Z.; Long, F.; Qiu, Z.; Yao, T.; Zhou, W.; Luo, J.; Mei, T. AnchorFormer: Point Cloud Completion from Discriminative Nodes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 13581–13590. [Google Scholar]
  28. Cai, P.; Scott, D.; Li, X.; Wang, S. Orthogonal Dictionary Guided Shape Completion Network for Point Cloud. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 864–872. [Google Scholar]
  29. Fellbaum, C. WordNet. In Theory and Applications of Ontology: Computer Applications; Springer: Dordrecht, The Netherlands, 2010; pp. 231–243. [Google Scholar]
  30. Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision Meets Robotics: The KITTI Dataset. Int. J. Robot. Res. 2013, 32, 1231–1237. [Google Scholar] [CrossRef]
  31. Sharma, G.; Liu, D.; Maji, S.; Kalogerakis, E.; Chaudhuri, S.; Měch, R. ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 261–276. [Google Scholar]
  32. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. Adv. Neural Inf. Process. Syst. 2017, 30, 5099–5108. [Google Scholar]
  33. Koch, S.; Matveev, A.; Jiang, Z.; Williams, F.; Artemov, A.; Burnaev, E.; Alexa, M.; Zorin, D.; Panozzo, D. ABC: A Big CAD Model Dataset for Geometric Deep Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 9601–9611. [Google Scholar]
  34. Yu, X.; Rao, Y.; Wang, Z.; Lu, J.; Zhou, J. AdaPoinTr: Diverse Point Cloud Completion with Adaptive Geometry-Aware Transformers. arXiv 2023, arXiv:2301.04545. [Google Scholar] [CrossRef] [PubMed]
  35. Tatarchenko, M.; Richter, S.R.; Ranftl, R.; Li, Z.; Koltun, V.; Brox, T. What Do Single-View 3D Reconstruction Networks Learn? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 3405–3414. [Google Scholar]
  36. Erler, P.; Fuentes-Perez, L.; Hermosilla, P.; Guerrero, P.; Pajarola, R.; Wimmer, M. PPSurf: Combining Patches and Point Convolutions for Detailed Surface Reconstruction. Comput. Graph. Forum 2024, 43, e15000. [Google Scholar] [CrossRef]
  37. Liu, Y.; Obukhov, A.; Wegner, J.D.; Schindler, K. Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 3763–3772. [Google Scholar]
Figure 1. Overall pipeline of the proposed framework, which encodes a partial point cloud into global features, generates coarse seeds, and progressively refines them via SLR-based encoding and SC-SPD decoding with structure priors for structurally consistent point cloud completion.
Figure 2. Illustration of the Local Graph Enhancement (LGE) module. Intermediate anchor coordinates and point-wise features are grouped via k-NN to construct local edge features. These local features are aggregated via max pooling, refined by convolutions, and residually fused with the original input features to enhance sensitivity to geometric discontinuities and sharp boundaries.
Figure 3. Illustration of the RelPos-SA module. By encoding relative positional offsets into a learned bias B and adding it to the content-based attention scores, the module fuses semantic and geometric information. The output is residually connected to the input features, enabling the network to capture global structures and long-range dependencies efficiently.
Figure 4. Illustration of the SC-SPD module. ParSeNet-derived structural priors are injected into the upsampling pipeline via FiLM modulation at both the parent and child stages. This explicitly conditions the feature splitting and coordinate offset prediction, ensuring primitive-aware refinement and high structural fidelity in the generated child anchors.
Figure 5. Point cloud completion visualization for (a) 90_degree_elbows, (b) Bearing_Blocks, (c) Bearing_Like_Parts, (d) Machined_Blocks, and (e) L_Blocks.
Figure 6. Detailed qualitative comparison. Red boxes highlight the comparison of edge details.
Figure 7. Mesh reconstruction visualization for (a) 90_degree_elbows, (b) Bearing_Blocks, (c) Bearing_Like_Parts, (d) Machined_Blocks, and (e) L_Blocks.
Figure 8. Qualitative visualization of point cloud completion under varying noise levels. The top row illustrates the input partial point clouds corrupted by synthetic Gaussian noise ranging from 0% (clean) to 3%. The bottom row displays the corresponding complete point clouds generated by our method.
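The corruption shown in Figure 8 can be reproduced with per-point Gaussian perturbations. The text does not state the reference scale for the noise percentage, so the sketch below assumes it is relative to the bounding-box diagonal of each cloud (a common convention); `add_gaussian_noise` is an illustrative helper, not part of the authors' released code:

```python
import numpy as np

def add_gaussian_noise(points, level, rng=None):
    """Perturb an (N, 3) cloud with zero-mean Gaussian noise.

    The standard deviation is `level` (e.g. 0.01 for the 1% setting)
    times the bounding-box diagonal of the cloud. Assumed convention:
    the paper does not specify the reference scale.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Diagonal length of the axis-aligned bounding box.
    diag = np.linalg.norm(points.max(axis=0) - points.min(axis=0))
    return points + rng.normal(scale=level * diag, size=points.shape)
```

With `level=0.0` the input is returned unchanged, matching the "Clean (0%)" column of Table 2; `level=0.03` reproduces the strongest corruption in Figure 8.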
Table 1. Comparison of different methods on two metrics: CD-ℓ1 (×100) and F-score@1%. Bold indicates the best performance.
Method       CD-ℓ1   F-Score
PoinTR       5.21    0.241
PMPNet       3.08    0.329
SeedFormer   3.92    0.344
ODGNet       3.38    0.336
Ours         2.24    0.443
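Both metrics in Table 1 are standard for completion benchmarks: the ℓ1 Chamfer distance symmetrically averages nearest-neighbor distances between prediction and ground truth, and F-score@1% is the harmonic mean of precision and recall under a distance threshold of 1% (taken here as 0.01 on normalized coordinates, an assumption, since the normalization is not restated in this section). A brute-force NumPy sketch:

```python
import numpy as np

def chamfer_l1(pred, gt):
    """Symmetric l1 Chamfer distance between (N, 3) and (M, 3) clouds.
    Brute-force O(N*M) pairwise distances; fine for a few thousand points.
    Note: some papers sum the two directional terms, others average them."""
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def f_score(pred, gt, tau=0.01):
    """F-score@tau: harmonic mean of precision and recall at threshold tau."""
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    precision = (d.min(axis=1) < tau).mean()  # predicted points near the GT
    recall = (d.min(axis=0) < tau).mean()     # GT points covered by the prediction
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For the values reported in Table 1, the Chamfer distance is additionally scaled by 100, per the table caption.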
Table 2. Quantitative comparison of CD-ℓ1 (×100) under different noise levels. Lower is better, and bold indicates the best performance.
Method       Clean (0%)   Noise 1%   Noise 2%   Noise 3%
SeedFormer   3.92         5.37       7.79        9.84
ODGNet       3.38         4.77       7.89       10.72
Ours         2.24         3.85       6.71        8.58
Table 3. Ablation study: quantitative comparison of model variants.
Ablation Setting   CD-ℓ1   F-Score
Base               3.53    0.339
w/o SLR            2.46    0.418
w/o SC-SPD         3.26    0.394
Full               2.24    0.443
Table 4. Complexity analysis.
Method       Params (M)   Inference Time (ms)
PoinTR       30.9         12.195
PMPNet        5.89        21.105
SeedFormer    3.20        36.736
ODGNet       11.5         48.315
Ours         21.2         18.033

Share and Cite

MDPI and ACS Style

Yao, C.; Huang, K.; Lv, K.; Ye, S.; Zhuang, J. Structure-Prior-Guided Point Cloud Completion for Industrial Mechanical Components. Appl. Sci. 2026, 16, 2713. https://doi.org/10.3390/app16062713
