Article

Reconstruction, Segmentation and Phenotypic Feature Extraction of Oilseed Rape Point Cloud Combining 3D Gaussian Splatting and CKG-PointNet++

1 College of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, China
2 Key Laboratory of Advanced Perception and Intelligent Control of High-End Equipment, Ministry of Education, Anhui Polytechnic University, Wuhu 241000, China
3 College of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China
* Author to whom correspondence should be addressed.
Agriculture 2025, 15(12), 1289; https://doi.org/10.3390/agriculture15121289
Submission received: 21 May 2025 / Revised: 8 June 2025 / Accepted: 11 June 2025 / Published: 15 June 2025
(This article belongs to the Section Digital Agriculture)

Abstract

Phenotypic traits at the seedling stage of oilseed rape, and their extraction, play a crucial role in assessing oilseed rape growth, breeding new varieties and estimating yield. Manual phenotyping consumes considerable labor and time, and the measurement process itself can cause structural damage to oilseed rape plants. Existing crop phenotype acquisition methods are limited in throughput and accuracy, making it difficult to meet the demands of phenotypic analysis. We propose an oilseed rape segmentation and phenotyping measurement method based on 3D Gaussian splatting with an improved PointNet++. The CKG-PointNet++ network integrates CGLU and FastKAN convolution modules in the SA layers, and introduces MogaBlock and a self-attention mechanism in the FP layers to enhance local and global feature extraction. Experiments show that the method achieves 97.70% overall accuracy (OA) and 96.01% mean intersection over union (mIoU) on the oilseed rape point cloud segmentation task. The extracted phenotypic parameters were highly correlated with manual measurements, with R2 of 0.9843, 0.9632, 0.9806 and 0.8890 for leaf length, leaf width, leaf area and leaf inclination, and RMSE of 0.1621 cm, 0.1546 cm, 0.6892 cm2 and 2.1144°, respectively. This technique provides a feasible solution for high-throughput and rapid measurement of seedling phenotypes in oilseed rape.

1. Introduction

Rapeseed is one of the four major oilseed crops in the world and is widely used in edible oil production, fuel production and other fields [1]. The perennial planting area of rapeseed in China is about 100 million mu (approximately 6.67 million hectares), with an annual output of about 14 million tons. Rapeseed oil accounts for about half of domestic vegetable oil and is an important source of vegetable oil and vegetable protein [2]. With the growth of China’s population and the improvement of living standards, the demand for edible oil continues to increase, and the development of the oilseed rape industry is of great significance for meeting domestic demand for edible oil and reducing import dependence [3,4]. Selecting and breeding high-quality oilseed rape varieties plays an important role in improving oilseed rape yield, and oilseed rape phenotype extraction technology has key reference value for variety improvement and product upgrading [5]. Therefore, developing a high-throughput, accurate and convenient method to extract phenotypic parameters of oilseed rape is crucial for advancing oilseed rape engineering research and the sustainable development of the industry.
Current methods for obtaining plant phenotypic parameters include manual measurement, 2D image measurement and 3D point cloud measurement [6]. Manual measurement relies on hand tools such as calipers and measuring rods to measure plant phenotypic parameters directly [7], but it is slow and subjective and can cause irreversible damage to the plant during the measurement process. The two-dimensional image method is limited to parameter measurements in a two-dimensional plane [8], such as the diameter and surface area of a potted plant, and struggles to capture information such as depth and orientation in three-dimensional space, for example when plant leaves overlap and occlude one another.
XIAO et al. [9] proposed a method that uses 3D point cloud data for soybean plant organ segmentation and phenotypic parameter measurement: the 3D point cloud of soybean plants is obtained from multi-view images, and a difference-of-normals algorithm together with an improved region growing algorithm is used to segment plant organs. YANG et al. [10] used 3D laser scanning to obtain point cloud data of cotton plants at different growth stages, extracted the main stem with the random sample consensus (RANSAC) algorithm combined with a linear model and segmented leaves through region growing clustering. This method can accurately extract cotton phenotypic parameters and provides technical support for crop growth monitoring and breeding research. Liang et al. [11] proposed a method for stem and leaf segmentation and phenotypic parameter extraction of tomato seedlings from 3D point cloud data. Seedling point clouds were collected with a depth camera, and phenotypic parameters were extracted using the RANSAC algorithm together with region growing algorithms; the method was validated for tomato seedling phenotypic measurement. Yang et al. [12] proposed a phenotypic measurement system based on 3D reconstruction, acquired point cloud data of wolfberry plants from multi-view images and used a point cloud completion algorithm to address missing leaves, providing technical support for growth monitoring and yield prediction of wolfberry. Existing 3D point cloud acquisition techniques each have limitations: LiDAR laser scanning [13,14] can provide high-resolution point clouds with real-world dimensions, but the cost is high and data acquisition is slow; depth cameras such as RGB-D cameras [15,16] can quickly reconstruct 3D data, but their limited resolution yields lower-quality point clouds; and 3D reconstruction based on multi-view images [17] has high computational complexity, with image noise degrading feature point matching accuracy and thus reconstruction quality. Gaussian splatting [18] achieves 3D scene reconstruction by “splatting” each pixel of the 2D images to the corresponding position in 3D space according to a Gaussian distribution. Compared with traditional methods, it deals effectively with occlusion and illumination changes and improves the accuracy and robustness of 3D reconstruction.
In recent years, semantic segmentation based on 3D point clouds has gradually become a research hotspot in modern agricultural technology [19,20,21]. Currently, there are three main deep learning approaches to extracting point cloud features.
Voxelization-based 3D point cloud processing. Earlier work converted the point cloud into a 3D voxel grid and used a convolutional neural network (CNN) for feature extraction, and later work has made various improvements. Wang et al. [22] proposed an octree-based convolutional neural network, O-CNN, which performs CNN computations only on the sparse voxels occupied by 3D shape boundaries, effectively reducing memory consumption and improving efficiency. Graham et al. [23] introduced the submanifold sparse convolutional network (SSCN), which processes sparse voxels with submanifold sparse convolution operators to achieve efficient semantic segmentation of 3D point clouds. Meng et al. [24] proposed the voxel variational autoencoder VV-Net, which uses radial basis functions (RBFs) to encode point distributions within voxels, retaining the regular structure while capturing detailed data distributions. The voxelization-based approach converts 3D point cloud data into a voxel grid structure suitable for 3D CNN processing; however, it has several drawbacks. High-resolution voxelization leads to huge memory requirements, limiting the scalability of the model, and the voxelization process may introduce quantization errors, leading to loss of geometric detail in the point cloud and reduced processing accuracy.
3D point cloud processing based on 2D multi-view images. Some approaches process 3D point cloud data with 2D CNNs: the point cloud is first projected into multiple 2D views from different viewpoints, features are extracted with 2D convolution kernels and the per-view features are then fused. Wang et al. [25] proposed a deep learning network that combines the texture information of 2D images with the 3D point cloud to achieve target detection; it distinguishes foreground from background points using a Hough voting technique and introduces an inverse mapping layer that combines RGB texture features with point cloud geometric features. Zhai et al. [26] proposed a composite-feature-based stitching method for the low-overlap human point cloud stitching problem. The method acquires human body point cloud data through multi-view cameras and completes the accurate stitching of low-overlap point clouds by extracting internal shape features, normal-aligned radial features and salient constraint feature points. However, the multi-view projection process loses geometric detail in 3D space, which affects classification and segmentation results, and different combinations of viewpoints may lead to different results, increasing the complexity of model design.
Point-based deep learning. Both of the above approaches convert the point cloud into another data form before feature extraction, which inevitably loses detail information in 3D space. To overcome these limitations, PointNet [27], proposed in 2017, was the first deep learning model to process point cloud data directly. It achieves segmentation by learning global features of the point cloud and attains permutation and transformation invariance through max pooling and an input transformation network, but its ability to capture local spatial information is limited. The subsequent PointNet++ [28] introduces a multilevel structure to address this shortcoming, using hierarchical feature extraction and a multiscale feature fusion strategy to deal with local structure capture and non-uniform sampling. Based on this framework, a series of subsequent point cloud processing networks have been proposed. Zhang et al. [29] addressed the problems that PointNet++ fails to extract LiDAR point cloud semantic features in depth in the feature extraction phase and that max pooling causes feature loss in the feature aggregation phase; they improved the feature extraction and feature aggregation modules of PointNet++ and proposed a point cloud segmentation model based on feature offset values and an attention mechanism. Guo et al. [30] combined mechanisms such as offset attention and neighbor embedding with the local feature extraction ability of PointNet++ to improve performance on point cloud data. Xu et al. [31] used an RGB-D camera to acquire two-dimensional color and depth images of oilseed rape branches and removed the silique stalks through image preprocessing to separate the siliques from the main stem. Subsequently, a Euclidean clustering algorithm was used to segment and count the oilseed rape siliques on the fused point cloud data, providing a new method for the phenotypic analysis of oilseed rape siliques. Zhang et al. [32] used binocular stereo vision to acquire three-dimensional point clouds of oilseed rape leaves at the seedling and bolting stages. Noise was removed by combining plane fitting and pass-through filtering, and the leaves were segmented using the region growing algorithm and the LCCP clustering method. The results showed that the method is feasible for early extraction of phenotypic parameters in oilseed rape.
As shown in Table 1, although all three methods have obvious advantages in agricultural application scenarios, the point-based deep learning approach is particularly suitable for phenotyping small-scale crops with complex structures such as oilseed rape. However, current research on 3D point-cloud reconstruction and segmentation for rapeseed remains relatively limited, primarily due to three core bottlenecks: (1) the high cost and maintenance requirements of precision hardware—such as LiDAR, laser scanners or multi-view camera rigs—impede large-scale deployment in breeding trials or field inspections; (2) the inherently complex branching and foliage structure of rapeseed plants, combined with challenging acquisition environments, frequently produces point clouds with holes, missing regions or noise, thereby undermining the reliability of subsequent geometric reconstruction and feature extraction; and (3) existing segmentation algorithms struggle to distinguish fine structures (e.g., stems versus leaves) in small, intricate crops, resulting in insufficient accuracy when extracting phenotypic parameters such as leaf area, which in turn compromises breeding decisions and high-throughput screening. In light of these challenges—which collectively restrict the practical application of 3D point-cloud technology in rapeseed phenotyping—this study employs 3D Gaussian splatting for rapeseed reconstruction and integrates an improved CKG-PointNet++ model for point-cloud segmentation, thereby enabling high-throughput, rapid and accurate extraction of rapeseed phenotypic parameters and providing robust technical and data support for subsequent breeding research.

2. Materials and Methods

In this study, we propose a complete process of 3D reconstruction, point cloud segmentation and phenotypic parameter extraction of oilseed rape seedlings based on 3D Gaussian splatting [18] and CKG-PointNet++ technology. The process is mainly divided into three stages.
  • Collecting multi-view image sets of oilseed rape, feeding them into the Gaussian splatting pipeline and generating Gaussian point clouds that represent the 3D morphology of the plants through dense modeling of the scene and Gaussian voxelization.
  • A preprocessing step is performed on the generated point cloud, combining statistical outlier removal (SOR) [33] and radius outlier filter (ROL) [34] to remove noisy and abnormal point clouds, and then using the RANSAC [35] algorithm to segment the target plant point cloud. Based on the preprocessed point cloud, segmentation is performed using the improved CKG-PointNet++ model. The fine-grained geometric and local semantic information in the point cloud is effectively captured at the feature extraction (SA) layer, and the global information is gradually integrated while maintaining the local details. The local information is effectively integrated with the global context in the feature propagation (FP) layer to improve the overall expression capability and segmentation accuracy.
  • Key phenotypic features, such as plant height, leaf length, leaf width, leaf area and leaf inclination, were calculated based on the segmented oilseed rape point cloud. The results are compared and analyzed with the real measured phenotypic parameters to assess the measurement accuracy and reliability of the process.

2.1. Oilseed Rape Data Collection

The oilseed rape image data were collected in a real field scene between 10:00 a.m. and 4:00 p.m., with natural daylight (ambient illuminance of about 60,000–80,000 lux) as the lighting condition. All images were captured using a smartphone (35 mm equivalent focal length, resolution 1920 × 1080) in manual mode, with the ISO set to 50–2000 (automatically adjusted but limited to this range) to avoid motion blur. Each oilseed rape seedling was photographed at a distance of 30–60 cm from the plant in both horizontal and overhead views, and the photographing proceeded in approximately 3–4° steps around the plant in a full 360° circle, i.e., 80–120 consecutive images were systematically (non-randomly) captured from multiple angles for each plant. All photographed images of each oilseed rape plant were placed in a separate folder and used in the subsequent 3D reconstruction. In total, image data were collected from over one hundred oilseed rape plants, totaling approximately 11,000 images. A typical shot is shown in Figure 1.

2.2. Oilseed Rape Point Cloud Generation

2.2.1. 3D Reconstruction of Point Clouds

In this paper, we use the 3D Gaussian splatting method for the 3D reconstruction of oilseed rape crops, which is relatively new compared with traditional 3D reconstruction methods. By flexibly expressing the data with probability distributions during reconstruction, it not only improves the detail and quality of the surface reconstruction but also has advantages in computational efficiency and real-time performance, achieving better results for the reconstruction of complex oilseed rape scenes.
The hardware environment used for image pose estimation, 3D Gaussian reconstruction, point cloud preprocessing and PointNet++ model training consists of an Intel Core™ i5-13490F CPU (10 cores, 16 threads and base clock 2.5 GHz) and an NVIDIA GeForce RTX 4070 GPU (16 GB VRAM). The operating system is Windows 10 Home. The 3D Gaussian Splatting pipeline was executed under Python 3.8, CUDA 11.6, and PyTorch 1.12.1, whereas the PointNet++–based point cloud segmentation models were run under Python 3.8, CUDA 11.3 and PyTorch 1.12.0.
As shown in Figure 2, COLMAP [36] is first used to estimate the intrinsic and extrinsic camera parameters of each image from the multi-view images, establish the geometric relationships between viewing angles and recover the initial 3D structure of the scene and the camera positions. After obtaining the initial point cloud distribution, each point is modeled as a 3D Gaussian distribution defined by its position, covariance matrix, color and opacity. Based on these Gaussian parameters, a differentiable rendering function renders the image from a specific viewpoint, and the parameters are continuously optimized through back propagation. Meanwhile, an adaptive density control mechanism dynamically adjusts the number and distribution of Gaussians to ensure the accuracy and efficiency of the point cloud model. Finally, after several rounds of iterations, the Gaussian distributions gradually converge to produce a high-quality 3D point cloud model.
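To make this optimization loop concrete, the sketch below shows a simplified container for the per-point Gaussian parameters described above (position, covariance factored into scale and rotation, color and opacity) and a single optimization step. It is a minimal illustration only: `render_fn` is a hypothetical placeholder for the differentiable rasterizer, and the real 3D Gaussian splatting implementation additionally performs adaptive densification and pruning in CUDA.

```python
import torch

class GaussianCloud(torch.nn.Module):
    """Simplified container for the learnable parameters of the 3D Gaussians:
    center position, anisotropic scale + rotation (the covariance), color, opacity."""
    def __init__(self, init_xyz: torch.Tensor):
        super().__init__()
        n = init_xyz.shape[0]
        self.xyz = torch.nn.Parameter(init_xyz.clone())            # (N, 3) centers
        self.log_scale = torch.nn.Parameter(torch.zeros(n, 3))     # per-axis scales
        self.rotation = torch.nn.Parameter(torch.randn(n, 4))      # quaternions
        self.color = torch.nn.Parameter(torch.rand(n, 3))          # RGB (or SH coefficients)
        self.opacity_logit = torch.nn.Parameter(torch.zeros(n, 1))

def training_step(gaussians, camera, target_image, render_fn, loss_fn, optimizer):
    """One optimization iteration: render differentiably from `camera`, compare with
    the captured image and back-propagate into the Gaussian parameters.
    `render_fn` is a hypothetical stand-in for the CUDA rasterizer."""
    rendered = render_fn(gaussians, camera)
    loss = loss_fn(rendered, target_image)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # (The full method also densifies/prunes Gaussians here via adaptive density control.)
    return loss.item()
```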

2.2.2. Point Cloud Processing

First, the generated point cloud is exported to a txt format containing XYZ coordinates. Although the model has been rendered by Gaussian splatting, the generated point cloud may still contain redundant or low-quality points. The statistical outlier removal (SOR) algorithm, which adapts dynamically to local density changes based on the statistical distribution, effectively removes noise and useless points and is suitable for oilseed rape point clouds of varying density in complex scenes. The radius outlier filter (ROL) algorithm, based on a k-d tree, then counts the number of neighboring points within a specified radius and removes points that fall below a threshold as outliers; it effectively identifies and removes outliers in the oilseed rape point cloud. Finally, the ground point cloud is segmented using the RANSAC algorithm, which randomly fits a plane model to the point cloud, computes the distance from each point to the plane and treats points whose distance is smaller than a threshold as inliers. Within a limited number of iterations, the plane with the most inliers is taken as the final ground model. The reconstructed oilseed rape scenes contain a large number of ground points, and RANSAC plane detection effectively separates the oilseed rape plants from the ground. The processed point cloud is shown in Figure 3.
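As a concrete illustration of this preprocessing chain, the sketch below uses the Open3D library (the paper does not state which implementation was used, and all parameter values here are illustrative, not the values used in the study) to apply SOR, radius outlier filtering and RANSAC ground removal.

```python
import open3d as o3d

def preprocess_rape_cloud(path: str) -> o3d.geometry.PointCloud:
    pcd = o3d.io.read_point_cloud(path)

    # Statistical outlier removal (SOR): drop points far from their neighbors.
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

    # Radius outlier filter (ROL): drop points with too few neighbors in a sphere.
    pcd, _ = pcd.remove_radius_outlier(nb_points=16, radius=0.05)

    # RANSAC plane fitting: treat the dominant plane as ground and remove it.
    _, ground_idx = pcd.segment_plane(distance_threshold=0.01,
                                      ransac_n=3,
                                      num_iterations=1000)
    return pcd.select_by_index(ground_idx, invert=True)
```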
Through the combined use of the aforementioned methods, outliers and noise are removed from the rapeseed point clouds. For a single rapeseed point cloud, the integrated processing pipeline—comprising SOR, ROL, and RANSAC algorithms—takes approximately 20–30 s to execute on this platform. Individual rapeseed point clouds contain on average 15,000–20,000 points; the resulting seedling-stage dataset comprises 72 point clouds for training and 11 point clouds for validation, which are used for subsequent segmentation tasks.

2.3. Oilseed Rape Point Cloud Segmentation Model

2.3.1. CKG-PointNet++

In this study, CKG-PointNet++ is used as an improved version of PointNet++ for oilseed rape point cloud segmentation. PointNet++ is a hierarchical network designed on the basis of the original PointNet, and its main goal is to capture the local geometric structure information in the point cloud to make up for the deficiency of PointNet in extracting global features. Its core can be divided into two parts:
  • Feature Extraction (Set Abstraction, SA) Layer. It uses furthest point sampling (FPS) to select a set of representative sampling points from the original point cloud, and uses ball query to construct local regions from the sampling points and nearby points. Local feature extraction is performed in each local region, a multilayer perceptron is used to transform the features of the local points, and the local feature representations are aggregated through the max pooling operation.
By cascading multiple SA layers, the network is able to extract layer-by-layer more advanced local features at different scales and gradually abstract the point cloud information into a global representation.
  • Feature Propagation (FP) Layer. The FP layer up-samples the low-resolution abstract features back to the original point cloud size and fuses fine-grained information from the earlier SA layers, mainly through interpolation and cross-layer skip connections, to generate the final feature representation for each point. This is finally fed into a multilayer perceptron (MLP) for segmentation prediction (a minimal sketch of the set abstraction step is given after this list).
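The following sketch illustrates one set abstraction step in simplified form for a single (unbatched) point cloud. It is not the authors' implementation; the helper functions and the shared `mlp` argument (e.g., a small `nn.Sequential` of `Linear`/`ReLU` layers) are illustrative only.

```python
import torch

def farthest_point_sampling(xyz: torch.Tensor, n_samples: int) -> torch.Tensor:
    """xyz: (N, 3). Returns indices of n_samples points chosen by FPS."""
    n = xyz.shape[0]
    chosen = torch.zeros(n_samples, dtype=torch.long, device=xyz.device)
    dist = torch.full((n,), float("inf"), device=xyz.device)
    farthest = int(torch.randint(n, (1,)))
    for i in range(n_samples):
        chosen[i] = farthest
        d = torch.sum((xyz - xyz[farthest]) ** 2, dim=-1)
        dist = torch.minimum(dist, d)       # keep distance to the nearest chosen point
        farthest = int(torch.argmax(dist))  # next center: the point farthest from the set
    return chosen

def ball_query(xyz, centers, radius, k):
    """For each center, indices of up to k points within `radius` (padded if fewer)."""
    d2 = torch.cdist(centers, xyz) ** 2                 # (S, N) squared distances
    groups = []
    for row in d2:
        idx = torch.nonzero(row < radius ** 2).flatten()
        if idx.numel() == 0:
            idx = torch.argmin(row).reshape(1)          # fall back to the nearest point
        idx = idx[:k]
        idx = torch.cat([idx, idx[:1].repeat(k - idx.numel())])   # pad to k
        groups.append(idx)
    return torch.stack(groups)                          # (S, k)

def set_abstraction(xyz, feats, n_samples, radius, k, mlp):
    """One SA step: sample centers, group neighbors, apply a shared MLP, max-pool."""
    center_idx = farthest_point_sampling(xyz, n_samples)
    centers = xyz[center_idx]                           # (S, 3)
    group_idx = ball_query(xyz, centers, radius, k)     # (S, k)
    grouped = xyz[group_idx] - centers.unsqueeze(1)     # local coordinates
    if feats is not None:
        grouped = torch.cat([grouped, feats[group_idx]], dim=-1)
    local = mlp(grouped)                                # shared point-wise MLP, (S, k, C)
    return centers, local.max(dim=1).values             # max pooling per local region
```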
Although PointNet++ greatly improves the capture of local structural information, it still has some shortcomings. It uses a simple multilayer perceptron structure to extract local features, which has limited ability to capture complex local relationships and fine-grained structures. Using max pooling to aggregate features over local point sets may erase useful detail information in the local area, affecting the accuracy of subsequent tasks. Uneven sampling and noise in the point cloud may also affect the final results. As for the FP part, PointNet++ may lose detail information when recovering feature representations during up-sampling.
In order to overcome the limitations of PointNet++ mentioned above, this paper proposes the CKG-PointNet++ model. The CK-SA part combines the CGLU (convolutional gated linear unit) and its pooling mechanism with the feature extraction layer, so that the local features contain more detailed information, improving feature fusion and robustness. At the same time, FastKAN convolution and an attention mechanism are introduced to capture richer spatial relations in the local area, overcoming the limitation of traditional MLPs for local feature extraction. The G-SA part integrates the GConv module into the feature extraction layer to address the limitations of relying solely on MLPs in the SA module for local feature extraction and information regulation; this enhancement improves the model’s ability to capture complex local structures and enhances its robustness.
The Mk-FP part combines MogaBlock with the feature propagation layer to adaptively aggregate features of different scales and viewpoints through multi-order depthwise convolution and a gating mechanism, which effectively recovers and reconstructs the up-sampled feature representations and addresses the loss of detail information and the insufficient expression of local features in the up-sampling process of the original PointNet++. On this basis, k-NN-based edge attention is introduced, which constructs dynamic graphs with learnable weights; the edge attention reduces the influence of point cloud sparsity or noise on the features through these learnable weights. These additions to the FP part improve the quality of the up-sampled features and strengthen contextual association and geometric perception. The model structure is shown in Figure 4.

2.3.2. CK-SA

In the oilseed rape point cloud, the structures at different layers are complex and detail information is difficult to capture completely. To address this, the GLU_Pooling [37] and FastKAN [38] mechanisms are introduced in the SA1 and SA3 layers. The main structure is shown in Figure 5. The input includes the point cloud position data (xyz) and feature data (points); new point cloud features are obtained by sampling and grouping, the features are further refined by the FastKAN attention convolution and by the pooling operation combined with CGLU, and the final feature of each sample point is obtained by global max pooling over the features of its neighborhood after the convolutional layers.
FastKAN is a neural network that incorporates radial basis function (RBF) features into a 2D convolutional layer [38]; its main structure is shown in Figure 6. First, the input feature maps are grouped, and each group uses a two-path hybrid computing architecture. In the base path, the feature map is activated, passed through the base convolutional layer and normalized after convolution to stabilize the training process. In the spline path, a radial basis function (RBF) is applied to the normalized feature map to map it into a high-dimensional space, generating spline basis functions, which are convolved by the corresponding spline convolutional layer to capture complex nonlinear features. Finally, the outputs of the two paths are summed to fuse the linear and nonlinear features. After feature fusion, the results of all groups are concatenated along the channel dimension to form the final output feature map.
The RBF maps the input feature map into a high-dimensional space to generate the spline basis functions. Specifically, it generates a number of equally spaced centers over a predefined interval, calculates the distance of the input tensor x from all basis centers and finally applies a Gaussian kernel function to generate the basis response. The formula is as follows:
$\mathrm{RBF}(x) = \exp\left(-\left(\frac{x - \mathrm{grid}}{\mathrm{denominator}}\right)^{2}\right)$
where grid denotes the predefined grid points and denominator controls the degree of smoothing of the RBF. In FastKAN, the RBF outputs are transformed and then unfolded along an additional dimension; the spline convolution kernel learns the mapping of the internal basis features along that dimension, and the output of the spline convolution is finally fused with the base path output.
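A minimal sketch of this RBF expansion is given below, assuming equally spaced centers on a fixed interval (the grid range and count are illustrative defaults, not values taken from the paper).

```python
import torch

def rbf_basis(x: torch.Tensor, grid_min=-2.0, grid_max=2.0, num_grids=8, denominator=None):
    """Gaussian RBF response of x against equally spaced centers, i.e.
    RBF(x) = exp(-((x - grid) / denominator)^2); output shape is (*x.shape, num_grids)."""
    grid = torch.linspace(grid_min, grid_max, num_grids, device=x.device, dtype=x.dtype)
    if denominator is None:
        denominator = (grid_max - grid_min) / (num_grids - 1)  # center spacing controls smoothing
    return torch.exp(-((x.unsqueeze(-1) - grid) / denominator) ** 2)
```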
FastKAN adopts a group-parallel computing mechanism, which enables different groups to learn differentiated feature patterns and effectively reduces GPU memory usage. Moreover, each group uses a two-path hybrid computing architecture, in which the linear path maintains translation invariance and the spline path enhances the nonlinear expression ability; the two are complementary and are fused together. Compared with the original approach of using only an MLP, the introduction of the FastKAN convolution mechanism better reflects the correlation between points in the neighborhood, addresses the insufficient capture of local details in the SA part and achieves finer feature extraction while keeping the computational cost comparable.
In order to improve the local detail extraction and information smoothing ability when segmenting the oilseed rape point cloud, the CGLU [37] module is incorporated in the feature extraction layer. It combines a gating mechanism with depthwise separable convolution to realize efficient feature selection and spatial information fusion. The structure of the module is shown in Figure 7. Pooling is adopted as the token mixer to extract spatial features at low computational cost: the average pooling suppresses high-frequency noise in the point cloud and extracts local average features, while the subtraction operation preserves gradient information. The module fuses the GLU branch with a joint spatial-channel gating mechanism, which splits the channels into two parts, captures spatial relationships through convolution and dynamically adjusts them in combination with the channel dimension; compared with a traditional attention mechanism, its computational complexity is significantly reduced. The depthwise separable convolution uses a 3 × 3 kernel to capture spatially localized patterns and compensate for details that may be lost through the global smoothing of pooling. The module uses residual connections to preserve the original inputs and mitigate gradient vanishing in deep networks; combining low-order raw features with high-order processed features improves the representation capability.
As can be seen from the figure, CGLU adopts a dual residual architecture, which introduces residual connections independently in the token mixer and GLU branches to synergize low-order spatial features (pooling output) with high-order semantic features (GLU output). The two branches use independent LayerNorm and DropPath to avoid cross-branch interaction and increase the training convergence speed.
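As a rough illustration of the gated-convolution idea behind CGLU (not the authors' exact module), the sketch below splits the projected per-point features into a value branch and a depthwise-convolution gate and adds the result back through a residual connection; the channel sizes and normalization choice are assumptions.

```python
import torch
import torch.nn as nn

class ConvGLU(nn.Module):
    """Minimal convolutional gated linear unit over per-point features (B, C, N):
    one half of the projection carries values, the other becomes a depthwise-conv gate."""
    def __init__(self, channels: int, hidden: int):
        super().__init__()
        self.norm = nn.BatchNorm1d(channels)
        self.proj_in = nn.Conv1d(channels, 2 * hidden, kernel_size=1)
        self.dwconv = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1, groups=hidden)
        self.proj_out = nn.Conv1d(hidden, channels, kernel_size=1)

    def forward(self, x):                          # x: (B, C, N)
        v, g = self.proj_in(self.norm(x)).chunk(2, dim=1)
        gate = torch.sigmoid(self.dwconv(g))       # spatially aware gating signal
        return x + self.proj_out(v * gate)         # residual connection
```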

2.3.3. G-SA

The GConv [39] module is incorporated in the SA2 and SA4 layers. The structure is shown in Figure 8. The module accepts a feature map tensor as input and normalizes the input features, which are then split into two branches. In the first branch, the tensor passes through a 1 × 1 convolutional layer and a depthwise separable convolutional layer for local feature extraction and outputs a feature map (feature_map). In the second branch, the tensor passes through a 1 × 1 convolutional layer and an activation function to generate a gating signal (gate). The outputs of the two branches are multiplied element-wise and fed into a 1 × 1 convolutional layer, and finally the projected feature map is added to the input tensor, which helps mitigate gradient vanishing in deep networks and enhances feature delivery.
PointNet++ relies on an MLP to extract point features in the local region and ignores the continuity and spatial structure of points within the neighborhood. The GConv module uses depthwise separable convolution in the feature extraction part to better capture the local geometric structure and enrich the expression of local feature information. The module contains a gating branch that uses nonlinear activation to modulate the features after convolution; this mechanism adaptively selects useful information while suppressing redundant information, mitigating the information loss caused by using only MLPs. Normalization, together with a residual connection in the output layer, helps alleviate gradient vanishing and enhances network expressiveness while keeping the computational cost low.
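A simplified sketch of such a gated convolution block over per-point features is shown below, under the assumption that 1D (per-point) convolutions and SiLU activation are acceptable stand-ins for the exact layers used in GConv.

```python
import torch
import torch.nn as nn

class GConvBlock(nn.Module):
    """Gated convolution sketch: a 1x1 + depthwise conv value branch is modulated by a
    1x1 conv + activation gate, projected back and added to the input as a residual."""
    def __init__(self, channels: int):
        super().__init__()
        self.norm = nn.BatchNorm1d(channels)
        self.value = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=1),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1, groups=channels),
        )
        self.gate = nn.Sequential(nn.Conv1d(channels, channels, kernel_size=1), nn.SiLU())
        self.proj = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x):                          # x: (B, C, N)
        h = self.norm(x)
        return x + self.proj(self.value(h) * self.gate(h))   # gated features + residual
```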

2.3.4. FP-Moga

The feature propagation module is responsible for cross-layer feature fusion and up-sampling in the model, propagating features from low-resolution point clouds to high-resolution point clouds, and splicing deep semantic features with shallow geometric features to achieve multi-scale information fusion. In the complex scene task of processing oilseed rape point clouds, MogaBlock [40] is added for joint spatial-channel modeling to capture multi-scale contextual information.
The structure of MogaBlock is shown in Figure 9, which consists of two branches: spatial attention and channel aggregation. Spatial attention in the spatial aggregation part fuses the extracted multi-scale features by adaptive pooling and residual decomposition of features using a gating mechanism. Channel MLP in the channel aggregation part extends the channels through deep convolution and adds the features after spatial aggregation, and its feature decomposition operation reduces redundant information and improves the efficiency of inter-channel interaction. In order to stabilize training and suppress overfitting, each residual branch introduces learnable Layer_scale factors (L1, L2) to dynamically calibrate the feature amplitude, and combines with double DropPath regularization to decentralize the learning path to improve the model robustness.
The structure of the spatial aggregation module (spatial attention) is shown in Figure 10. The whole consists of feature decomposition, multi-order depth-separable convolution and feed-forward gating to achieve efficient multi-scale feature fusion through multi-order spatial modeling with adaptive gating. In the feature decomposition (Feat_decompose) stage, global average pooling is done on the input features to extract the channel-level statistic x_global, and the discriminative nature of the local features is enhanced by the residual calibration x + σ (x-x_global), where σ is the learnable channel-by-channel scaling factor, ElementScale. Parallel multiorder depth-separable convolution (MultiOrderDWConv) groups the channels to capture local details, mid-range and long-range contexts using depth convolutions with different dilation rates; dynamically gated attention branching generates a spatial-channel adaptive attention mask by 1 × 1 convolution, which is multiplied point-by-point with multiscale features for adaptive feature selection. Finally, the 1 × 1 convolution is used to recover the channel dimension and summed with the residual connection. Compared with the traditional attention mechanism, Moga replaces the high complexity global interaction by multi-order convolution while retaining the multi-scale modeling capability; compared with the static channel weighting, the Moga gating mechanism achieves a finer feature calibration, which is especially suitable for semantic perception tasks in complex scenarios, such as oilseed rape point cloud.
The structure of the channel aggregation module (channel MLP) is shown in Figure 10, which reconstructs the channel information fusion paradigm of the feedforward network through global-local channel interaction. The module first extends the channel to high dimensions using 1 × 1 convolution, then extracts local spatial features using depth separable convolution (dwconv) and introduces GELU activation to enhance the nonlinear representation. In the channel decomposition (Feat_decompose) stage, the high-dimensional features are compressed into single-channel statistics to achieve dynamic modulation of the global context on the local channel response and efficient aggregation of cross-channel information. Joint spatial-channel modeling fuses the spatial structure retained by Dwconv with the global information of channel decomposition, breaking through the limitation of traditional FFN that relies only on linear combinations of neighboring channels, and improving the screening ability of discriminative channels.
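The sketch below illustrates only the core multi-order gating idea (channel groups processed by depthwise convolutions with increasing dilation, then modulated by a learned gate); it omits the feature decomposition, layer-scale and DropPath components described above, and the channel split is an assumption.

```python
import torch
import torch.nn as nn

class MultiOrderGate(nn.Module):
    """Multi-order spatial aggregation sketch: channel groups pass through depthwise
    convolutions with increasing dilation (local / mid-range / long-range context),
    and the merged context is modulated by a 1x1-conv gate. Assumes channels >= 3."""
    def __init__(self, channels: int):
        super().__init__()
        c = channels // 3
        self.local = nn.Conv1d(c, c, 3, padding=1, groups=c)
        self.mid = nn.Conv1d(c, c, 3, padding=2, dilation=2, groups=c)
        self.far = nn.Conv1d(channels - 2 * c, channels - 2 * c, 3,
                             padding=3, dilation=3, groups=channels - 2 * c)
        self.gate = nn.Conv1d(channels, channels, 1)
        self.proj = nn.Conv1d(channels, channels, 1)

    def forward(self, x):                          # x: (B, C, N)
        c = x.shape[1] // 3
        a, b, d = x[:, :c], x[:, c:2 * c], x[:, 2 * c:]
        ctx = torch.cat([self.local(a), self.mid(b), self.far(d)], dim=1)
        return x + self.proj(torch.sigmoid(self.gate(x)) * ctx)   # gated residual
```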

2.3.5. FP-Self-Attention

Oilseed rape leaves often shade each other and are tightly arranged, and the smooth interpolation of the original FP blurs the boundaries of these fine structures, affecting subsequent feature extraction and segmentation accuracy. The FP part does not take into account geometric information such as the relative directions and normals between points, which makes it difficult to recover the detailed features of complex local shapes. The FP uses inverse distance weighted interpolation based on Euclidean distances, but in real scenes feature mixing across semantic boundaries leads to boundary blurring and misclassification. Moreover, inverse distance interpolation assigns weights based only on geometric distance and cannot assess the value of neighboring features to the current point, making it difficult to suppress noise and irrelevant neighboring data.
A self-attention mechanism is therefore introduced within the FP; the structure is shown in Figure 11. A local k-NN graph is constructed on the interpolated feature map by the knn function to obtain the k nearest neighbor features of each point [41]. The difference between the center point features and the neighbor features is computed to obtain edge features. Separate attention branches are designed for the center features and the edge features: the center features are expanded after convolution on the self-attention branch, and the edge features are convolved on the neighbor attention branch. Neighbor weights (coefs) are obtained by summing the outputs of the two attention branches, and the aggregated features are obtained as the weighted sum of the edge features with these coefs. Finally, the aggregated features are merged with the original feature map to form residual-enhanced, attention-weighted features.
Adding self-attention to the FP part breaks through the single ‘geometric distance only’ perspective and uses feature-driven dynamic weighting to propagate fine-grained information more accurately and robustly, improving the performance of PointNet++ in segmentation tasks. With local k-NN attention, the network can adaptively assign a different weight to each neighbor and model the geometric and semantic associations between points more finely, and the improved FP balances global interpolation with local details. The original inverse distance weighted interpolation is retained to propagate coarse-scale global information to the fine scale, while the attention component enables the final fine-scale features to contain both cross-layer global semantics and rich local structural details. The attention-aggregated features are added to the original concatenated features as residual enhancement, facilitating gradient flow and accelerating convergence.
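A minimal sketch of this k-NN edge attention follows, assuming a brute-force neighbor search and simple linear attention branches (the actual module may differ in its convolutions and normalization).

```python
import torch
import torch.nn as nn

class KNNEdgeAttention(nn.Module):
    """k-NN edge attention sketch for refining interpolated FP features: attention
    weights are the sum of a center branch and an edge branch, and the weighted
    edge features are added back to the input as a residual enhancement."""
    def __init__(self, channels: int, k: int = 16):
        super().__init__()
        self.k = k
        self.center_att = nn.Linear(channels, channels)
        self.edge_att = nn.Linear(channels, channels)

    def forward(self, xyz, feats):                      # xyz: (N, 3), feats: (N, C)
        dist = torch.cdist(xyz, xyz)                    # brute-force pairwise distances
        idx = dist.topk(self.k, largest=False).indices  # (N, k) neighbors (includes self)
        edge = feats[idx] - feats.unsqueeze(1)          # edge features, (N, k, C)
        coefs = torch.softmax(
            self.center_att(feats).unsqueeze(1) + self.edge_att(edge), dim=1)
        return feats + (coefs * edge).sum(dim=1)        # residual-enhanced features
```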

2.4. Model Training and Performance Evaluation

(1) Model Training Parameters
The point cloud dataset is split into training and test sets in an 8:2 ratio. The hyperparameters are configured as follows: the batch_size is 32, the number of epochs is fixed at 32, the learning rate uses the default 1 × 10−4, the Adam optimizer (PyTorch v1.12.0; Meta Platforms, Inc., Menlo Park, CA, USA) is used and the weight decay coefficient is fixed at 0.7.
(2) Three-Dimensional Reconstruction Evaluation
For 3D reconstruction tasks, the model is evaluated using metrics such as the peak signal-to-noise ratio (PSNR), reconstruction loss (Loss) and mean absolute error (L1 Loss). The definitions of these metrics are as follows:
$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{I}_i - I_i\right)^{2}$
$\mathrm{PSNR} = 10\log_{10}\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}$
$L_{L1} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{I}_i - I_i\right|$
$\mathrm{SSIM}(x,y) = \frac{\left(2\mu_x\mu_y + C_1\right)\left(2\sigma_{xy} + C_2\right)}{\left(\mu_x^{2} + \mu_y^{2} + C_1\right)\left(\sigma_x^{2} + \sigma_y^{2} + C_2\right)}$
$\mathrm{Loss} = (1-\lambda)\cdot L_{L1} + \lambda\cdot\left(1 - \mathrm{SSIM}\right)$
Mean squared error (MSE) is used to measure the overall difference between the predicted results and the ground truth images, emphasizing the global fitting accuracy of the model. PSNR (peak signal-to-noise ratio) is commonly used to evaluate image reconstruction quality; higher values indicate better image fidelity and it often serves as a key metric for model validation. L1 Loss (mean absolute error) represents the average absolute difference between predicted and true values, and is widely applied in image reconstruction and rendering tasks due to its relatively stable convergence. SSIM [42] (structural similarity index) evaluates image similarity from the perspectives of luminance, contrast, and structure, helping to maintain structural consistency during training. Total loss (Loss) is typically formulated as a weighted combination of multiple error terms and serves as the optimization objective during model training, guiding the update of model parameters.
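For reference, a minimal sketch of how the PSNR and the combined L1/SSIM training loss can be computed is shown below; the SSIM value itself is assumed to come from an external implementation, and the λ value is illustrative rather than the one used in the paper.

```python
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    """PSNR = 10 * log10(MAX^2 / MSE) for images scaled to [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)

def reconstruction_loss(pred, target, ssim_value, lam=0.2):
    """Loss = (1 - lam) * L1 + lam * (1 - SSIM); `ssim_value` is computed externally
    (e.g., with an SSIM implementation from an image-quality library)."""
    l1 = torch.abs(pred - target).mean()
    return (1.0 - lam) * l1 + lam * (1.0 - ssim_value)
```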
(3) Semantic Segmentation Evaluation
For the point cloud semantic segmentation task, the model is evaluated using average recall (AR), the F1 score and the evaluation loss. The metrics are defined as follows:
$\mathrm{AR} = \frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c + FP_c}$
$R = \frac{TP}{TP + FN}$
$F_1 = \frac{2(P \times R)}{P + R}$
$P = \frac{TP}{TP + FP}$
where TP (true positive) denotes the number of samples correctly predicted as positive by the model, i.e., the number of points predicted to be in a certain category and actually in that category; FP (false positive) denotes the number of negative samples incorrectly predicted as positive, i.e., the number of points predicted to be in a certain category but actually not in that category; and C denotes the total number of categories. TPc denotes the number of points in the cth category that are correctly predicted by the model to be in that category, and FPc and FNc are defined analogously per category. FN (false negative) denotes the number of positive samples incorrectly predicted as negative, i.e., the number of points that actually belong to a certain category but are predicted by the model to be in another category. P denotes the precision corresponding to each category.
The average recall reflects the proportion of all positive class samples that are correctly predicted as positive by the model, measuring the model’s ability to identify positive samples. The F1 score combines precision and recall, accounting for both the accuracy and completeness of the positive class predictions; it takes values between 0 and 1, with higher values indicating better performance. The performance of the trained PointNet++ model for point cloud segmentation was further evaluated using the overall accuracy (OA) and the mean intersection over union (mIoU). The overall accuracy (OA) reflects the proportion of all samples correctly predicted by the model and is an intuitive measure of overall classification accuracy. The mIoU measures model performance by computing the ratio between the intersection and the union of the predicted results and the true labels; it considers the prediction accuracy on each category and thus reflects model performance more comprehensively. The formulae are as follows:
$\mathrm{OA} = \frac{\sum_{c=1}^{C} TP_c}{\sum_{c=1}^{C}\left(TP_c + FP_c + FN_c\right)}$
$\mathrm{mIoU} = \frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c + FP_c + FN_c}$
where C denotes the number of categories, TPc denotes true positives, i.e., the number of points in the cth category that are correctly predicted to be in that category, FPc denotes false positives, i.e., the number of points not in the cth category that are incorrectly predicted by the model to be in that category, and FNc denotes false negatives, i.e., the number of points that truly belong to the cth category but are predicted by the model to be in some other category.
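A minimal sketch of computing OA and mIoU from per-point predictions via a confusion matrix is given below (a standard formulation, not code from the paper).

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, label: np.ndarray, num_classes: int):
    """Overall accuracy and mIoU from per-point predictions and ground-truth labels."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, t in zip(pred, label):
        conf[t, p] += 1                          # rows: true class, cols: predicted class
    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp                   # predicted as c but actually another class
    fn = conf.sum(axis=1) - tp                   # actually c but predicted as another class
    oa = tp.sum() / conf.sum()
    iou = tp / np.maximum(tp + fp + fn, 1.0)
    return oa, iou.mean()
```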
(4) Evaluation of Phenotypic Parameter Measurements
Phenotypic parameters were extracted from the point cloud obtained after model segmentation, and a linear regression analysis was then performed between them and the manually measured values. The root mean square error (RMSE), coefficient of determination (R2) and mean absolute percentage error (MAPE) were used as metrics to assess the quantitative accuracy of the extracted phenotypic traits. The RMSE estimates the difference between the model-segmented phenotypic traits and the manual measurements, R2 assesses the linear relationship between the predicted and manually measured phenotypic traits and the MAPE measures the mean relative error between the predicted and actual values. The formulae are shown below:
$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(x_i - \hat{x}_i\right)^{2}}{n}}$
$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(x_i - \hat{x}_i\right)^{2}}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}$
$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|x_i - \hat{x}_i\right|}{\left|x_i\right|}$
where n denotes the number of plants compared, $x_i$ denotes the manual measurements, $\hat{x}_i$ denotes the values of the phenotypic traits after model segmentation and $\bar{x}$ denotes the mean of the manual measurements.
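These three metrics can be computed directly from paired measurements, as in the short sketch below (illustrative code, not the authors' implementation).

```python
import numpy as np

def regression_metrics(measured: np.ndarray, predicted: np.ndarray):
    """RMSE, R^2 and MAPE between manual measurements and point cloud estimates."""
    err = measured - predicted
    rmse = np.sqrt(np.mean(err ** 2))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((measured - measured.mean()) ** 2)
    mape = np.mean(np.abs(err) / np.abs(measured))
    return rmse, r2, mape
```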

3. Results

3.1. 3D Gaussian Splatting Reconstruction

The dataset is fed into the reconstruction pipeline. The model reads the data, performs thirty thousand (30,000, hereafter 30k) iterations and outputs the reconstructed three-dimensional (3D) point cloud model after training is completed. A comparison between seven thousand (7000, hereafter 7k) iterations and 30k iterations of training in the current oilseed rape scene is shown in Table 2. As the number of training iterations increases, the loss gradually decreases, the PSNR increases and the density of the point cloud increases. The reconstructed point cloud model is shown in Figure 12.

3.2. Point Cloud Segmentation Results

PointNet is the earliest network model for processing point clouds, which uses a shared MLP to extract the features of each point directly from the point cloud, and uses symmetric functions and permutation invariant design to solve the disorder problem of the point cloud. PointNet++ adopts a hierarchical SA structure on top of PointNet to extract the local features of the cloud. The structure of the improved PointNet++ model used in this study is illustrated in Figure 4. Both PointNet and PointNet++ are classical and widely used models in crop classification, segmentation, and related applications. In this paper, CKG-PointNet++ is compared with PointNet, PointNet++ and PointNet++_msg.
The processed oilseed rape point cloud dataset is fed into CKG-PointNet++ and the original PointNet++ networks. The training results are shown in Table 3. The best mIoU of PointNet after training on the dataset is only 83.60%, with an overall accuracy (OA) of 90.73% and an average recall of 90.95%. PointNet++ improves on all metrics compared to PointNet, with the mIoU reaching 87.15% and the overall accuracy 92.80%; the average recall improved by 2.04% and the evaluation loss was reduced by 23.04%, indicating the advantage of PointNet++ over PointNet in complex point cloud scenarios. PointNet++_msg adds a multi-scale grouping strategy on top of PointNet++, and its mIoU reaches 91.70%, with an overall accuracy of 95.54% and an average recall of 95.64%; compared to PointNet++, the evaluation loss decreased by 152.7%, the overall accuracy increased by 2.74% and the average recall increased by 2.65%. CKG-PointNet++ shows a large improvement over all of the above models, with mIoU gains of 12.41%, 8.86% and 4.31% over PointNet, PointNet++ and PointNet++_msg, respectively; the overall accuracy improved by 6.97%, 4.9% and 2.16%, and the evaluation loss is reduced by 360%, 274% and 82.3%, respectively, which is a significant overall performance improvement.
As shown in Table 4, the proposed CKG-PointNet++ significantly outperforms the original PointNet++ in both overall accuracy (OA) and mean intersection over union (mIoU), achieving 97.70% and 96.01%, respectively—representing relative improvements of 4.90% and 8.86%. Notably, CKG-PointNet++ demonstrates superior convergence characteristics: by just the 6th epoch, it surpasses the best OA and mIoU values achieved by the original model after a full 32-epoch training cycle. This performance gain can be attributed to the introduction of the context-aware attention-based sampling module and the multi-scale feature fusion module, which jointly enhance the model’s ability to capture both local and global semantics, thus improving the discriminability of feature representations and accelerating convergence.
Although the improved model’s training time per epoch on the oilseed rape dataset increased from approximately 4 min to 28–30 min, and the peak GPU memory consumption during the full training process (under a batch size of 16 and 1024 points per sample) increased from 1.46 GB (PointNet++) to 5.43 GB (CKG-PointNet++), this computational overhead is justified and acceptable given the significant accuracy gains and accelerated convergence speed. Moreover, during inference, the memory usage remains as low as 1.66 GB, indicating that the model retains good adaptability to resource-constrained environments, making it suitable for deployment on edge devices or in lightweight inference scenarios.
Table 5 presents a module-wise comparison of computational costs. All metrics were collected in evaluation mode (batch_size 1, num_points 1024) using THOP for parameter counts, torch.cuda.max_memory_allocated () for peak memory and CUDA events for average inference latency. Compared to the original SA layer (0.0038 M parameters, 320.66 ms inference and 28.36 MB memory), the improved SA increases parameters to 0.0482 M, inference latency to 333.35 ms and peak memory to 54.52 MB. Similarly, the FP layer’s parameters grow from 0.2637 M to 1.147 M, with inference time rising from 0.84 ms to 2.26 ms and memory from 27.06 MB to 40.00 MB. Despite these moderate overheads, the overall model benefits from significant segmentation gains (see Table 3), demonstrating a favorable trade-off between accuracy and computational cost.
The point cloud segmentation results output from different model testing stages are shown in Figure 13, where (a), (b), (c) and (d) denote the point cloud segmentation results output from PointNet, PointNet++, PointNet++_msg and CKG-PointNet++ testing stages, respectively. The subsequent (a)-1 and (a)-2 denote the detailed presentation of the PointNet point cloud segmentation results, and the same for (b)-1, (b)-2, (c)-1, (c)-2, (d)-1 and (d)-2.
(a)-1 shows the point cloud of the top of oilseed rape, and the white box markers in the figure show the problematic areas of point cloud segmentation, where the main stems and leaves are not fully segmented and the top part is incorrectly segmented into the leaf part. (b)-1 improves the above-mentioned problems, but it does not achieve the desired results, and the top point cloud still has the problem of mis-segmentation. (c)-1 and (d)-1 effectively solve the top segmentation problem, and the segmentation of the blade and main stem is excellent, but (c)-1 has problems in the details, and the segmentation at the blade boundary is unreasonable, while (d)-1 achieves the ideal segmentation effect.
(a)-2 shows the point cloud of the bottom of rape, including the main stem and leaves of rape. As shown in the white box in the figure, (a)-2 incorrectly segments the main stem point cloud into the leaf point cloud, and segments the leaf point cloud into the main stem part, with a slightly worse overall segmentation effect. (b)-2 also suffers from the above problem and does not improve. (c)-2 correctly segments the main stem of oilseed rape in the upper part, but still does not correctly segment the main stem part in the lower part. (d)-2 correctly segments the main stem portion of the oilseed rape compared to the above and achieves the desired segmentation by being more precise and reasonable at the boundaries of the leaf segmentation compared to (c)-2.
To verify the performance of the CKG-PointNet++ model, PointNet++, PointNet++_msg and CKG-PointNet++ models were run on the oilseed rape dataset, respectively, and the batch size was set to 32, the learning rate was 1 × 10−4 and the number of batch point clouds was 1024. The comparison of the model’s metrics change with the number of training rounds is shown in Figure 14.
Figure 14a shows the comparison of training accuracy for the different models; CKG-PointNet++ maintains a small advantage over PointNet++_msg after 10 epochs and improves more markedly over PointNet++. Figure 14b shows the training loss comparison; after convergence, CKG-PointNet++ maintains a lower loss value than PointNet++_msg. Figure 14c shows the evaluation accuracy comparison: the improved CKG-PointNet++ maintains a substantial improvement over PointNet++ and PointNet++_msg in the middle and late stages of training, and its evaluation accuracy curve is smoother than those of the other two models, indicating that the improved model has stronger feature fusion and segmentation ability. Figure 14d shows the evaluation loss comparison; CKG-PointNet++ has essentially converged after 10 epochs, and the curve is smooth without fluctuation. Compared with PointNet++ and PointNet++_msg, it has stronger generalization ability and is more robust to noise and outliers in the data.
To evaluate the performance of the improved modules, ablation experiments were performed on the CKG-PointNet++ model and compared against the original PointNet++ (Base) in terms of the metrics, as shown in Table 4. The results show that adding GLUKAN (the combination of the CGLU and FastKAN modules) to the feature extraction layer improves the mIoU of the model by 7.64% and raises the overall accuracy by 4.3%, demonstrating that GLUKAN significantly improves the segmentation accuracy of the model. Adding the Moga and GConv modules further improves the model metrics, indicating that the modules work synergistically to enhance model performance. With the introduction of the self-attention mechanism in the feature propagation part, the mIoU and overall accuracy of the model improve again and the evaluation loss decreases. After integrating all the improvements, the model achieves the best performance in Table 6, with an 8.69% improvement in mIoU, a 4.9% improvement in OA and a 274% reduction in evaluation loss compared to Base.

3.3. Calculation of Phenotypic Parameters

3.3.1. Oilseed Rape Point Cloud Processing

The point cloud model output after reconstruction by 3D Gaussian splatting is shown in Figure 15; a single scene contains a large number of background points as well as invalid points. In this paper, we use the statistical outlier removal (SOR) algorithm to remove isolated and noisy points based on the distance between each point and its neighbors, and then use the radius outlier filter (ROL) based on a KD-tree to further remove outliers, so that the structure of the oilseed rape point cloud is preserved to a great extent while the number of useless points in the scene is reduced. Finally, the RANSAC algorithm is used to remove the ground point cloud and complete the separation of the rape point cloud from the ground.
The processed rapeseed point clouds were obtained through the aforementioned filtering and segmentation algorithms, followed by manual refinement to remove redundant points. This resulted in high-quality point clouds suitable for subsequent segmentation and phenotypic analysis. An example of the final point cloud model used for segmentation is shown in Figure 16.
As part of the phenotypic extraction process, a total of 40 rapeseed plant point cloud samples were used for the semantic segmentation task. Each point cloud was reconstructed from multi-view images using the 3D Gaussian splatting method, with approximately 80,000 to 100,000 points per sample.

3.3.2. Calculation of Oilseed Rape Phenotypes

Seedling oilseed rape phenotypic parameters mainly include leaf length, leaf width, leaf area and leaf inclination. In this paper, a number of oilseed rape plants were selected and measured both manually and from the segmented point clouds, and the phenotypic parameters obtained by the two approaches were compared. The manual measurements were carried out with a straightedge, a triangle ruler and other common tools.
Due to limitations of the measurement equipment, and to ensure the reliability of the phenotypic data, each leaf of each rapeseed sample was measured three times independently under identical experimental conditions using a digital caliper with an accuracy of ±0.1 cm. For each leaf, we recorded the three measurements of leaf length, leaf width and leaf angle, and calculated the absolute deviation between each measurement and the average value to quantify the measurement error. For the 25 samples measured, the average leaf length was 5.75 cm, with an average absolute deviation of 0.30 ± 0.10 cm (i.e., 3 ± 1 mm); the average leaf width was 3.99 cm, with an average absolute deviation of 0.20 ± 0.10 cm (i.e., 2 ± 1 mm); and the average absolute deviation of the leaf angle was 1.5° ± 0.6°. These results indicate that, for the same leaf, the differences between repeated measurements are only 2–4 mm (leaf length), 1–3 mm (leaf width) or less than 2.1° (leaf angle), confirming that the errors of the manual measurements are very small compared with the natural variation between plants, thereby ensuring the reliability of the data.
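For reference, the average absolute deviation reported above can be computed as in the following sketch; the repeated-measurement values are illustrative placeholders, not the recorded data.

```python
import numpy as np

# Three repeated manual measurements per leaf (values are illustrative).
# Rows: leaves; columns: the three repeats of, e.g., leaf length in cm.
repeats = np.array([
    [5.7, 5.9, 5.6],
    [6.1, 6.4, 6.2],
    [4.9, 5.1, 5.0],
])

leaf_means = repeats.mean(axis=1, keepdims=True)
abs_dev = np.abs(repeats - leaf_means)   # deviation of each repeat from its leaf mean

print("overall mean value:", repeats.mean())
print("average absolute deviation:", abs_dev.mean())
print("std of the per-leaf deviations:", abs_dev.mean(axis=1).std())
```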
In this study, segmented point cloud data were imported into CloudCompare to extract key phenotypic parameters including leaf length, leaf width, leaf area and leaf inclination angle. First, plant stems and leaves were separated based on semantic segmentation results to allow independent measurement of individual leaves. As shown in Figure 17, two extremal points on each leaf were manually selected, and the 3D Euclidean distance between them was computed to estimate leaf length and width. Subsequently, each leaf’s surface normal was estimated and a Dip scalar field generated to compute its average inclination angle. Owing to equipment and experimental constraints, an empirical leaf area estimation formula was applied to calibrate point cloud–based measurements against manual annotations, thereby minimizing bias. The procedure was repeated on multiple samples, and average values were reported for phenotypic statistical analysis.
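The geometric operations behind this CloudCompare workflow can also be reproduced programmatically; the sketch below (using Open3D and NumPy) shows a possible implementation of the two steps, where the picked extremal points, the normal-estimation radius and the assumption that Z is the vertical axis are illustrative choices rather than the exact settings used here.

```python
import numpy as np
import open3d as o3d

def leaf_length(p1, p2):
    """3D Euclidean distance between two manually picked extremal points."""
    return float(np.linalg.norm(np.asarray(p1) - np.asarray(p2)))

def leaf_inclination(leaf_pcd):
    """Average dip angle of a segmented leaf: angle between each estimated
    surface normal and the vertical (Z) axis, mirroring CloudCompare's Dip
    scalar field. Assumes Z is the up direction."""
    leaf_pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.02, max_nn=30))
    normals = np.asarray(leaf_pcd.normals)
    cos_dip = np.abs(normals[:, 2]) / np.linalg.norm(normals, axis=1)
    dip_deg = np.degrees(np.arccos(np.clip(cos_dip, -1.0, 1.0)))
    return float(dip_deg.mean())

# Usage with illustrative picked points (in the point cloud's coordinate units):
print(leaf_length([0.0, 0.0, 0.0], [0.051, 0.012, 0.008]))
```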
Twenty-five groups of experimental point clouds were selected for phenotypic measurement after point cloud segmentation, together with the corresponding manual measurements. As shown in Figure 18, the length, width and area of oilseed rape leaves obtained after point cloud reconstruction and segmentation were compared point by point with the manual measurement data, and the root mean square error (RMSE), coefficient of determination (R2) and mean absolute percentage error (MAPE) were calculated for each parameter. The point cloud measurements of leaf length in Figure 18a were highly consistent with the manual measurements (R2 = 0.9843, RMSE = 0.1621 cm and MAPE = 2.58%), indicating that the leaf length extracted from the point cloud has a low error and accurately reflects the true leaf length. The leaf width in Figure 18b also agreed well with the manual measurements (R2 = 0.9632, RMSE = 0.1546 cm and MAPE = 3.17%), showing a good linear correlation between the 3D point cloud measurements and the manual ones. Because of the limited equipment conditions, the leaf area in Figure 18c was calculated using an equation-based approximation: following the leaf area estimation experiments for oilseed rape by Dalmago et al. [43], the leaf area equation proposed by Cargnelutti Filho et al. [44] was adopted, and both the point cloud measurements and the manually measured values were converted with it, i.e.,
S = 0.7425 × (L × W)^0.9167
where S is the estimated leaf area, L is the leaf length and W is the leaf width. After calculation (R2 = 0.9806, RMSE = 0.6892 cm2, MAPE = 4.65%), the leaf area extracted from the point cloud parameters accurately reflects the real leaf area of oilseed rape and shows good robustness. The leaf inclination in Figure 18d was compared with the manually measured leaf inclination (R2 = 0.8890, RMSE = 2.1144°, MAPE = 3.74%), indicating a high degree of consistency between the point-cloud-derived and manually measured values and meeting the accuracy requirements of high-throughput phenotyping.
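A short sketch of how the empirical area formula and the agreement statistics (R2, RMSE, MAPE) can be computed is given below; the sample arrays are placeholders and do not reproduce the measured data in Figure 18.

```python
import numpy as np

def leaf_area(length_cm, width_cm):
    """Empirical leaf area model S = 0.7425 * (L * W)^0.9167 (cm^2)."""
    return 0.7425 * (length_cm * width_cm) ** 0.9167

def agreement_metrics(pred, ref):
    """R^2, RMSE and MAPE between point cloud estimates and manual references."""
    pred, ref = np.asarray(pred, float), np.asarray(ref, float)
    ss_res = np.sum((ref - pred) ** 2)
    ss_tot = np.sum((ref - ref.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((ref - pred) ** 2))
    mape = np.mean(np.abs((ref - pred) / ref)) * 100.0
    return r2, rmse, mape

# Illustrative values only (cm); the real data come from the 25 measured samples.
cloud_len = [5.6, 6.2, 4.8]
manual_len = [5.7, 6.1, 4.9]
print(agreement_metrics(cloud_len, manual_len))
print(leaf_area(5.75, 3.99))  # area estimate for an average-sized leaf
```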

4. Discussion

This study adopts 3D Gaussian splatting-based reconstruction. The traditional SfM-MVS pipeline generates high-precision point clouds through feature matching and dense multi-view reconstruction, but it imposes stringent requirements on image quality and lighting while consuming excessive computational resources. For example, on a workstation equipped with an Intel Core i5-13490F processor, an NVIDIA GeForce RTX 4070 GPU (16 GB VRAM), and 32 GB of RAM (Windows 10, PyTorch 1.12.1, CUDA 11.6), SfM-MVS reconstruction of a single rapeseed plant takes over 60 min. LiDAR technology can provide even higher-precision 3D data, but it is constrained by high equipment cost and a large form factor. In contrast, the 3D Gaussian splatting method only requires image sets captured by cell phones or handheld cameras. It employs a differentiable rendering pipeline to optimize the scene representation and achieve continuous point cloud reconstruction, significantly reducing the memory footprint and computational complexity. In our experiments on the same workstation, each rapeseed plant (80–120 input images) can be reconstructed in approximately 20–25 min, with a peak GPU memory usage of around 6–8 GB. Furthermore, as a neural rendering-based approach, 3D GS achieves superior reconstruction quality in terms of rendering effects and detail preservation, particularly in low-texture regions, compared with traditional methods. By delivering comparable or better reconstruction accuracy with much lower runtime and hardware demands, 3D Gaussian splatting outperforms traditional techniques in quality, efficiency and hardware adaptability, providing a feasible route for developing automated 3D reconstruction and phenotype-acquisition platforms.
Traditional point-cloud-based geometric segmentation methods (such as RANSAC fitting or K-means clustering) rely heavily on heuristic geometric rules and perform poorly when plant structures are complex or data quality is unstable. In our experiments, RANSAC and K-means applied to rapeseed for binary segmentation achieved mean IoU values of 0.33 and 0.26, with overall accuracies of 63.94% and 43.26%, respectively—insufficient for the precision and reliability required in plant phenotypic extraction.
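Such a clustering baseline can be reproduced in a few lines, for example with scikit-learn K-means on raw XYZ coordinates; the following is a generic sketch under that assumption, not the exact configuration evaluated above.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_binary_segmentation(points, seed=0):
    """Cluster a plant point cloud into two geometric groups (e.g., a rough
    stem/leaf split) using only the XYZ coordinates."""
    km = KMeans(n_clusters=2, n_init=10, random_state=seed)
    return km.fit_predict(points)

def binary_iou(pred, gt):
    """Mean IoU over the two classes; cluster ids are matched to labels by
    trying both assignments and keeping the better one."""
    best = 0.0
    for flip in (pred, 1 - pred):
        ious = []
        for c in (0, 1):
            inter = np.sum((flip == c) & (gt == c))
            union = np.sum((flip == c) | (gt == c))
            ious.append(inter / union if union else 0.0)
        best = max(best, float(np.mean(ious)))
    return best

points = np.random.rand(5000, 3)               # placeholder cloud
gt = (points[:, 2] > 0.5).astype(int)          # placeholder ground-truth labels
pred = kmeans_binary_segmentation(points)
print(binary_iou(pred, gt))
```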
In contrast, when extended to a multi-class semantic segmentation task, the proposed CKG-PointNet++ model—which incorporates feature fusion enhancement in the set abstraction (SA) stage and introduces a self-attention mechanism in the feature propagation (FP) stage—achieved a mean IoU of 96.01% and an overall accuracy of 97.70%, as shown in Table 3, representing a significant improvement over RANSAC and K-means. The reasons for this performance gain are twofold: first, the improved SA fusion allows the network to capture fine local structures (e.g., petioles) more accurately; second, the self-attention in FP suppresses noise and sharpens boundaries during up-sampling. Moreover, although CKG-PointNet++ incurs a higher computational cost during training, its inference-phase hardware requirements are comparable to those of the original PointNet++, fully satisfying the practical needs of a high-throughput phenotyping platform. Compared with traditional point cloud segmentation methods that involve high memory consumption, the proposed model significantly reduces hardware requirements and demonstrates stronger robustness to variations in point cloud density and noise. This makes it more suitable for automated segmentation and phenotypic trait extraction workflows. As described in Section 2.2.1, all experiments were conducted on a workstation equipped with an Intel Core i5-13490F CPU, an NVIDIA GeForce RTX 4070 GPU (16 GB VRAM), and 32 GB of RAM, running Windows 10 with PyTorch 1.12.0 and CUDA 11.3. In our experiments, point clouds were fed into the network with a batch size of 32 and 1024 points per input. Under this setup, CKG-PointNet++ requires approximately 515 ms for a single inference, which is only around 100 ms more than the original PointNet++. As shown in Table 4, the GPU memory consumption increases only to approximately 1.66 GB, roughly twice the 0.82 GB of the baseline model.
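The latency and memory figures quoted above can be probed with a simple PyTorch routine such as the sketch below; the input layout and warm-up scheme are common conventions for PointNet++-style models and are assumptions here, not the exact measurement protocol of this study.

```python
import time
import torch

def profile_inference(model, batch=32, n_points=1024, channels=3, device="cuda"):
    """Rough latency / peak-memory probe for a point cloud segmentation model.
    The (B, C, N) input layout follows common PointNet++ implementations and
    may need adjusting for a specific codebase."""
    model = model.to(device).eval()
    x = torch.randn(batch, channels, n_points, device=device)
    torch.cuda.reset_peak_memory_stats(device)
    with torch.no_grad():
        for _ in range(5):                     # warm-up iterations
            model(x)
        torch.cuda.synchronize(device)
        t0 = time.time()
        model(x)
        torch.cuda.synchronize(device)
    latency_ms = (time.time() - t0) * 1000.0
    peak_gb = torch.cuda.max_memory_allocated(device) / 1024 ** 3
    return latency_ms, peak_gb
```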
In addition, the point cloud measurements obtained in this study showed strong linear consistency with manual measurements (e.g., leaf length R2 = 0.9843, leaf width R2 = 0.9632) and a low mean absolute error (less than 2 mm), fully demonstrating the accuracy and practical utility of the proposed reconstruction and segmentation process for seedling phenotyping in oilseed rape.
Furthermore, in terms of measurement efficiency, traditional manual phenotyping of rapeseed seedlings—using rulers, calipers or other manual annotation tools—takes approximately 5–10 min per plant (5–8 leaves), depending on operating experience and leaf complexity. In contrast, our pipeline first reconstructs and segments a dense 3D point cloud via Gaussian splatting and CKG-PointNet++, and then performs semi-automatic trait computation in CloudCompare, reducing the per-plant measurement time to around 2–3 min. This time reduction not only saves labor but also improves consistency and reproducibility, highlighting the practical value of our method for high-throughput phenotyping applications.

5. Conclusions

In this study, we propose a method based on 3D Gaussian splatting and CKG-PointNet++ for 3D reconstruction, segmentation and phenotypic parameter measurement of oilseed rape seedlings. The whole process includes multi-view image acquisition, 3D reconstruction, point cloud preprocessing, point cloud segmentation and phenotypic parameter extraction. The 3D Gaussian splatting algorithm is able to restore the geometry of oilseed rape plants at the detail level. Statistical outlier filtering, radius filtering and the RANSAC algorithm are used to remove noise points and separate the target point cloud, handling noisy points efficiently while preserving the structure of the target point cloud to a large extent. The improved CKG-PointNet++ performs well in point cloud segmentation, with an overall accuracy of 97.70% and an mIoU of 96.01%. Leaf length, leaf width, leaf area and leaf inclination were extracted from the segmented oilseed rape seedling point clouds, with coefficients of determination (R2) of 0.9843, 0.9632, 0.9806 and 0.8890, root-mean-square errors (RMSE) of 0.1621 cm, 0.1546 cm, 0.6892 cm2 and 2.1144°, and mean absolute percentage errors (MAPE) of 2.58%, 3.17%, 4.65% and 3.74%, respectively. The results show that the proposed method can obtain the key phenotypic parameters of oilseed rape at the seedling stage quickly, stably and with high precision.
Future research will focus on the integration of 3D reconstruction and point cloud segmentation techniques to enhance the ability of 3D plant characterization and automated extraction of phenotypic parameters, in order to support the screening of oilseed rape varieties, phenotype-genotype analysis and subsequent applications in related fields.

Author Contributions

Conceptualization, J.P. and J.S.; methodology, J.P. and J.S.; software, J.P.; validation, J.P.; formal analysis, J.P. and T.H.; investigation, J.P.; resources, J.S., T.H. and Y.H.; data curation, J.P. and J.S.; writing—original draft preparation, J.P.; writing—review and editing, J.P., S.H., S.Y., T.H. and Y.H.; visualization, J.P. and T.H.; supervision, T.H. and Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Anhui Province University Collaborative Innovation Research Program (GXXT-2023-068) and the Research on Autonomous Navigation and Collaborative Control of Agricultural Machinery Equipment (ALW2021YF03).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to ongoing research projects.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Fang, Z.; Liu, P. The source of capacity improvement of China’s three major oilseed crops. Chin. J. Oilseed Crops 2025, 47, 243–259. [Google Scholar]
  2. Fu, D.-H.; Jiang, L.-Y.; Mason, A.S.; Xiao, M.-L.; Zhu, L.-R.; Li, L.-Z.; Zhou, Q.-H.; Shen, C.-J.; Huang, C.-H. Research progress. J. Integr. Agric. 2016, 15, 1673–1684. [Google Scholar] [CrossRef]
  3. Wang, H. Development strategy of oilseed rape industry oriented to new demand. Chin. J. Oilseed Crops 2018, 40, 613. [Google Scholar]
  4. Hu, Q.; Hua, W.; Yin, Y.; Zhang, X.; Liu, L.; Shi, J.; Zhao, Y.; Qin, L.; Chen, C.; Wang, H. Rapeseed research and production in China. Crop J. 2017, 5, 127–135. [Google Scholar] [CrossRef]
  5. Li, H.; Feng, H.; Guo, C.; Yang, S.; Huang, W.; Xiong, X.; Liu, J.; Chen, G.; Liu, Q.; Xiong, L.; et al. High-throughput phenotyping accelerates the dissection of the dynamic genetic architecture of plant growth and yield improvement in rapeseed. Plant Biotechnol. J. 2020, 18, 2345–2353. [Google Scholar] [CrossRef]
  6. Zhang, H.; Wang, L.; Jin, X.; Bian, L.; Ge, Y. High-throughput phenotyping of plant leaf morphological, physiological, and biochemical traits on multiple scales using optical sensing. Crop J. 2023, 11, 1303–1318. [Google Scholar] [CrossRef]
  7. Li, Y.; Wen, W.; Miao, T.; Wu, S.; Yu, Z.; Wang, X.; Guo, X.; Zhao, C. Automatic organ-level point cloud segmentation of maize shoots by integrating high-throughput data acquisition and deep learning. Comput. Electron. Agric. 2022, 193, 106702. [Google Scholar] [CrossRef]
  8. Guo, W.; Zhou, C.; Han, W. Rapid and non-destructive measurement system of plant leaf area based on Android mobile phone. J. Agric. Mach. 2014, 45, 275–280. [Google Scholar]
  9. Xiao, Y.; Liu, S.; Hou, C.; Liu, Q.; Li, F.; Zhang, W. Organ Segmentation and Phenotypic Analysis of Soybean Plants Based on Three-dimensional Point Clouds. J. Agric. Sci. Technol. 2023, 25, 115–125. [Google Scholar]
  10. Yang, X.; Hu, S.; Wang, Y.; Yang, W.; Zhai, R. Cotton Phenotypic Trait Extraction Using Multi-Temporal Laser Point Clouds. Smart Agric. 2021, 3, 51–62. [Google Scholar]
  11. Liang, X.; Yu, W.; Qin, L.; Wang, J.; Jia, P.; Liu, Q.; Lei, X.; Yang, M. Stem and Leaf Segmentation and Phenotypic Parameter Extraction of Tomato Seedlings Based on 3D Point. Agronomy 2025, 15, 120. [Google Scholar] [CrossRef]
  12. Yang, Z. Research and Implementation of a Phenotypic Measurement System for Wolfberry Plants Based on Three-Dimensional Reconstruction. Master’s Thesis, Ningxia University, Ningxia, China, 2022. [Google Scholar]
  13. Xu, Q.; Cao, L.; Xue, L.; Chen, B.; An, F.; Yun, T. Extraction of Leaf Biophysical Attributes Based on a Computer Graphic-based Algorithm Using Terrestrial Laser Scanning Data. Remote Sens. 2019, 11, 15. [Google Scholar] [CrossRef]
  14. Thapa, S.; Zhu, F.; Walia, H.; Yu, H.; Ge, Y. A Novel LiDAR-Based Instrument for High-Throughput, 3D Measurement of Morphological Traits in Maize and Sorghum. Sensors 2018, 18, 1187. [Google Scholar] [CrossRef] [PubMed]
  15. Yau, W.K.; Ng, O.-E.; Lee, S.W. Portable device for contactless, non-destructive and in situ outdoor individual leaf area measurement. Comput. Electron. Agric. 2021, 187, 106278. [Google Scholar] [CrossRef]
  16. Li, Y.; Wen, W.; Fan, J.; Gou, W.; Gu, S.; Lu, X.; Yu, Z.; Wang, X.; Guo, X. Multi-source data fusion improves time-series phenotype accuracy in maize under a field high-throughput phenotyping platform. Plant Phenomics 2023, 5, 0043. [Google Scholar] [CrossRef]
  17. Xu, X.; Li, J.; Zhou, J.; Feng, P.; Yu, H.; Ma, Y. Three-Dimensional Reconstruction, Phenotypic Traits Extraction, and Yield Estimation of Shiitake Mushrooms Based on Structure from Motion and Multi-View Stereo. Agriculture 2025, 15, 298. [Google Scholar] [CrossRef]
  18. Kerbl, B.; Kopanas, G.; Leimkühler, T.; Drettakis, G. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Trans. Graph. 2023, 42, 1–14. [Google Scholar] [CrossRef]
  19. Chen, Y.; Xiong, Y.; Zhang, B.; Zhou, J.; Zhang, Q. 3D point cloud semantic segmentation toward large-scale unstructured agricultural scene classification. Comput. Electron. Agric. 2021, 190, 106445. [Google Scholar] [CrossRef]
  20. Cui, D.; Liu, P.; Liu, Y.; Zhao, Z.; Feng, J. Automated Phenotypic Analysis of Mature Soybean Using Multi-View Stereo 3D Reconstruction and Point Cloud Segmentation. Agriculture 2025, 15, 175. [Google Scholar] [CrossRef]
  21. Guo, R.; Xie, J.; Zhu, J.; Cheng, R.; Zhang, Y.; Zhang, X.; Gong, X.; Zhang, R.; Wang, H.; Meng, F. Improved 3D point cloud segmentation for accurate phenotypic analysis of cabbage plants using deep learning and clustering algorithms. Comput. Electron. Agric. 2023, 211, 108014. [Google Scholar] [CrossRef]
  22. Wang, P.S.; Liu, Y.; Guo, Y.X.; Sun, C.Y.; Tong, X. O-cnn: Octree-based convolutional neural networks for 3d shape analysis. ACM Trans. Graph. TOG 2017, 36, 72. [Google Scholar] [CrossRef]
  23. Graham, B.; Engelcke, M.; Van Der Maaten, L. 3d semantic segmentation with submanifold sparse convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9224–9232. [Google Scholar]
  24. Meng, H.Y.; Gao, L.; Lai, Y.K.; Manocha, D. Vv-net: Voxel vae net with group convolutions for point cloud segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8500–8508. [Google Scholar]
  25. Wang, Z.Y.; Sun, H.Y.; Sun, X.P. Survey on Large Scale 3D Point Cloud Processing Using Deep Learning. Comput. Syst. Appl. 2023, 32, 1–12. [Google Scholar] [CrossRef]
  26. Zhai, Y. Research on 3D Point Cloud Stitching Method Based on Multi-View Camera. Master’s Thesis, Tianjin University of Technology, Tianjin, China, 2022. [Google Scholar]
  27. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660. [Google Scholar]
  28. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep hierarchical feature learning on point sets in a metric space. arXiv 2017, arXiv:1706.02413. [Google Scholar]
  29. Zhang, C.; Wang, Z.; Wu, H.; Chen, D. Lidar Point Cloud Segmentation Model Based on Improved PointNet++. Laser Optoelectron. Prog. 2024, 61, 0411001. [Google Scholar]
  30. Guo, M.H.; Cai, J.X.; Liu, Z.N.; Mu, T.J.; Martin, R.R.; Hu, S.M. Pct: Point cloud transformer. Comput. Vis. Media 2021, 7, 187–199. [Google Scholar] [CrossRef]
  31. Xu, S.; Lu, K.; Pan, L.; Liu, T.; Zhou, Y.; Wang, B. 3D Reconstruction of Rape Branch and Pod Recognition Based on RGB-D Camera. Trans. Chin. Soc. Agric. Mach. 2019, 50, 21–27. [Google Scholar]
  32. Zhang, L.; Shi, S.; Zain, M.; Sun, B.; Han, D.; Sun, C. Evaluation of Rapeseed Leave Segmentation Accuracy Using Binocular Stereo Vision 3D Point Clouds. Agronomy 2025, 15, 245. [Google Scholar] [CrossRef]
  33. Rusu, R.B. Semantic 3D object maps for everyday manipulation in human living environments. KI-Künstliche Intell. 2010, 24, 345–348. [Google Scholar] [CrossRef]
  34. Narváez, E.A.L.; Narváez, N.E.L. Point cloud denoising using robust principal component analysis. In Proceedings of the International Conference on Computer Graphics Theory and Applications, Setúbal, Portugal, 25–28 February 2006; SCITEPRESS: Setúbal, Portugal, 2006; Volume 2, pp. 51–58. [Google Scholar]
  35. Derpanis, K.G. Overview of the RANSAC Algorithm. Image 2010, 4, 2–3. [Google Scholar]
  36. Schönberger, J.L.; Frahm, J.-M. Structure-from-Motion Revisited. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 4104–4113. [Google Scholar]
  37. Shi, D. Transnext: Robust foveal visual perception for vision transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 17773–17783. [Google Scholar]
  38. Li, Z. Kolmogorov-arnold networks are radial basis function networks. arXiv 2024, arXiv:2405.06721. [Google Scholar]
  39. Song, Y.; Zhou, Y.; Qian, H.; Du, X. Rethinking performance gains in image dehazing networks. arXiv 2022, arXiv:2209.11448. [Google Scholar]
  40. Li, S.; Wang, Z.; Liu, Z.; Tan, C.; Lin, H.; Wu, D.; Chen, Z.; Zheng, J.; Li, S.Z. Moganet: Multi-order gated aggregation network. arXiv 2022, arXiv:2211.03295. [Google Scholar]
  41. Wang, P.; Wang, X.; Wang, F.; Lin, M.; Chang, S.; Li, H.; Jin, R. Kvt: K-nn attention for boosting vision transformers. In European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2022; pp. 285–302. [Google Scholar]
  42. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
  43. Dalmago, G.A.; Bianchi, C.A.M.; Kovaleski, S.; Fochesatto, E. Evaluation of mathematical equations for estimating leaf area in rapeseed. Rev. Cienc. Agron. 2019, 50, 420–430. [Google Scholar] [CrossRef]
  44. Cargnelutti Filho, A.; Toebe, M.; Alves, B.M.; Burin, C.; Kleinpaul, J.A. Estimação da área foliar de canola por dimensões foliares. Bragantia 2015, 74, 139–148. [Google Scholar] [CrossRef]
Figure 1. Example of a multi-view image of oilseed rape taken in the field.
Figure 2. Example of colmap calculating camera pose. Small red triangular frusta indicate estimated camera centers and viewing directions. The three coloured axes represent the global right-handed coordinate system used by COLMAP (red indicates the +X axis, green indicates the +Y axis, and blue indicates the +Z axis).
Figure 3. Example of a point cloud of treated oilseed rape.
Figure 4. CKG-PointNet++ structure diagram.
Figure 5. Improvement of the structure of the SA section.
Figure 6. FastKAN module structure diagram. The black dots indicate multiple identical parallel structures that are omitted from the figure for clarity.
Figure 7. CGLU module structure diagram.
Figure 8. Gconv module structure diagram.
Figure 9. MogaBlock module structure diagram.
Figure 10. MogaBlock module detail structure diagram.
Figure 11. k-NN Attention detail structure diagram.
Figure 12. Gaussian Splatting point cloud reconstruction results: (a) point cloud model reconstructed after 7k training iterations; (b) point cloud model reconstructed after 30k training iterations.
Figure 13. Oilseed rape point cloud segmentation results under different models. (a) Test results after PointNet model training. (b) Test results after PointNet++ model training. (c) Test results after PointNet_msg model training. (d) Test results after CKG-PointNet++ model training. (a)-1 and (a)-2 are the segmentation results of PointNet on the same point clouds, and (b)-1/(b)-2, (c)-1/(c)-2 and (d)-1/(d)-2 show the corresponding results for the other models. Different colors are assigned to the leaves and stems according to the segmentation results to aid visual distinction.
Figure 14. Comparison of changes in different model metrics with the number of training rounds. (a) Comparison of training accuracy, (b) training loss comparison, (c) evaluation accuracy comparison and (d) Evaluation loss comparison.
Figure 15. Point cloud processing. (a) Point cloud reconstructed by 3D Gaussian splatting. (b) Point cloud after SOR filtering. (c) Point cloud after ROL filtering. (d) Point cloud before RANSAC segmentation. (e) Inlier points of the RANSAC plane fit. (f) Point cloud after RANSAC segmentation.
Figure 16. Processed oilseed rape point cloud.
Figure 17. Schematic diagram of leaf length and width measurement in CloudCompare. Δx, Δy, and Δz represent the differences between two points along the X, Y, and Z axes, respectively; Δxy, Δxz, and Δyz represent the projected distances between the two points on the XY, XZ, and YZ planes, respectively.
Figure 18. Comparison of point cloud measurements with manually measured phenotypic parameters of oilseed rape. (a) Comparison of leaf length measurements. (b) Comparison of leaf width measurements. (c) Comparison of leaf area measurements. (d) Comparison of leaf inclination measurements.
Table 1. Comparison of different 3D point cloud processing methods in agricultural scenarios.
Methods | Advantages | Disadvantages | Applicability in Agriculture
Voxel-based methods | Well-structured data and more efficient processing | Presence of voxel quantization errors and loss of detail due to resolution limitations | For modeling the overall morphology of large, structured crops, such as fruit trees
2D projection-based methods | Can use 2D convolutional networks with low computational cost | Significant loss of information in 3D space makes it difficult to capture complex crop structures | Suitable for crop identification in relatively flat areas, such as fields
Point-based learning methods | Directly process raw point clouds, preserve geometric structure and have a high accuracy | Sensitive to point count and high computational cost | Suitable for fine-grained segmentation of irregular crops or weeds
Table 2. Reconstruction Results of Gaussian Splatting in Oilseed Rape Scenes.
Model | ITER | Point Number | Loss | L1 | PSNR
Gaussian Splatting | 7000 | 2,638,272 | 0.134 | 0.0685 | 20.35
Gaussian Splatting | 30,000 | 3,012,945 | 0.082 | 0.0543 | 22.06
Table 3. Results of each model trained on an independent dataset.
Model | AR (%) | Eval Loss | F1 Score | OA (%) | mIoU (%)
PointNet | 90.95 | 0.235 | 0.911 | 90.73 | 83.60
PointNet++ | 92.99 | 0.191 | 0.931 | 92.80 | 87.15
PointNet_msg | 95.64 | 0.093 | 0.957 | 95.54 | 91.70
CKG-PointNet++ | 98.19 | 0.051 | 0.971 | 97.70 | 96.01
Table 4. Comparison of training efficiency and computational performance between the original and improved model.
Model | OA (%) | mIoU (%) | Epoch Training Time | Inference Memory (GB) | Peak Training Memory (GB) | Convergence Epochs
PointNet++ | 92.80 | 87.15 | ~4 min | 0.82 | 1.46 | >20
CKG-PointNet++ | 97.70 | 96.01 | 28–30 min | 1.66 | 5.43 | >5
Table 5. Module-wise performance comparison.
Module | Params (M) | Inference Time (ms/sample) | Peak CUDA Memory (MB)
SA (Original) | 0.0038 | 320.66 | 28.36
SA (Improved) | 0.0482 | 333.35 | 54.52
FP (Original) | 0.2637 | 0.84 | 27.06
FP (Improved) | 1.1470 | 2.26 | 40.00
Table 6. CKG-PointNet++ ablation experiments on independent datasets.
Model | AR (%) | Eval Loss | F1 Score | OA (%) | mIoU (%)
Base | 92.99 | 0.191 | 0.900 | 92.80 | 87.15
Base+GLUKAN | 97.55 | 0.064 | 0.973 | 97.10 | 94.79
Base+GLUKAN+Moga | 97.77 | 0.062 | 0.975 | 97.26 | 95.13
Base+GLUKAN+Moga+Gconv | 97.74 | 0.057 | 0.976 | 97.35 | 95.26
Base+GLUKAN+Moga+Gconv+Self Attention | 98.19 | 0.051 | 0.971 | 97.70 | 96.01
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
