Adaptive Deletion of Gaussian Ellipsoids in 3D Gaussian Splatting

Zhang, Fei; Wang, Yinghui; Yi, Bo; Ma, Jiaxin

doi:10.3390/math14071197

School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China

Authors to whom correspondence should be addressed.

Mathematics 2026, 14(7), 1197; https://doi.org/10.3390/math14071197

Submission received: 29 January 2026 / Revised: 19 March 2026 / Accepted: 28 March 2026 / Published: 3 April 2026

(This article belongs to the Topic Intelligent Image Processing Technology)


Abstract

As a leading method for Novel View Synthesis (NVS), 3D Gaussian Splatting (3DGS) faces limitations. Fixed thresholds governing Gaussian scale and opacity lead to over-reconstruction or under-reconstruction, while the linear penalty used for handling outliers during optimization tends to introduce artifacts. Therefore, we propose Adaptive 3DGS featuring a dynamic deletion mechanism. Specifically, our method calculates coverage for each Gaussian based on its scale during removal. Gaussians with high coverage face stricter scale thresholds to reduce over-reconstruction, while those with lower coverage receive lenient thresholds to preserve details. Simultaneously, transparency-based contribution assessment is applied. Gaussians with low contribution meet stricter transparency thresholds to combat over-reconstruction, while high-contribution ones get lenient thresholds to mitigate under-reconstruction. During optimization, introducing Huber loss promotes quadratic growth for small errors, reducing smoothing to alleviate artifacts and better preserve details. Evaluation on standard datasets shows our method improves peak signal-to-noise ratio (PSNR) by 0.3 dB over 3DGS and 0.5 dB over MS-3DGS at 4× resolution, and it achieves a 0.1 dB gain over Mip-Splatting, confirming its effectiveness and robustness.

Keywords:

Novel View Synthesis; 3D Gaussian Splatting; Gaussian ellipse; render

MSC:

68T07

1. Introduction

Novel View Synthesis (NVS) is a core research direction in computer graphics and computer vision with diverse applications, including virtual reality [1], UAV navigation [2], and medical imaging [3]. Advances in NVS have significantly improved image generation quality and rendering efficiency [4,5]. Notably, Neural Radiance Fields (NeRF) proposed by Mildenhall et al. [6] accelerated NVS development by utilizing a Multilayer Perceptron (MLP) to represent scene geometry and appearance, generating realistic views via volume rendering. However, NeRF faces challenges in computational efficiency, particularly for real-time rendering. Recently, 3D Gaussian Splatting (3DGS) [7] emerged as a solution, combining feature grid representations [8,9,10,11] with NeRF principles. 3DGS represents scenes using Gaussian distributions and employs explicit Gaussian point cloud rendering. This approach enables high-resolution, real-time performance and integration into standard GPU rasterization pipelines, making it a dominant practical NVS method.

Specifically, 3DGS represents scenes as sets of 3D Gaussians, rendered via splatting-based rasterization. Gaussian properties—position, size, orientation, transparency, and color—are optimized through multi-view photometric loss for precise reconstruction [7]. To enhance rendering, 3DGS applies 2D expansion in screen space for low-pass filtering, reducing aliasing and improving smoothness. Despite its efficiency and quality, 3DGS suffers from over-reconstruction, under-reconstruction, and floating artifacts. MS-3DGS [12] introduced multi-scale Gaussian representations to select appropriate Gaussians at different scales, mitigating over-reconstruction to some extent. Similarly, 2DGS [13] compressed 3D Gaussians into oriented 2D planar Gaussians, combining depth distortion and normal consistency principles to better capture complex details and alleviate under-reconstruction. Furthermore, filtering mechanisms in existing works [4,14] have partially suppressed artifacts. Critically, these approaches often address only a single issue: MS-3DGS focuses on over-reconstruction at low resolutions but neglects under-reconstruction, while 2DGS targets surface structure under-reconstruction but overlooks scene over-reconstruction.

As can be seen from the above analysis, over-reconstruction and under-reconstruction are essentially determined by the number of rendered Gaussian distributions. An overly dense Gaussian distribution triggers an overlapping effect, leading to over-reconstruction, while an overly sparse one causes information loss and results in under-reconstruction. Methods like 3DGS [7] and extensions [4,12] control Gaussian count via an elimination mechanism using fixed thresholds—one for overly large Gaussians (exceeding 10% of scene range) and another for low-transparency Gaussians (below 0.005). However, this universal fixed threshold leads to erroneous elimination decisions. Specifically, large Gaussian distributions that cover unexpected regions are erroneously retained, leading to over-reconstruction (see the red box in the upper left corner of Figure 1). Conversely, valuable low-transparency Gaussian distributions that contribute to details are wrongly removed, resulting in under-reconstruction (see the red box in the lower left corner of Figure 1).

In addition, popular methods [4,12,13,15] use the mean absolute error to optimize the rendered images. The linear penalty for small outliers can lead to over-optimization, significant result fluctuations, and artifacts (appearing as black fog-like shadows in the red box on the left side of Figure 2).

This study aims to simultaneously resolve over-reconstruction, under-reconstruction, and artifacts in 3DGS while preserving rendering efficiency. The main contributions are as follows:

A dynamic Gaussian deletion mechanism is proposed to adaptively eliminate redundant primitives. For over-reconstruction, coverage of each Gaussian to its surroundings is calculated based on scale, determining customized scale thresholds to eliminate high-coverage Gaussians (Figure 1 right). For under-reconstruction, contribution to reconstruction is evaluated based on transparency, establishing unique transparency thresholds to preserve high-contribution Gaussians (Figure 1 right).
A novel loss integration strategy is introduced by incorporating the Huber loss during training. This applies quadratic constraints to small errors instead of linear penalties, avoiding overly harsh optimization of small outliers and alleviating artifacts (Figure 2 left, upper red boxes).

This paper is structured as follows. Section 2 analyzes artifacts and over-reconstruction in existing 3DGS methods. Section 3 introduces our Adaptive 3DGS framework. Section 4 details the dynamic parameter selection mechanism addressing under/over-reconstruction. Section 5 presents experimental validation and comparative analysis. Finally, Section 6 and Section 7 discuss the work and provide concluding remarks.

2. Related Work

2.1. Core Innovations of 3DGS

3DGS [7] introduced flexible geometric primitives [16,17,18,19,20,21,22] and anisotropic Gaussian ellipses to represent complex scene structures. Crucially, it replaced traditional ray tracing with rasterization for rendering. Compared with NeRF’s pointwise MLP evaluation [6,23,24,25,26,27,28], 3DGS projects Gaussians into screen space more efficiently, eliminating volume sampling and ray tracing to achieve real-time rendering [7]. Despite high reconstruction accuracy and efficiency, 3DGS inevitably suffers from over-reconstruction and under-reconstruction artifacts.

2.2. Mitigation of Over/Under-Reconstruction

Over-Reconstruction Solutions: MS-3DGS [12] addressed aliasing-induced over-reconstruction at low resolutions through multi-scale Gaussian representations, enabling adaptive selection during rendering. However, this approach proves ineffective at higher resolutions and fails to resolve under-reconstruction.

Under-Reconstruction Techniques: LM-Gaussian [29] alleviated detail loss via depth-guided initialization and iterative filtering. Geometric consistency checks first eliminate unreliable 3D points, followed by diffusion prior-based refinement to restore high-frequency details. Notably, this method incurs high training costs and requires large-scale datasets, limiting efficiency [29].

Critical Limitation: Existing methods (3DGS, MS-3DGS, LM-Gaussian) overlook the significance of the rendered Gaussian count in managing reconstruction quality. Over-reconstruction occurs when too many Gaussians cause overlap, while under-reconstruction stems from insufficient Gaussians losing detail. Current elimination mechanisms use fixed thresholds for scale (>10% scene range) and transparency (<0.005). However, the general threshold ignores the influence of different Gaussians: an excessively large Gaussian kernel covers unintended regions, resulting in over-reconstruction (as shown in the top left image of Figure 1), while valuable low-transparency Gaussian kernels that contribute to details are wrongly removed, leading to under-reconstruction (as shown in the bottom left image of Figure 1).

2.3. Artifact Reduction Approaches

Frequency-based Filtering: Mip-Splatting [4] limited Gaussian frequencies to below half the maximum sampling rate (the Nyquist limit) and introduced a 2D Mip filter approximating projections as box filters similar to EWA-Splatting [30], reducing high-frequency artifacts and over-expansion artifacts.

Spectral Analysis and Splitting: Spectral-GS [14] revealed through covariance spectral analysis that unsplit elongated Gaussians with low spectral entropy cause artifacts. This was mitigated via 3D shape-aware splitting and 2D view-consistent filtering.

Persistent Challenges: While filtering reduces artifacts, excessive smoothing risks detail loss [4,14]. Moreover, 3DGS, Mip-Splatting, and Spectral-GS optimize rendered images using mean absolute error. The linear penalty for small outliers induces significant fluctuations, making artifacts inevitable [4,12,13,15].

3. Method Overview

To alleviate over-reconstruction and under-reconstruction and to reduce artifacts while maintaining efficiency, this paper improves the 3DGS framework [7] in two respects: the number of Gaussian ellipses being rendered and the loss function used for training. First, a dynamic scale threshold elimination module and a dynamic transparency threshold elimination module are introduced on the basis of the original density control. This enables different elimination thresholds to be designed for different Gaussian ellipses, thereby more accurately eliminating overlapping Gaussian ellipses and alleviating over-reconstruction and under-reconstruction in 3DGS. Second, the training process incorporates the Huber loss function, which penalizes small errors quadratically to alleviate artifacts. The overall framework of the proposed method is shown in Figure 3. The parts improved relative to [7] are highlighted in red boxes.

In the existing 3DGS method [7], the adaptive density control module determines whether the current Gaussian ellipsoid contributes to the reconstruction based on a “one-size-fits-all” criterion. Specifically, Ref. [7] sets the scale removal threshold and transparency removal threshold in advance. For example, before removing the Gaussian ellipse, the scale removal threshold is set to 10% of the current scene range, while the transparency removal threshold is fixed at 0.005. This does not fully take into account the individualized influence of different Gaussian ellipses during the reconstruction process on the reconstruction result. Sometimes, it mistakenly retains the Gaussian ellipses with slightly larger scales, thereby covering over other Gaussian ellipses, ultimately leading to the phenomenon of over-reconstruction. Furthermore, this approach also leads to the incorrect exclusion of some Gaussian ellipses that would otherwise have contributed to the reconstruction, thereby resulting in insufficient reconstruction. Although the subsequent methods [12,29] introduced multi-scale Gaussian ellipses and diffusion priors to the model, which, to some extent, alleviated the aforementioned problems, the phenomenon of over- and under-reconstruction caused by the fixed threshold setting still exists. Therefore, this paper has developed a more flexible dynamic threshold calculation method to replace the original fixed threshold method. Figure 3 shows the dynamic scale threshold removal module and the dynamic transparency removal module of this paper.

During the density control process, we use the above two modules to separately calculate the scale removal threshold and transparency removal threshold for each Gaussian ellipse. This enables the model to adaptively consider the degree of influence of each Gaussian ellipse on the reconstruction result based on its parameter conditions and thereby decide whether to remove the ellipse. This mechanism can dynamically adjust the distribution density of 3D Gaussian ellipses during the density control process by synchronizing the complexity of the scene. This way, it can effectively alleviate the over-reconstruction and under-reconstruction problems caused by using a fixed threshold while ensuring efficiency.

Furthermore, in the rasterization rendering process, 3DGS optimizes the rendered images using the mean absolute error. However, the linear growth of this loss leads to artifacts in the optimized images. Although [4,14] reduced artifacts by introducing filters, excessive smoothing by these filters caused a loss of detail. Moreover, because these methods, like 3DGS, still use the mean absolute error during rasterization rendering, some artifacts remain. Therefore, this paper introduces the Huber loss function, which applies a quadratic penalty to small errors between the reconstructed image and the real image and a linear penalty to larger ones. This avoids the detail loss caused by filtering and at the same time alleviates the image artifacts caused by the strict linear penalty on small errors.

4. Method Detailed Description

Previous studies [7,22] proposed representing 3D scenes as a set of scaled 3D Gaussian ellipses $\{G_{k} \mid k = 1, \ldots, K\}$ and using volume splatting to render the images onto a 2D display screen. The geometric shape of each scaled 3D Gaussian ellipsoid $G_{k}$ is parameterized by transparency (scale) $\alpha_{k} \in [0, 1]$, center $p_{k} \in \mathbb{R}^{3 \times 1}$, and the covariance matrix $\Sigma_{k} \in \mathbb{R}^{3 \times 3}$ defined in world space, as shown in Equation (1).

$$G_{k}(x) = \alpha_{k} \, e^{-\frac{1}{2} (x - p_{k})^{T} \Sigma_{k}^{-1} (x - p_{k})}$$

(1)

To constrain $\Sigma_{k}$ within the valid covariance matrix space, 3DGS [7] employed a semi-definite parameterization $\Sigma_{k} = O_{k} s_{k} s_{k}^{T} O_{k}^{T}$. Here, $s_{k} \in \mathbb{R}^{3}$ is a 3-row-by-1-column scaling vector, and $O_{k} \in \mathbb{R}^{3 \times 3}$ is a 3-row-by-3-column rotation matrix parameterized using quaternions [31].

Then, to render the image for a specific viewpoint defined by the rotation matrix $R \in \mathbb{R}^{3 \times 3}$ and the translation vector $t \in \mathbb{R}^{3}$, we first convert the 3D Gaussians $\{G_{k}\}$ into the camera coordinates, as shown in Equation (2).

$$p_{k}' = R p_{k} + t, \quad \Sigma_{k}' = R \Sigma_{k} R^{T}$$

(2)
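As a rough illustration of Equation (2), the sketch below applies a camera rotation and translation to one Gaussian's center and covariance in plain Python. The helper names (`matmul`, `transpose`, `to_camera`) and the example values are assumptions made for illustration; they are not taken from the authors' implementation.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a nested-list matrix."""
    return [list(row) for row in zip(*a)]


def to_camera(p, cov, R, t):
    """Equation (2): p' = R p + t and Sigma' = R Sigma R^T."""
    p_cam = [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]
    cov_cam = matmul(matmul(R, cov), transpose(R))
    return p_cam, cov_cam


# Hypothetical viewpoint: rotate 90 degrees about the z-axis, then shift along z.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]

# Hypothetical Gaussian: centered on the x-axis and elongated along x.
p = [1.0, 0.0, 0.0]
cov = [[4.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]

p_cam, cov_cam = to_camera(p, cov, R, t)
print(p_cam)    # center in camera coordinates
print(cov_cam)  # covariance in camera coordinates
```

In this example the elongation that lay along the world x-axis ends up along the camera y-axis, which is the expected effect of rotating the covariance together with the center.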

Subsequently, by projecting onto the ray space through a local affine transformation, Equation (3) can be obtained.

$$\Sigma_{k}'' = J_{k} \Sigma_{k}' J_{k}^{T}$$

(3)

where the Jacobian matrix $J_{k}$ is an affine approximation of the projective transformation defined by the center $p_{k}'$ of the 3D Gaussian.

Finally, 3DGS [7] models the view-dependent color $c_{k}$ using spherical harmonics and blends the rendered images based on the depth order of the primitives $1, \ldots, K$ through $\alpha$-blending, as shown in Equation (4).

$$c(x) = \sum_{k = 1}^{K} c_{k} \alpha_{k} G_{k}^{2D}(x) \prod_{j = 1}^{k - 1} \left(1 - \alpha_{j} G_{j}^{2D}(x)\right)$$

(4)
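The blending in Equation (4) can be sketched for a single pixel as follows. This is a minimal plain-Python illustration, assuming the primitives are already sorted front to back and that the projected Gaussian values $G_{k}^{2D}(x)$ at the pixel are given; the function name `composite` and the sample values are hypothetical.

```python
def composite(colors, alphas, footprints):
    """Front-to-back blending at one pixel, following Equation (4).

    colors:     RGB tuples c_k, sorted by depth (nearest first)
    alphas:     transparency values alpha_k in [0, 1]
    footprints: projected Gaussian values G_k^{2D}(x) at this pixel
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # running product of (1 - alpha_j * G_j) for j < k
    for color, alpha, g in zip(colors, alphas, footprints):
        weight = alpha * g * transmittance
        pixel = [acc + weight * channel for acc, channel in zip(pixel, color)]
        transmittance *= 1.0 - alpha * g
    return pixel


# Hypothetical pixel covered by a half-transparent red primitive in front of
# a fully opaque blue one.
colors = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
alphas = [0.5, 1.0]
footprints = [1.0, 1.0]

blended = composite(colors, alphas, footprints)
print(blended)
```

Because each primitive is attenuated by everything in front of it, a large Gaussian near the camera can hide many smaller ones behind it, which is the overlap effect the paper associates with over-reconstruction.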

In addition, 3DGS and subsequent methods [4,12,14] also perform density control during the projection process, taking into account the influence of the number of Gaussian ellipses on the reconstruction results. However, these methods use the same elimination conditions as 3DGS, that is, two fixed values that separately filter the scale and transparency of the Gaussian ellipses. This one-size-fits-all approach is not precise enough for controlling the density of Gaussian ellipses, which can lead to over-reconstruction and under-reconstruction.

4.1. Dynamic Scale Control

During the density control process, 3DGS eliminates Gaussian ellipses whose scale exceeds a certain threshold. The specific process is as shown in Equation (5).

$$\mathit{sv} > 0.1 \times \mathit{ex}$$

(5)

where $\mathit{sv}$ represents the scale of the current Gaussian ellipse, and $\mathit{ex}$ represents the size of the current scene range.

Since the size of each scene is fixed, this results in a fixed-scale removal threshold for that scene as well. This fixed-scale thresholding method can be applied to most situations and can eliminate the Gaussian ellipses with excessive scales. However, this method inevitably retains some ellipses that are slightly larger in size but still cover the surrounding Gaussian ellipses, ultimately leading to the occurrence of the over-reconstruction phenomenon.

Therefore, this paper proposes a method that dynamically calculates the elimination threshold for each Gaussian ellipse from its current scale parameters, as shown in Equation (6). Threshold values align with Reference [7].

$$T_{S} = (\mathit{max}_{s} - \mathit{min}_{s}) \times s \times \beta + \beta \times \mathit{min}_{s}$$

(6)

where $T_{S}$ represents the dynamic scale threshold; $\mathit{min}_{s}$ and $\mathit{max}_{s}$ are the preset minimum and maximum thresholds for the scale; $s$ represents the current scale of the Gaussian ellipse; and $\beta$ is a control parameter set based on experience.

According to Equation (6), $T_{S}$ first calculates the coverage degree $(\mathit{max}_{s} - \mathit{min}_{s}) \times s \times \beta$ of the current Gaussian ellipse with respect to the surrounding Gaussian ellipses and then makes a minor adjustment ($\beta \times \mathit{min}_{s}$) based on this. This operation yields a different scale removal threshold for each Gaussian ellipse, making the entire reconstruction more flexible. Smaller-scale Gaussian ellipses are retained more leniently, while larger-scale Gaussian ellipses are eliminated more strictly. This dynamic scale control alleviates, to a certain extent, the over-reconstruction caused by the original fixed threshold.
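The difference between the fixed threshold of Equation (5) and the per-Gaussian threshold of Equation (6) can be sketched as below. The function names and every numeric value (scene extent, scale bounds, and the control parameter) are hypothetical and chosen only for illustration. The section does not spell out how the threshold from Equation (6) is compared against a Gaussian's scale, so the sketch stops at computing the thresholds.

```python
def fixed_scale_threshold(scene_extent):
    """Equation (5): one threshold shared by every Gaussian in the scene."""
    return 0.1 * scene_extent


def dynamic_scale_threshold(scale, min_s, max_s, beta):
    """Equation (6): T_S = (max_s - min_s) * s * beta + beta * min_s."""
    coverage = (max_s - min_s) * scale * beta  # coverage term
    adjustment = beta * min_s                  # minor adjustment term
    return coverage + adjustment


# Hypothetical settings, not values reported by the paper.
scene_extent = 10.0
min_s, max_s, beta = 1.0, 3.0, 0.5

print("fixed:", fixed_scale_threshold(scene_extent))
for scale in (0.5, 1.0, 2.0):
    t_s = dynamic_scale_threshold(scale, min_s, max_s, beta)
    print("scale", scale, "-> dynamic threshold", t_s)
```

The fixed rule gives the same number for every Gaussian in a scene, whereas Equation (6) gives a value that varies with the scale of each Gaussian.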

4.2. Dynamic Transparency Control

3DGS not only controls the density of the Gaussian ellipses by adjusting the scale, but also prunes the Gaussian ellipses according to their transparency. If the current transparency of a Gaussian ellipse is less than the transparency removal threshold, 3DGS considers that this Gaussian ellipse has no effect on the reconstruction and removes it, as shown in Equation (7).

$$\mathit{op} < \mathit{min}_{op}$$

(7)

where $\mathit{op}$ represents the transparency of the current Gaussian ellipse and $\mathit{min}_{op}$ is the transparency removal threshold.

In the current 3DGS method, this threshold is set to 0.005 for all scenes. It is worth noting that in some scenes, Gaussian ellipses with a transparency below 0.005 still contribute to reconstructing scene details. Therefore, the original fixed threshold inevitably excludes these Gaussian ellipses, resulting in under-reconstruction.

Therefore, this paper designs a method that dynamically calculates the transparency threshold by combining the current Gaussian ellipse transparency and the scene transparency and then eliminates Gaussian ellipses accordingly, as shown in Equation (8). Threshold values align with Reference [7].

$$T_{O} = (\mathit{max}_{O} - \mathit{min}_{O}) \times \mathit{op} \times \alpha + \alpha \times \mathit{max}_{O} \times \mathit{min}_{O}$$

(8)

where $T_{O}$ represents the transparency culling threshold, $\mathit{min}_{O}$ and $\mathit{max}_{O}$ denote the predetermined minimum and maximum transparency thresholds, $\mathit{op}$ indicates the current transparency of the Gaussian ellipse, and $\alpha$ is a control parameter set empirically.

The transparency culling threshold of a Gaussian ellipse consists of two parts. The first part, $(\mathit{max}_{O} - \mathit{min}_{O}) \times \mathit{op} \times \alpha$, represents the contribution of the current transparency to the reconstruction, and the second part, $\alpha \times \mathit{max}_{O} \times \mathit{min}_{O}$, is an empirical fine-tuning of the threshold. According to Equation (8), by calculating different transparency culling thresholds for different Gaussian ellipses, the model can completely cull Gaussian ellipses with extremely low transparency, partially cull those with moderate transparency, and retain those with high transparency. This dynamic elimination mechanism more accurately retains the Gaussian ellipses that help reconstruct details, thereby alleviating the under-reconstruction caused by the previous fixed-threshold elimination.
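For comparison, the fixed rule of Equation (7) and the two-part threshold of Equation (8) can be written out as below. This is only a sketch under assumed settings: the function names, the transparency bounds, and the control parameter are hypothetical, and only the 0.005 default comes from the text. As with the scale threshold, the section does not state exactly how the result of Equation (8) is applied to decide removal, so only the threshold itself is computed.

```python
def fixed_prune(op, min_op=0.005):
    """Equation (7): prune when the transparency is below a fixed threshold."""
    return op < min_op


def dynamic_transparency_threshold(op, min_o, max_o, alpha):
    """Equation (8): T_O = (max_O - min_O) * op * alpha + alpha * max_O * min_O."""
    contribution = (max_o - min_o) * op * alpha  # contribution term
    fine_tuning = alpha * max_o * min_o          # empirical fine-tuning term
    return contribution + fine_tuning


# Hypothetical settings, not values reported by the paper.
min_o, max_o, alpha = 0.005, 1.0, 0.1

for op in (0.002, 0.01, 0.5):
    t_o = dynamic_transparency_threshold(op, min_o, max_o, alpha)
    print("op", op, "fixed rule prunes:", fixed_prune(op), "dynamic threshold:", t_o)
```

Under the fixed rule every Gaussian faces the same cutoff, while Equation (8) produces a cutoff that changes with each Gaussian's own transparency.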

4.3. Loss Function

During the rendering optimization process, 3DGS and subsequent approaches [4,12,14] primarily employ the mean absolute error (MAE, L₁ loss) to quantify the discrepancy between rendered and ground truth images, as formulated in Equation (9).

$$\mathit{MAE} = \frac{1}{N} \sum_{i = 1}^{N} \left| y_{i} - y_{i}' \right|$$

(9)

where $N$ is the total number of predicted pixels, $y_{i}$ is the true value of the i-th pixel, and $y_{i}'$ is the predicted value of the i-th pixel. As can be inferred from Equation (9), the L1 loss grows linearly with the error magnitude. The accumulation of these linearly penalized minor errors across an image may destabilize the optimization, potentially resulting in imaging artifacts.

The total loss in this paper adds an additional loss function to the loss used in [12], as shown in Equation (10):

$$L_{total} = (1 - \lambda) L_{\delta} + \lambda L_{D\text{-}SSIM}$$

(10)

where $L_{\delta}$ addresses outlier-induced artifacts through adaptive error scaling, $L_{D\text{-}SSIM}$ enhances structural consistency, and $\lambda$ weights the two terms. The current combination was selected based on its proven efficacy in balancing artifact reduction and detail preservation for 3DGS optimization.

To alleviate the image artifacts caused by the above operations, this paper introduces the Huber loss function on the basis of the original loss function, as shown in Equation (10). The main function of the Huber loss is to handle outliers robustly. Compared with the squared error (L2) loss, the Huber loss is less sensitive to large outliers, and compared with the absolute error (L1) loss [4], it penalizes small errors quadratically, so small deviations are treated more gently. Hence, this function possesses unique advantages in balancing the model’s capacity to process both normal and anomalous data. The Huber loss function is shown in Equation (11).

$$L_{\delta}(y, f(x)) = \begin{cases} \frac{1}{2} (y - f(x))^{2} & \text{if } |y - f(x)| \leq \delta \\ \delta \left( |y - f(x)| - \frac{1}{2} \delta \right) & \text{otherwise} \end{cases}$$

(11)

where $y$ represents the actual value, $f(x)$ is the value from the model, and $\delta$ is usually referred to as the truncation threshold of the Huber loss function. In practical applications, choosing an appropriate $\delta$ value is crucial. The optimal $\delta$ is often determined through cross-validation or experimental optimization, so that the Huber loss function achieves the best effect when dealing with different types of artifacts.
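To make Equations (9) to (11) concrete, the sketch below implements the mean absolute error, the Huber loss, and the weighted total loss in plain Python. Several points are assumptions rather than details from the paper: the Huber term is averaged over pixels in the same way as Equation (9), the D-SSIM term is passed in as an already computed number, and the sample pixel values, $\delta$, and $\lambda$ are purely illustrative.

```python
def mae(y_true, y_pred):
    """Equation (9): mean absolute error over N pixels."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def huber(y_true, y_pred, delta):
    """Equation (11), averaged over pixels (the averaging is an assumption)."""
    total = 0.0
    for a, b in zip(y_true, y_pred):
        residual = abs(a - b)
        if residual <= delta:
            total += 0.5 * residual * residual          # quadratic branch
        else:
            total += delta * (residual - 0.5 * delta)   # linear branch
    return total / len(y_true)


def total_loss(huber_value, d_ssim_value, lam):
    """Equation (10): (1 - lambda) * L_delta + lambda * L_D-SSIM."""
    return (1.0 - lam) * huber_value + lam * d_ssim_value


# Hypothetical pixel values: one small error and one larger error.
y_true = [0.0, 0.0]
y_pred = [1.0, 3.0]

l1_value = mae(y_true, y_pred)
huber_value = huber(y_true, y_pred, delta=2.0)
loss = total_loss(huber_value, d_ssim_value=0.5, lam=0.2)
print(l1_value, huber_value, loss)
```

With $\delta = 2$, the residual of 1 falls on the quadratic branch and the residual of 3 on the linear branch, so a single call exercises both cases of Equation (11).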

Another major advantage of introducing the Huber loss function lies in its high computational efficiency. Compared with directly using the L1 or L2 loss function, the Huber loss function can provide a more stable and faster convergence process in most cases, thereby accelerating the training speed of the model. This paper effectively reduces the influence of outliers on the model by adding the Huber loss function to the loss function, thereby providing better detail retention and structural fidelity and improving the overall reconstruction effect to a certain extent.

5. Experimental Analysis

We implemented the experiments in Python 3.7.13 with PyTorch 1.12.1, using custom CUDA kernels for rasterization. These kernels are extended versions of the previous method [32]. Additionally, the NVIDIA CUB sorting routine was used for the fast radix sorting [23]. Moreover, the experiments in this paper were conducted on an NVIDIA GeForce RTX3090 server with 24 GB of GPU memory.

5.1. Selection of the Dataset

The dataset selected in this paper is largely consistent with those commonly evaluated by 3DGS methods, specifically including the Synthetic Blender dataset [6], the Mip-NeRF360 dataset [33], and the Tanks&Temples dataset [34]. These datasets feature very diverse shooting styles, encompassing both confined indoor scenes and large, unbounded outdoor scenes, as well as both synthetic and real-world settings. The diversity of the datasets makes the experimental results of this paper more generalizable and convincing. The characteristics of each dataset and their roles in the experimental evaluation are described specifically as follows:

The Synthetic Blender dataset [6] contains 8 synthetic bounded scenes. These scenes provide a detailed set of views and precise camera parameters, so methods evaluated on this dataset can be tested with random initialization. Moreover, because the synthetic dataset has precise labels and realistic quality, researchers can conduct extensive experiments without the cost and restrictions of real data collection, thereby verifying the generalization ability of the method presented in this paper.

The Mip-NeRF360 dataset [33] is an advanced benchmark for NeRF rendering quality. It offers challenging real-world indoor/outdoor scenes, high-quality rendered images, and accurate depth maps. This dataset includes 7 complex real-world scenarios, each featuring detailed central objects with rich backgrounds. These characteristics make it ideal for evaluating high-fidelity rendering performance.

The Tanks&Temples dataset [34] provides real-world outdoor scenes and is challenging and diverse. The scenarios of this dataset include various geometric shapes, different lighting conditions, and various materials. Therefore, it can be used to evaluate the performance and generalization ability of the method in complex environments. Furthermore, this dataset provides accurate camera orientation and depth information, which is even more conducive to accurately evaluating the rendering quality of the method.

5.2. Selection of Evaluation Criteria

To facilitate comparison with 3DGS, we selected the same three evaluation metrics: PSNR, structural similarity index measure (SSIM), and learned perceptual image patch similarity (LPIPS). These three evaluation criteria support the quantitative measurement of different aspects of the image and are widely used in image quality assessment.

PSNR: It is a commonly used metric for evaluating image quality. It calculates the peak signal-to-noise ratio between the original image and the image that has been compressed or processed, which is the ratio of the maximum possible power of the image to the average power of the distortion introduced. This indicator provides a quantitative measurement of the difference between the generated image and the real image. The higher the PSNR value, the better the image quality. The calculation of PSNR is as shown in Equation (12).

$$\mathit{PSNR} = 10 \cdot \log_{10} \left( \frac{\mathit{MAX}^{2}}{\mathit{MSE}} \right)$$

(12)
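Equation (12) can be evaluated directly once the mean squared error is known. The sketch below is a minimal plain-Python version, assuming images are given as flat lists of pixel values in [0, 1], so that the maximum value is 1.0; the function name and the sample data are illustrative only.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Equation (12): 10 * log10(MAX^2 / MSE) for two flat pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no distortion to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel images that differ by 0.1 everywhere.
reference = [0.0, 0.0, 0.0, 0.0]
rendered = [0.1, 0.1, 0.1, 0.1]

print(psnr(reference, rendered))
```

Because the scale is logarithmic, the gains of a few tenths of a decibel reported in the abstract correspond to a reduction of several percent in mean squared error.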

SSIM: It is an indicator used to compare the similarity between two images in terms of brightness, contrast, and structure. This index provides a more detailed measurement of image similarity, capturing the degree of structural and content similarity between rendered images and real images. The SSIM value ranges from −1 to 1. The closer it is to 1, the more the rendered result approximates the real image. Its calculation is shown in Equation (13).

$$\mathit{SSIM}(x, y) = \frac{(2 \mu_{x} \mu_{y} + c_{1}) (2 \sigma_{xy} + c_{2})}{(\mu_{x}^{2} + \mu_{y}^{2} + c_{1}) (\sigma_{x}^{2} + \sigma_{y}^{2} + c_{2})}$$

(13)

where $\mu_{x}$ and $\mu_{y}$ represent the mean values of images $x$ and $y$, $\sigma_{x}^{2}$ and $\sigma_{y}^{2}$ represent the variances of images $x$ and $y$, $\sigma_{xy}$ represents the covariance of images $x$ and $y$, and $c_{1}$ and $c_{2}$ are constants used to stabilize the calculation of the denominator.
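The formula in Equation (13) can be sketched with global image statistics as below. This is a simplification: practical SSIM implementations usually compute the statistics in local windows and average the results, whereas this version uses a single mean, variance, and covariance per image. The constants follow the commonly used defaults $c_{1} = (0.01 \cdot \mathit{MAX})^{2}$ and $c_{2} = (0.03 \cdot \mathit{MAX})^{2}$, which is an assumption because the section does not state them.

```python
def ssim_global(x, y, max_value=1.0, k1=0.01, k2=0.03):
    """Equation (13) evaluated once over whole images given as flat lists."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * max_value) ** 2  # stabilizes the luminance term
    c2 = (k2 * max_value) ** 2  # stabilizes the contrast and structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 3-pixel images.
image = [0.1, 0.5, 0.9]
flipped = [0.9, 0.5, 0.1]

print(ssim_global(image, image))    # identical images
print(ssim_global(image, flipped))  # same statistics, reversed structure
```

The second call shows why SSIM complements PSNR: the two images share the same mean and variance, but their negative covariance pulls the score well below 1.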

LPIPS: It is a deep learning-based method for measuring image similarity. It employs pre-trained deep learning models to extract perceptual features of images and measures the similarity between images by calculating their distance in the feature space. LPIPS better simulates the human visual system’s perception of images. Therefore, when it is used to evaluate rendering models, it provides image similarity measurements that are more consistent with human subjective perception. The smaller the LPIPS value is, the closer the rendered image is to the real image as perceived by humans, meaning the image quality is better. The calculation of LPIPS is shown in Equation (14).

$$\mathit{LPIPS}(x, y) = \sum_{i} w_{i} \left\| \phi_{i}(x) - \phi_{i}(y) \right\|$$

(14)

where $\phi_{i}(x)$ and $\phi_{i}(y)$ represent the feature values of images $x$ and $y$ in the i-th feature channel, and $w_{i}$ is the weight for each feature channel.
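The weighted sum in Equation (14) has a simple structure that can be shown with toy data. The sketch below is not LPIPS itself: the real metric takes its features from a pre-trained network and learned channel weights, whereas here the feature lists and weights are hand-made, hypothetical values used only to show how the per-channel distances are combined. The Euclidean norm is assumed for each channel.

```python
import math


def weighted_feature_distance(feats_x, feats_y, weights):
    """Equation (14): sum over channels of w_i * ||phi_i(x) - phi_i(y)||."""
    total = 0.0
    for fx, fy, w in zip(feats_x, feats_y, weights):
        # Euclidean distance between the two images' features in this channel.
        channel_distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(fx, fy)))
        total += w * channel_distance
    return total


# Hypothetical features: two channels with two values each.
feats_x = [[0.0, 0.0], [1.0, 1.0]]
feats_y = [[3.0, 4.0], [1.0, 1.0]]
weights = [0.5, 2.0]

print(weighted_feature_distance(feats_x, feats_y, weights))
```

Only the first channel differs between the two toy images, so the second channel contributes nothing regardless of its larger weight.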

5.3. Ablation Experiment Verification

This paper validates the method through ablation experiments on the following aspects: the addition of the dynamic Gaussian elimination mechanism, the introduction of the loss function, and the adjustment of relevant parameters. The experimental results are shown in Table 1. 7k and 30k represent the number of iterations.

This section first tests the reconstruction with and without the dynamic Gaussian elimination mechanism. The ablation studies validate three components: the dynamic Gaussian elimination mechanism (D), the hybrid loss (H), and relevant parameter adjustments. As shown in Table 1 (row D), the D variant consistently improves PSNR over vanilla 3DGS across all scenes (Train, Garden, Kitchen) at both 7k and 30k iterations. This shows that the module improves on the original 3DGS method. The visualization is shown in Figure 4. As can be seen from Figure 4, the original elimination mechanism of 3DGS produces cases of over-reconstruction or under-reconstruction.

Loss Function. This part compares the results before and after introducing the Huber loss function. The “H, δ = 5” row in Table 1 shows that the new loss function alone raises the average PSNR over 3DGS at both 7k and 30k iterations (25.19 vs. 24.89 and 27.32 vs. 27.17), although Train at 7k is slightly lower (19.62 vs. 19.89). The visual results are shown in Figure 5. Before the Huber loss function is introduced, the red-box region in the upper-left image exhibits ghosting; after it is introduced, the whitish ghosting is significantly alleviated in the image on the right. Comparing the bottom-left and bottom-right images in Figure 5 further shows that the Huber loss function also helps to remove artifacts and blurring.

Parameter Influence. Because the loss function and the dynamic Gaussian removal rely on empirically chosen parameters, this ablation also evaluates how different parameter values affect the overall optimization. The last three rows of Table 1, together with Figure 6, show that the average PSNR peaks at δ = 5 (25.61 at 7k and 27.73 at 30k). Figure 7 visualizes part of the results obtained with different parameter values: when δ is too large or too small, obvious artifacts appear, whereas they do not appear at the suitable value δ = 5.
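The role of δ can be illustrated with the textbook Huber loss, which is quadratic for residuals of magnitude at most δ and linear beyond that. The sketch below is a minimal NumPy version under that standard definition; the exact formulation, normalization and pixel scale used in this paper are given in Section 4.3 and may differ, and the function name `huber_loss` is only illustrative.

```python
import numpy as np


def huber_loss(rendered, target, delta: float) -> float:
    """Mean Huber loss: 0.5*r^2 if |r| <= delta, else delta*(|r| - 0.5*delta)."""
    residual = np.asarray(rendered, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    abs_r = np.abs(residual)
    quadratic = 0.5 * residual ** 2            # small-error branch
    linear = delta * (abs_r - 0.5 * delta)     # large-error (outlier) branch
    return float(np.mean(np.where(abs_r <= delta, quadratic, linear)))
```

The two branches meet at |r| = δ, so the loss stays continuous there. A smaller δ treats more residuals as outliers and a larger δ makes the loss behave like a squared error over a wider range, which is one way to read the sensitivity to δ reported above.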

Furthermore, the parameters α and β in Equations (7) and (8) directly determine the scale removal threshold and the transparency removal threshold of each Gaussian ellipsoid, and therefore the overall reconstruction quality. A series of ablation experiments on these two parameters was therefore conducted, as shown in Figure 8.

5.4. Experimental Comparison

The proposed method is built on the open-source 3DGS code repository [7]. Apart from the loss function and the Gaussian density control strategy, all settings follow [7]: every scene is trained for 30k iterations with the same schedule and hyperparameters. Algorithm 1 was tested on 9 real-world scenes and 8 synthetic scenes, whose complexity and diversity support the claim that the improvements are adaptive and robust. 3DGS [7] serves as the quality baseline. Comparisons are also made with a recent fast NeRF method, InstantNGP [11]; a high-quality method, Mip-NeRF360 [35]; and three recent improvements of 3DGS, namely Mip-Splatting [4], MS-3DGS [12], and LM-Gaussian [29].

Algorithm 1: Adaptive 3DGS

Input: Initial Gaussians G, Images {I_{i}}, Huber δ
1: for each iteration do
2:     Render image \hat{I} from viewpoint i
3:     Compute loss L_{δ} = Huber(\hat{I}, I_{i}, δ)
4:     for each Gaussian g in G do
5:         Calculate coverage C_{g} = f(scale(g), neighbors(g))
6:         Calculate contribution R_{g} = f(opacity(g), \hat{I})
7:         Update thresholds:
8:             τ_{scale} = α * C_{g} + β * R_{g}
9:             τ_{opacity} = γ * R_{g}    # Eq(4b)
10:        if g.scale > τ_{scale} or g.opacity < τ_{opacity} then
11:            Mark g for culling
12:        end if
13:    end for
14:    Backpropagate \nabla L to update G
15:    Remove marked Gaussians
16: end for
17: return Optimized G
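Steps 8 to 11 of Algorithm 1 can be sketched in vectorized form as follows. This is only an illustration under stated assumptions: the coverage C_{g} and contribution R_{g} are passed in as precomputed arrays because the function f is not specified in the listing, each Gaussian's scale is treated as a single scalar, and the function name `culling_mask` is hypothetical.

```python
import numpy as np


def culling_mask(scale, opacity, coverage, contribution,
                 alpha: float, beta: float, gamma: float) -> np.ndarray:
    """Boolean mask of Gaussians marked for culling (True = remove)."""
    scale = np.asarray(scale, dtype=np.float64)
    opacity = np.asarray(opacity, dtype=np.float64)
    coverage = np.asarray(coverage, dtype=np.float64)          # C_g, assumed precomputed
    contribution = np.asarray(contribution, dtype=np.float64)  # R_g, assumed precomputed

    tau_scale = alpha * coverage + beta * contribution  # step 8
    tau_opacity = gamma * contribution                  # step 9
    # Step 10: cull when the scale is too large or the opacity is too low.
    return (scale > tau_scale) | (opacity < tau_opacity)
```

For example, with unit coverage and contribution, α = β = 1 and γ = 0.05 (arbitrary values chosen for illustration), a Gaussian with scale 3.0 exceeds the scale threshold of 2.0 and one with opacity 0.01 falls below the opacity threshold of 0.05, so both are marked.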

The datasets are processed as in 3DGS [7]: each is split into training and test sets, with one out of every eight images held out for testing, so that the error metrics are consistent and comparable. The results are presented in Table 2. All values in the table were obtained by running the authors' released code of the previous methods, except for the Mip-NeRF360 results on its own dataset, which are quoted from the original publication. As Table 2 shows, the proposed method outperforms MS-3DGS and Mip-Splatting on both the Mip-NeRF360 [35] and Tanks&Temples [34] datasets. LM-Gaussian achieves higher quality than our method, but its training time is roughly twice as long because it introduces a model based on a diffusion prior.
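The one-in-eight hold-out described above can be sketched as follows. This is a minimal illustration: sorting the images by name and holding out the indices divisible by eight are assumptions, and the function name `split_train_test` is hypothetical.

```python
def split_train_test(image_names, hold: int = 8):
    """Hold out every `hold`-th image (by sorted order) as the test set."""
    ordered = sorted(image_names)
    test = [name for i, name in enumerate(ordered) if i % hold == 0]
    train = [name for i, name in enumerate(ordered) if i % hold != 0]
    return train, test
```

With 20 images, this split yields 3 test images and 17 training images.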

Table 3 gives a detailed quantitative comparison of the proposed algorithm with the recent Mip-Splatting [4] and MS-3DGS [12] on seven scenes of the Mip-NeRF360 dataset. All values were obtained on the same machine and in the same environment, using the 4× resolution version of the Mip-NeRF360 dataset.

Table 3 shows that the proposed method leads in the four indoor scenes (counter, room, bonsai, and kitchen). In the outdoor scenes (bicycle, garden, and stump), however, it is slightly inferior to Mip-Splatting. A possible reason is that, in unbounded outdoor scenes, the proposed method still misses some detail when reconstructing corners.
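As a consistency check, the per-scene values of Table 3 can be averaged and compared with the Mip-NeRF360 PSNR column of Table 2. The sketch below uses only numbers copied from those two tables; the variable names are illustrative.

```python
scenes = ["Bicycle", "Garden", "Counter", "Room", "Stump", "Bonsai", "Kitchen"]

# Per-scene PSNR (dB) copied from Table 3.
table3 = {
    "3DGS": [25.18, 27.22, 29.24, 31.94, 26.56, 32.56, 31.74],
    "MS-3DGS": [25.00, 27.04, 26.39, 28.98, 26.49, 32.22, 31.91],
    "Mip-Splatting": [25.73, 27.88, 29.34, 31.83, 27.16, 32.41, 31.80],
    "Ours": [25.29, 27.52, 29.45, 32.21, 26.75, 32.83, 32.38],
}

# Mean PSNR (dB) on Mip-NeRF360 copied from Table 2.
table2_mean = {"3DGS": 29.20, "MS-3DGS": 28.29, "Mip-Splatting": 29.45, "Ours": 29.49}

# Average of the seven scenes for each method.
means = {method: sum(values) / len(values) for method, values in table3.items()}

# Per-scene gap between the proposed method and Mip-Splatting.
gap_vs_mip = {
    scene: ours - mip
    for scene, ours, mip in zip(scenes, table3["Ours"], table3["Mip-Splatting"])
}

for method, mean in means.items():
    print(f"{method}: mean {mean:.2f} dB (Table 2 reports {table2_mean[method]:.2f})")
```

The seven-scene means agree with Table 2 to within 0.01 dB. The per-scene gaps are positive for counter, room, bonsai, and kitchen and negative for bicycle, garden, and stump, matching the indoor and outdoor pattern described above.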

Figure 9 presents a visual comparison between the proposed method and the original 3DGS [7], both trained for 30k iterations. The over-reconstruction and under-reconstruction caused by the simple, fixed Gaussian elimination mechanism of 3DGS are visibly reduced by the proposed method, for example in the grass background of “bicycle”, the distant hills of “train”, the background buildings of “truck”, and the window details in the background of “garden”. Owing to the Huber loss function, the proposed algorithm also achieves some deblurring and artifact removal in certain cases, for example the sofa in “room”, the vegetation on “stump”, and the artifacts in “bonsai”. The figure further shows that the proposed method maintains good coverage even for distant backgrounds and recovers additional detail in regions with monotonous colors and materials.

As shown in Table 4 and Table 5, the proposed method achieves the highest average PSNR and SSIM on the Synthetic Blender dataset, exceeding both 3DGS and Mip-Splatting on average, although Mip-Splatting remains ahead on individual scenes such as Ficus and Mic. This indicates that the improvements are stable and robust. A qualitative comparison with several previous methods on this dataset is shown in Figure 10, which presents, from left to right, the results of 3DGS [7], Mip-NeRF360 [35], our method, and the ground truth (GT) reference. Figure 10 shows that the proposed method not only alleviates the artifact problem but also avoids the detail loss caused by the filter introduced in Mip-Splatting [4].

6. Discussion

This paper proposes an improved 3DGS technique that augments the original 3DGS [7] with a dynamic Gaussian removal mechanism and a coordinated hybrid loss, achieving higher reconstruction accuracy. The dynamic Gaussian elimination mechanism sets the scale and transparency thresholds of each Gaussian ellipsoid more flexibly: it eliminates overly large ellipsoids to reduce over-reconstruction and retains ellipsoids that should contribute to the reconstruction, thereby alleviating under-reconstruction. The Huber loss function reduces the influence of outliers during training, thereby alleviating blurring artifacts in the reconstructed images. When trained and tested at the same scale and sampling rate, the proposed technique is competitive with state-of-the-art methods. It also handles the reconstruction of different scene environments more flexibly, demonstrating stronger robustness and adaptability.

7. Conclusions

3DGS is a leading method for Novel View Synthesis. However, its dependence on fixed scale and opacity thresholds during Gaussian ellipsoid culling causes over-reconstruction or under-reconstruction, and the linear penalty applied to outliers during optimization introduces artifacts into the results. To address these limitations, this paper proposes Adaptive 3DGS, which features a dynamic culling mechanism and a Huber loss function. The dynamic culling mechanism adaptively adjusts thresholds based on two factors derived from each ellipsoid's scale and opacity: its coverage relative to neighbors and its reconstruction contribution. The method imposes stricter scale thresholds on ellipsoids with excessive coverage to reduce over-reconstruction and applies looser thresholds to ellipsoids with insufficient coverage to preserve detail. Similarly, stricter opacity thresholds remove low-contribution ellipsoids to prevent over-reconstruction, while looser thresholds retain high-contribution ellipsoids to alleviate under-reconstruction. Furthermore, the Huber loss function replaces the linear penalty: it imposes a quadratic penalty on small errors and a linear penalty on larger ones, which reduces artifacts such as blurring and better preserves fine details. Evaluations on standard 3DGS datasets confirm the efficacy of Adaptive 3DGS. The method achieves an average PSNR improvement of 0.3 dB over the original 3DGS. At 4× resolution, it outperforms MS-3DGS by 0.5 dB and surpasses Mip-Splatting by 0.1 dB. These results show that Adaptive 3DGS enables more precise, higher-quality reconstruction and exhibits improved robustness and adaptability across diverse scenes under equivalent training and testing conditions.

Despite these advantages, two limitations should be acknowledged. First, the dynamic threshold calculations may increase computational overhead during training. Second, the method’s effectiveness in extremely high-density scenes requires additional validation. Future research will focus on optimizing the threshold computation pipeline to reduce training costs, extending the framework to handle ultra-dense scene reconstructions, and exploring adaptive loss functions for broader artifact types.

Author Contributions

Validation, B.Y. and J.M.; writing—original draft, F.Z.; writing—review and editing, F.Z., Y.W., and B.Y.; supervision, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Key Research and Development Program under grant No. (2023YFC3805901) and by the “Taihu Talent-Innovative Leading Talent Team” Plan of Wuxi City (Certificate Date: 20 December 2024 (8)).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

PSNR	Peak signal-to-noise ratio
NVS	Novel View Synthesis
NeRF	Neural Radiance Fields
MLP	Multilayer Perceptron
3DGS	3D Gaussian Splatting
SSIM	Structural similarity index measure
LPIPS	Learned perceptual image patch similarity

References

1. Bahirat, K.; Lai, C.; Mcmahan, R.P.; Prabhakaran, B. Designing and evaluating a mesh simplification algorithm for virtual reality. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2018, 14, 1–26.
2. Padhy, R.P.; Sa, P.K.; Narducci, F.; Bisogni, C.; Bakshi, S. Monocular vision-aided depth measurement from RGB images for autonomous UAV navigation. ACM Trans. Multimed. Comput. Commun. Appl. 2023, 20, 1–22.
3. Zhang, S.; Wang, Y.; Liu, P.; Wang, Y.; Huang, L.; Wang, M.; Atadjanov, I. Capsule endoscopy image enhancement for small intestinal villi clarity. Mathematics 2024, 12, 3317.
4. Yu, Z.; Chen, A.; Huang, B.; Sattler, T.; Geiger, A. Mip-splatting: Alias-free 3D Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–21 June 2024; pp. 19447–19456.
5. Liu, B.; Lei, J.; Peng, B.; Yu, C.; Li, W.; Ling, N. Novel view synthesis from a single unposed image via unsupervised learning. ACM Trans. Multimed. Comput. Commun. Appl. 2023, 19, 1–23.
6. Mildenhall, B.; Srinivasan, P.P.; Tancik, M.; Barron, J.T.; Ramamoorthi, R.; Ng, R. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 2021, 65, 99–106.
7. Kerbl, B.; Kopanas, G.; Leimkühler, T.; Drettakis, G. 3D Gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 2023, 42, 1–14.
8. Chen, A.; Xu, Z.; Geiger, A.; Yu, J.; Su, H. Tensorf: Tensorial radiance fields. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 333–350.
9. Fridovich-Keil, S.; Yu, A.; Tancik, M.; Chen, Q.; Recht, B.; Kanazawa, A. Plenoxels: Radiance fields without neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–21 June 2022; pp. 5501–5510.
10. Liu, L.; Gu, J.; Zaw Lin, K.; Chua, T.S.; Theobalt, C. Neural sparse voxel fields. Adv. Neural Inf. Process. Syst. 2020, 33, 15651–15663.
11. Müller, T.; Evans, A.; Schied, C.; Keller, A. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph. 2022, 41, 1–15.
12. Yan, Z.; Low, W.F.; Chen, Y.; Lee, G.H. Multi-scale 3D Gaussian splatting for anti-aliased rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 20923–20931.
13. Huang, B.; Yu, Z.; Chen, A.; Geiger, A.; Gao, S. 2D Gaussian splatting for geometrically accurate radiance fields. In Proceedings of the ACM SIGGRAPH 2024 Conference Papers, Denver, CO, USA, 28 July–1 August 2024; pp. 1–11.
14. Huang, L.; Guo, J.; Dan, J.; Fu, R.; Li, Y.; Guo, Y. Spectral-GS: Taming 3D Gaussian splatting with spectral entropy. arXiv 2024, arXiv:2409.12771.
15. Turkulainen, M.; Ren, X.; Melekhov, I.; Seiskari, O.; Rahtu, E.; Kannala, J. Dn-splatter: Depth and normal priors for Gaussian splatting and meshing. In Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 5–9 January 2025; pp. 2421–2431.
16. Gross, M.; Pfister, H. (Eds.) Point-Based Graphics; Elsevier: Amsterdam, The Netherlands, 2011.
17. Grossman, J.P.; Dally, W.J. Point sample rendering. In Proceedings of the Eurographics Workshop, Vienna, Austria, 29 June–1 July 1998; Springer: Vienna, Austria, 1998; pp. 181–192.
18. Pfister, H.; Zwicker, M.; Van Baar, J.; Gross, M. Surfels: Surface elements as rendering primitives. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 23–28 July 2000; pp. 335–342.
19. Lassner, C.; Zollhofer, M. Pulsar: Efficient sphere-based neural rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Event, 19–25 June 2021; pp. 1440–1449.
20. Prokudin, S.; Ma, Q.; Raafat, M.; Valentin, J.; Tang, S. Dynamic point fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 7964–7976.
21. Sun, G.; Wong, Y.; Kankanhalli, M.S.; Li, X.; Geng, W. Enhanced 3D shape reconstruction with knowledge graph of category concept. ACM Trans. Multimed. Comput. Commun. Appl. 2022, 18, 1–20.
22. Zwicker, M.; Pfister, H.; Van Baar, J.; Gross, M. EWA volume splatting. In Proceedings of the IEEE Visualization, San Diego, CA, USA, 21–26 October 2001; pp. 29–538.
23. Merrill, D.G.; Grimshaw, A.S. Revisiting sorting for GPGPU stream architectures. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, Vienna, Austria, 11–15 September 2010; pp. 545–546.
24. Max, N. Optical models for direct volume rendering. IEEE Trans. Vis. Comput. Graph. 1995, 1, 99–108.
25. Max, N.; Chen, M. Local and Global Illumination in the Volume Rendering Integral; Lawrence Livermore National Lab. (LLNL): Livermore, CA, USA, 2005.
26. Chen, Z.; Zhang, H. Learning implicit fields for generative shape modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 5939–5948.
27. Mescheder, L.; Oechsle, M.; Niemeyer, M.; Nowozin, S.; Geiger, A. Occupancy networks: Learning 3D reconstruction in function space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 4460–4470.
28. Park, J.J.; Florence, P.; Straub, J.; Newcombe, R.; Lovegrove, S. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 165–174.
29. Yu, H.; Long, X.; Tan, P. LM-Gaussian: Boost sparse-view 3D Gaussian splatting with large model priors. arXiv 2024, arXiv:2409.03456.
30. Zwicker, M.; Pfister, H.; Van Baar, J.; Gross, M. EWA splatting. IEEE Trans. Vis. Comput. Graph. 2002, 8, 223–238.
31. Reiser, C.; Peng, S.; Liao, Y.; Geiger, A. Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 14335–14345.
32. Kopanas, G.; Philip, J.; Leimkühler, T.; Drettakis, G. Point-based neural rendering with per-view optimization. Comput. Graph. Forum 2021, 40, 29–43.
33. Barron, J.T.; Mildenhall, B.; Tancik, M.; Hedman, P.; Martin-Brualla, R.; Srinivasan, P.P. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 5855–5864.
34. Knapitsch, A.; Park, J.; Zhou, Q.Y.; Koltun, V. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Trans. Graph. 2017, 36, 1–13.
35. Barron, J.T.; Mildenhall, B.; Verbin, D.; Srinivasan, P.P.; Hedman, P. Mip-nerf 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–21 June 2022; pp. 5470–5479.

Figure 1. Comparison of over-reconstruction and under-reconstruction phenomena. The images on the left depict the effects obtained using the fixed threshold method, with the upper-left image showing over-reconstruction and the lower-left image demonstrating under-reconstruction; the images on the right illustrate the improved effects achieved by the method proposed in this paper, where the upper-right image eliminates the problem of over-reconstruction and the lower-right image resolves the issue of under-reconstruction.

Figure 2. Comparison of blurring artifacts ((left) results with the mean absolute error; (right) improved results with the method proposed in this paper).

Figure 3. Overview of the method diagram.

Figure 4. Dynamic Gaussian elimination image comparison ((left) is the original method; (right) is the improved effect of the method proposed in this paper).

Figure 5. Comparison before and after the introduction of the Huber loss function ((left) are before the introduction; (right) are after the introduction).

Figure 6. The relationship between the value of δ and PSNR.

Figure 7. The influence of δ on the reconstruction result ((left): δ is too large or too small; (right): δ = 5, a suitable value).

Figure 8. The influence of β and α values on PSNR.

Figure 9. Comparison between this work and previous methods.

Figure 10. Comparison on the Synthetic Blender dataset.

Table 1. The PSNR value in the ablation experiment.

Method	Train-7k	Garden-7k	Kitchen-7k	Train-30k	Garden-30k	Kitchen-30k	Avg-7k	Avg-30k
3DGS	19.89	26.10	28.68	22.51	27.26	31.74	24.89	27.17
D	20.39	26.22	29.80	23.20	27.40	32.04	25.47	27.54
H, δ = 5	19.62	26.20	29.75	22.65	27.28	32.04	25.19	27.32
H, δ = 1	19.60	26.18	29.41	22.62	27.33	31.89	25.06	27.28
H, δ = 10	19.43	26.16	29.24	22.52	27.20	31.92	24.94	27.21
Full, δ = 1	20.38	26.32	30.07	23.13	27.41	32.21	25.59	27.58
Full, δ = 10	20.12	26.39	30.06	23.07	27.40	32.14	25.52	27.53
Full, δ = 5	20.22	26.44	30.18	23.30	27.52	32.38	25.61	27.73

Table 2. Quantitative evaluation of the method proposed in this paper and the previous methods.

Dataset	Mip-NeRF360				Tanks&Temples
Method	SSIM	PSNR	LPIPS	Train	SSIM	PSNR	LPIPS	Train
Plenoxels [9]	0.683	23.74	0.396	22 min 14 s	0.708	20.63	0.386	21 min 28 s
InstantNGP [11]	0.723	26.26	0.306	8 min 34 s	0.648	20.20	0.408	7 min 22 s
Mip-NeRF360 [35]	0.871^	29.41^	0.140^	43 h	0.861	24.26	0.172	43 h
3DGS [7]	0.864	29.20	0.142	33 min 32 s	0.868	24.46	0.168	26 min 41 s
MS-3DGS [12]	0.837	28.29	0.182	32 min 21 s	0.865	24.42	0.169	26 min 12 s
Mip-Splatting [4]	0.876	29.45	0.135	33 min 56 s	0.871	25.11	0.163	27 min 14 s
LM-Gaussian [29]	0.884	29.73	0.129	67 min 12 s	0.880	25.41	0.154	61 min 08 s
Ours	0.878	29.49	0.133	34 min 44 s	0.873	25.25	0.161	27 min 24 s

The results marked with ^ in the table are directly from the original paper, and all other results are from our own experiments in this paper.

Table 3. Comparison of PSNR values on the Mip-NeRF360 dataset.

Method	Bicycle	Garden	Counter	Room	Stump	Bonsai	Kitchen
3DGS [7]	25.18	27.22	29.24	31.94	26.56	32.56	31.74
MS-3DGS [12]	25.00	27.04	26.39	28.98	26.49	32.22	31.91
Mip-Splatting [4]	25.73	27.88	29.34	31.83	27.16	32.41	31.80
Ours	25.29	27.52	29.45	32.21	26.75	32.83	32.38

Table 4. PSNR score of the Synthetic Blender dataset.

Method	Chair	Drums	Ficus	Hotdog	Lego	Materials	Mic	Ship	Avg
Plenoxels [9]	33.68	25.42	31.54	36.22	33.91	29.07	33.26	29.43	31.56
InstantNGP [11]	35.40	25.80	34.00	37.10	35.60	29.40	35.90	30.30	32.93
Mip-NeRF360 [35]	35.32	25.35	33.18	37.44	35.43	30.56	36.47	30.29	33.00
3DGS [7]	35.52	26.27	35.49	38.08	36.04	30.49	36.70	31.67	33.78
MS-3DGS [12]	35.34	26.29	35.12	36.88	35.41	30.47	36.49	31.52	33.44
Mip-Splatting [4]	35.75	26.33	35.88	38.11	35.91	30.52	36.87	31.63	33.86
Ours	35.99	26.44	35.61	38.15	36.24	30.67	36.72	31.84	33.95

Table 5. SSIM score of the Synthetic Blender dataset.

Method	Chair	Drums	Ficus	Hotdog	Lego	Materials	Mic	Ship	Avg
Plenoxels [9]	0.97630	0.93212	0.97536	0.97895	0.97214	0.94564	0.98368	0.88649	0.95633
InstantNGP [11]	0.98533	0.95241	0.98492	0.98367	0.98135	0.95214	0.98829	0.89739	0.96568
Mip-NeRF360 [35]	0.98285	0.93879	0.97924	0.98418	0.98169	0.96012	0.99089	0.90047	0.96477
3DGS [7]	0.98766	0.95483	0.98698	0.98528	0.98255	0.96031	0.99252	0.90623	0.96954
MS-3DGS [12]	0.98532	0.95514	0.94841	0.98495	0.97792	0.96016	0.99137	0.90368	0.96336
Mip-Splatting [4]	0.98791	0.95543	0.98786	0.98527	0.98208	0.96044	0.99268	0.90618	0.96973
Ours	0.98825	0.96031	0.98770	0.98541	0.98298	0.96053	0.99250	0.90758	0.97079

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

