3DGS Source Code Walkthrough: Adaptive Gaussian Density Control
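Adaptive density control is one of the core optimization mechanisms of 3DGS. It dynamically adjusts the distribution and number of Gaussians in the scene to balance reconstruction accuracy against computational cost. The overall flow is shown in the figure below:
Image source: 3D Gaussian Splatting for Real-Time Radiance Field Rendering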
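1. Why adaptive Gaussian density control is needed
Adaptive density control is triggered when the positional (xyz) gradient of some Gaussians is too large. Why do these high-gradient Gaussians need special treatment?
The positional gradient measures how strongly a Gaussian's position affects the loss (the error between the rendered image and the ground-truth image); the paper uses the view-space positional gradient for this. A large gradient suggests that the Gaussian's position deviates from the true scene geometry, for two main reasons:
- a large gradient can come from pixel-level differences between the rendered and ground-truth images (for example mismatched color or brightness);
- a large gradient can also reflect a high structural similarity (SSIM) loss, meaning that local structure (edges, texture) has not been reconstructed correctly.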
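For reference, the training loss that produces these gradients is, in the paper, a weighted sum of an L1 term and a D-SSIM term, which matches the two sources of large gradients listed above. The paper reports $\lambda = 0.2$:

```latex
\mathcal{L} = (1 - \lambda)\,\mathcal{L}_1 + \lambda\,\mathcal{L}_{\text{D-SSIM}}
```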
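The paper groups these high-gradient Gaussians into two cases: under-reconstruction and over-reconstruction.
- Under-reconstruction: the Gaussian ellipsoid is too small to cover the target region effectively;
- Over-reconstruction: the Gaussian ellipsoid is too large and covers more than the target region.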
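2. Strategies for adaptive Gaussian density control
The paper adjusts the Gaussians dynamically with three strategies:
- Split: when the gradient is large and the Gaussian is too large, one large Gaussian is split into several smaller ones so that complex geometry can be covered;
- Clone: when the gradient is large but the Gaussian is small, the Gaussian is cloned to fill the under-reconstructed region;
- Prune: Gaussians whose opacity falls below a threshold, or that grow too large in world space or view space, are removed to avoid wasting resources. Unlike split and clone, pruning does not depend on the gradient.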
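As a rough illustration, the selection rule behind the three strategies can be sketched for a single Gaussian as follows. This is a simplified sketch, not the repository's code: `classify_gaussian` and its return labels are hypothetical, the real implementation works on whole tensors with boolean masks and runs clone, split and prune as separate passes, and checking opacity first is an ordering chosen here only for readability. The default values are the ones quoted in this article (0.0002, 0.01 and 0.005).

```python
def classify_gaussian(grad_norm, max_scale, opacity, scene_extent,
                      grad_threshold=0.0002, percent_dense=0.01, min_opacity=0.005):
    """Return the densification action for one Gaussian (illustrative only)."""
    # Nearly transparent Gaussians are removed regardless of their gradient
    if opacity < min_opacity:
        return "prune"
    # A large positional gradient means the region is poorly reconstructed
    if grad_norm >= grad_threshold:
        # Large Gaussian: over-reconstruction, so split it
        if max_scale > percent_dense * scene_extent:
            return "split"
        # Small Gaussian: under-reconstruction, so clone it
        return "clone"
    return "keep"


print(classify_gaussian(grad_norm=0.001, max_scale=0.5, opacity=0.9, scene_extent=10.0))
print(classify_gaussian(grad_norm=0.001, max_scale=0.05, opacity=0.9, scene_extent=10.0))
```

With `scene_extent=10.0` the size threshold is `0.01 * 10.0 = 0.1`, so the first call falls on the split side and the second on the clone side.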
The three strategies are analyzed below alongside the code:
2.1. Split
The paper describes splitting as follows:
On the other hand, large Gaussians in regions with high variance need to be split into smaller Gaussians. We replace such Gaussians by two new ones, and divide their scale by a factor of 𝜙 = 1.6 which we determined experimentally. We also initialize their position by using the original 3D Gaussian as a PDF for sampling.
The implementation is shown below:
```python
# Method of the Gaussian model class (excerpt); requires `import torch`
def densify_and_split(self, grads, grad_threshold, scene_extent, N=2):
    # N=2 means each selected Gaussian is split into 2 new Gaussians
    n_init_points = self.get_xyz.shape[0]
    # Extract points that satisfy the gradient condition.
    # Zero-pad grads so that it lines up with the current number of Gaussians, n_init_points
    padded_grad = torch.zeros((n_init_points), device="cuda")
    padded_grad[:grads.shape[0]] = grads.squeeze()
    # Select Gaussians with a large gradient AND a large size
    # (scaling is the size of the Gaussian); percent_dense defaults to 0.01
    selected_pts_mask = torch.where(padded_grad >= grad_threshold, True, False)
    selected_pts_mask = torch.logical_and(
        selected_pts_mask,
        torch.max(self.get_scaling, dim=1).values > self.percent_dense * scene_extent
    )

    # Sample from a normal distribution with mean 0 and standard deviation stds;
    # the samples are the position offsets of the new Gaussians. This corresponds to
    # "We also initialize their position by using the original 3D Gaussian as a PDF for sampling."
    stds = self.get_scaling[selected_pts_mask].repeat(N, 1)
    means = torch.zeros((stds.size(0), 3), device="cuda")
    samples = torch.normal(mean=means, std=stds)
    rots = build_rotation(self._rotation[selected_pts_mask]).repeat(N, 1, 1)
    new_xyz = torch.bmm(rots, samples.unsqueeze(-1)).squeeze(-1) + self.get_xyz[selected_pts_mask].repeat(N, 1)
    # Divide the original scale by 0.8 * N (1.6 for N=2) so the new Gaussians are smaller. This corresponds to
    # "divide their scale by a factor of 𝜙 = 1.6 which we determined experimentally"
    new_scaling = self.scaling_inverse_activation(self.get_scaling[selected_pts_mask].repeat(N, 1) / (0.8 * N))
    new_rotation = self._rotation[selected_pts_mask].repeat(N, 1)
    new_features_dc = self._features_dc[selected_pts_mask].repeat(N, 1, 1)
    new_features_rest = self._features_rest[selected_pts_mask].repeat(N, 1, 1)
    new_opacity = self._opacity[selected_pts_mask].repeat(N, 1)
    new_tmp_radii = self.tmp_radii[selected_pts_mask].repeat(N)

    self.densification_postfix(
        new_xyz, new_features_dc, new_features_rest, new_opacity, new_scaling, new_rotation, new_tmp_radii
    )

    # Remove the Gaussians that were split (the new ones take over their role).
    # selected_pts_mask marks the originals; the N * sum(selected_pts_mask) new Gaussians get mask value 0
    prune_filter = torch.cat(
        (selected_pts_mask, torch.zeros(N * selected_pts_mask.sum(), device="cuda", dtype=bool))
    )
    # prune_points inverts the mask (~prune_filter): True in prune_filter means "delete",
    # so after inversion True means "keep" and False means "delete"
    self.prune_points(prune_filter)
```
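To make the two key steps of the split concrete (sampling new positions from the parent Gaussian and shrinking the scale by 1.6), here is a minimal pure-Python sketch. It is an illustration under simplifying assumptions, not the repository's code: `split_gaussian` is a hypothetical helper, the Gaussian is axis-aligned (the rotation applied through `build_rotation` is left out), and the fixed seed only makes the run repeatable.

```python
import random


def split_gaussian(mean, scale, n=2, seed=0):
    """Split one axis-aligned 3D Gaussian into n smaller ones (rotation omitted)."""
    rng = random.Random(seed)
    children = []
    for _ in range(n):
        # Sample the child position from the parent Gaussian, used as a PDF
        child_mean = [rng.gauss(m, s) for m, s in zip(mean, scale)]
        # Shrink every axis by 0.8 * n, which equals 1.6 for n = 2
        child_scale = [s / (0.8 * n) for s in scale]
        children.append((child_mean, child_scale))
    return children


children = split_gaussian(mean=[0.0, 0.0, 0.0], scale=[1.6, 0.8, 0.4])
for child_mean, child_scale in children:
    print(child_mean, child_scale)
```

Each child ends up with per-axis scales of 1.0, 0.5 and 0.25, that is, the parent's scales divided by 1.6, while the child positions scatter around the parent's mean.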
2.2. Clone
The paper describes cloning as follows:
For small Gaussians that are in under-reconstructed regions, we need to cover the new geometry that must be created. For this, it is preferable to clone the Gaussians, by simply creating a copy of the same size, and moving it in the direction of the positional gradient.
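Gaussians with a large gradient and a small size are simply cloned: position, color, scale, rotation and the other attributes are copied as they are. The implementation is shown below: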
```python
# Method of the Gaussian model class (excerpt); requires `import torch`
def densify_and_clone(self, grads, grad_threshold, scene_extent):
    """
    :param grads: accumulated positional-gradient magnitude of each Gaussian, shape (N, 1),
                  where N is the number of Gaussians (densify_and_split squeezes the same
                  tensor into a length-N vector)
    :param grad_threshold: gradient threshold that decides whether to clone; defaults to 0.0002
    :param scene_extent: radius of the scene's bounding sphere, i.e. a sphere that contains
                         the world-space centers of all cameras
    :return:
    """
    # Compute the L2 norm of each Gaussian's gradient and select those at or above
    # the threshold; grad_threshold defaults to 0.0002
    selected_pts_mask = torch.where(torch.norm(grads, dim=-1) >= grad_threshold, True, False)
    # Select Gaussians with a large gradient AND a small size
    # (scaling is the size of the Gaussian); percent_dense defaults to 0.01
    selected_pts_mask = torch.logical_and(
        selected_pts_mask,
        torch.max(self.get_scaling, dim=1).values <= self.percent_dense * scene_extent
    )

    # Clone: the position (xyz) and all other parameters (color, opacity, scale,
    # rotation, radius, ...) are copied unchanged
    new_xyz = self._xyz[selected_pts_mask]
    new_features_dc = self._features_dc[selected_pts_mask]
    new_features_rest = self._features_rest[selected_pts_mask]
    new_opacities = self._opacity[selected_pts_mask]
    new_scaling = self._scaling[selected_pts_mask]
    new_rotation = self._rotation[selected_pts_mask]
    new_tmp_radii = self.tmp_radii[selected_pts_mask]

    # Append the cloned Gaussians to the existing model
    self.densification_postfix(
        new_xyz, new_features_dc, new_features_rest, new_opacities, new_scaling, new_rotation, new_tmp_radii
    )
```
The paper says the clone is moved along the positional gradient ("moving it in the direction of the positional gradient"), yet the code only copies the position and never moves it. Why?
Related issues:
1) operations on small Gaussians #217
2) Questions Regarding Experimental Settings #82
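The explanation in these two issues boils down to this: the cloned point has a zero gradient, while the original point keeps the gradient computed in the last step. In the next optimizer step, one of the two points therefore moves in the direction of the gradient, so the two copies separate even though no explicit move is coded.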
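The effect described in the issues can be shown with a toy update. This is only a sketch of the argument: plain gradient descent stands in for the real optimizer, and `sgd_step`, the positions, the gradient values and the learning rate are all made up for illustration.

```python
def sgd_step(position, grad, lr=0.1):
    """One plain gradient-descent update (a stand-in for the real optimizer)."""
    return [p - lr * g for p, g in zip(position, grad)]


# Right after cloning, both copies sit at exactly the same position
original = [1.0, 2.0, 3.0]
clone = list(original)

# The original keeps the gradient from the last backward pass; the clone has none
original_grad = [0.5, 0.0, -0.5]   # hypothetical values
clone_grad = [0.0, 0.0, 0.0]

original_next = sgd_step(original, original_grad)
clone_next = sgd_step(clone, clone_grad)

print(original_next)  # has moved along the negative gradient
print(clone_next)     # still at the old position
```

After one step the two copies no longer coincide, which is the outcome the paper describes, achieved without an explicit move in `densify_and_clone`.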
2.3. Prune
The paper describes pruning as follows:
The optimization then increases the 𝛼 for the Gaussians where this is needed while allowing our culling approach to remove Gaussians with 𝛼 less than 𝜖_𝛼 as described above. Gaussians may shrink or grow and considerably overlap with others, but we periodically remove Gaussians that are very large in worldspace and those that have a big footprint in viewspace.
```python
# Remove Gaussians whose opacity is below min_opacity (0.005), as well as Gaussians
# whose view-space radius or world-space size is too large
prune_mask = (self.get_opacity < min_opacity).squeeze()
if max_screen_size:
    big_points_vs = self.max_radii2D > max_screen_size
    big_points_ws = self.get_scaling.max(dim=1).values > 0.1 * extent
    prune_mask = torch.logical_or(torch.logical_or(prune_mask, big_points_vs), big_points_ws)
self.prune_points(prune_mask)
```
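The pruning rule can be traced on a handful of Gaussians with the following pure-Python sketch. It mirrors the tensor logic above element by element; `prune_mask` as a standalone function, the sample values and `max_screen_size=20` are assumptions made for the example.

```python
def prune_mask(opacities, max_radii2d, max_scales, extent,
               min_opacity=0.005, max_screen_size=None):
    """Return one boolean per Gaussian: True means the Gaussian is removed."""
    mask = []
    for opacity, radius, scale in zip(opacities, max_radii2d, max_scales):
        # Nearly transparent Gaussians are always removed
        remove = opacity < min_opacity
        if max_screen_size:
            # Also remove Gaussians that are too large in view space or world space
            remove = remove or radius > max_screen_size or scale > 0.1 * extent
        mask.append(remove)
    return mask


mask = prune_mask(
    opacities=[0.9, 0.001, 0.8, 0.7],
    max_radii2d=[5.0, 5.0, 50.0, 5.0],
    max_scales=[0.1, 0.1, 0.1, 2.0],
    extent=10.0,
    max_screen_size=20,
)
print(mask)  # [False, True, True, True]
```

The second Gaussian is removed for low opacity, the third for its view-space radius, and the fourth for its world-space size (2.0 > 0.1 * 10.0). When `max_screen_size` is not set, only the opacity test applies.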
3. Periodically resetting the opacity α
Similar to other volumetric representations, our optimization can get stuck with floaters close to the input cameras; in our case this may result in an unjustified increase in the Gaussian density. An effective way to moderate the increase in the number of Gaussians is to set the 𝛼 value close to zero every 𝑁 = 3000 iterations.
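During optimization, the opacity α of every Gaussian is reset to a value close to 0 every 3000 iterations (the code clamps it to at most 0.01). Opacity lies in [0, 1]: 0 means fully transparent and 1 means fully opaque.
The implementation is shown below: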
```python
# Training loop: opt.opacity_reset_interval defaults to 3000
if iteration % opt.opacity_reset_interval == 0 or (dataset.white_background and iteration == opt.densify_from_iter):
    gaussians.reset_opacity()

# Method of the Gaussian model object (`gaussians`)
def reset_opacity(self):
    # After the reset, every opacity is <= 0.01
    opacities_new = self.inverse_opacity_activation(
        torch.min(self.get_opacity, torch.ones_like(self.get_opacity) * 0.01)
    )
    optimizable_tensors = self.replace_tensor_to_optimizer(opacities_new, "opacity")
    self._opacity = optimizable_tensors["opacity"]
```
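Periodically resetting α suppresses floaters (for example those close to the input cameras) and pushes the optimization to redistribute the Gaussians.
In 3DGS optimization, the floater problem shows up mainly as:
- "Floaters": Gaussians cluster near the input cameras and form spurious geometry (floating blobs). Gaussians close to a camera are more sensitive to reconstruction error, so they trigger split and clone operations easily;
- Abnormal growth in Gaussian density: because the optimization keeps splitting or cloning Gaussians, the number of Gaussians grows without bound, which hurts performance and stability.
Resetting the opacity of all Gaussians to near 0 every 3000 iterations makes each Gaussian earn its opacity again: the optimization raises α where it is needed, and Gaussians that contribute little end up below the opacity threshold and are pruned.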
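The reset itself is a clamp in activated space followed by a mapping back to the raw parameter. The sketch below reproduces that on plain floats. It assumes the opacity activation is a sigmoid (the excerpt only shows `inverse_opacity_activation`, so this is an assumption), and `sigmoid`, `inverse_sigmoid` and this standalone `reset_opacity` are illustrative helpers, not the repository's functions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inverse_sigmoid(p):
    return math.log(p / (1.0 - p))


def reset_opacity(raw_opacities, cap=0.01):
    """Clamp each activated opacity to at most `cap`, then map back to raw space."""
    return [inverse_sigmoid(min(sigmoid(raw), cap)) for raw in raw_opacities]


raw = [4.0, 0.0, -6.0]   # activated opacities of roughly 0.98, 0.5 and 0.0025
reset = [sigmoid(r) for r in reset_opacity(raw)]
print(reset)
```

The first two Gaussians drop to 0.01, while the third, already below the cap, is left unchanged. Note that 0.01 is still above the pruning threshold `min_opacity` of 0.005 quoted earlier, so with these values a reset Gaussian is not removed immediately; it is pruned only if the subsequent optimization drives its opacity below the threshold.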