16-889 Assignment 3: Neural Surfaces

1. Sphere Tracing (30pts)

For sphere tracing, I first define two masks, mask_hit and mask_gone. mask_hit keeps track of all rays that have had an intersection, while mask_gone marks all rays that are already out of bounds (distance > self.far). Then, I use a while loop to march the rays. At each iteration, I march the rays that are neither hit nor gone forward, starting from the ray origins, by the amount $f(p) \cdot d$, where $f$ is the signed distance function and $d$ is the ray direction. After marching, I update the two masks based on the conditions $f(p) < \epsilon$ and $\text{norm}(p) >$ self.far. The ending condition for the while loop is that the sum of mask_hit and mask_gone equals the number of rays (every ray in the batch is accounted for, having either hit a surface or gone out of bounds).
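For reference, below is a minimal sketch of this loop (assuming an implicit_fn that maps (N, 3) points to signed distances; the iteration cap is a safety net added here for illustration, not part of the description above):

import torch

def sphere_trace(origins, directions, implicit_fn, far=10.0, eps=1e-5, max_iters=100):
    points = origins.clone()
    n = origins.shape[0]
    mask_hit = torch.zeros(n, dtype=torch.bool, device=origins.device)
    mask_gone = torch.zeros(n, dtype=torch.bool, device=origins.device)
    iters = 0
    while (mask_hit | mask_gone).sum() < n and iters < max_iters:
        active = ~(mask_hit | mask_gone)
        idx = active.nonzero(as_tuple=True)[0]
        # March each active ray along its direction by the signed distance f(p).
        f = implicit_fn(points[idx]).view(-1)
        points[idx] = points[idx] + f.unsqueeze(-1) * directions[idx]
        # Re-evaluate the SDF at the new points and update the masks.
        f_new = implicit_fn(points[idx]).view(-1)
        mask_hit[idx[f_new < eps]] = True
        mask_gone[idx[points[idx].norm(dim=-1) > far]] = True
        iters += 1
    return points, mask_hit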

2. Optimizing a Neural SDF (30pts)

My MLP is the MLPWithInputSkips architecture with 6 layers in total and ReLU activations for the distance branch. Each layer has 128 hidden units, with a skip connection at layer 3. I use HarmonicEmbedding to project the input into a space that better captures high-frequency information. The loss is an L1 loss on the distance plus the eikonal loss, implemented as:

((gradients.norm(dim=1) - 1)**2).mean()

to make sure that the norm of the gradient is equal to 1. The learning rate is set to 0.0001, lr_scheduler_gamma to 0.8, the random seed to 0, and the model is trained for 5000 epochs.
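The gradients here are taken with respect to the input points. Below is a minimal sketch (assuming an implicit_fn that returns per-point distances; not necessarily my exact code) of how they can be obtained with autograd:

import torch

def eikonal_loss(implicit_fn, points):
    # Require gradients on the query points so we can differentiate f w.r.t. them.
    points = points.clone().requires_grad_(True)
    distances = implicit_fn(points)
    gradients = torch.autograd.grad(
        outputs=distances,
        inputs=points,
        grad_outputs=torch.ones_like(distances),
        create_graph=True,  # keep the graph so the loss can backpropagate into the MLP
    )[0]
    # Penalize deviation of the gradient norm from 1.
    return ((gradients.norm(dim=1) - 1) ** 2).mean()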

3. VolSDF (30 pts)

Experiments

alpha: 10 beta: 0.05 (default, best geometry quality)

alpha: 10 beta: 0.5

alpha: 10 beta: 0.005

alpha: 100 beta: 0.05 (best rendering quality)

alpha: 1 beta: 0.05

For this part, I vary the parameters beta and alpha. For rendering quality, the combination beta = 0.05 and alpha = 100 works best; for geometry quality, beta = 0.05 and alpha = 10 works best. This tradeoff can be attributed to the fact that we are learning an SDF: when alpha becomes too big, the rendering quality increases, but the learned geometry is not as well-defined as with smaller alphas.


In your write-up, give an intuitive explanation of what the parameters alpha and beta do:

💡
beta controls the sensitivity of the density with respect to the distance function (i.e., the higher beta is, the more smoothly the density changes with respect to the distance). alpha controls the scale of the output density. As beta approaches zero, the function becomes a scaled indicator function, and the alpha multiplier defines the density inside the object.
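Concretely, VolSDF converts the signed distance to density by scaling the CDF of a zero-mean Laplace distribution with scale beta, evaluated at the negated distance. A minimal sketch of this conversion (illustrative, not necessarily my exact code):

import torch

def sdf_to_density(signed_distance, alpha=10.0, beta=0.05):
    # sigma(x) = alpha * Psi_beta(-d(x)), where Psi_beta is the Laplace CDF.
    s = -signed_distance
    laplace_cdf = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * laplace_cdf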

1. How does high beta bias your learned SDF? What about low beta?

💡
Higher beta biases the SDF toward a smoother boundary, making the learned SDF smoother and less sensitive to the distance input; the learned surface will be blurrier. Lower beta makes the SDF boundary sharper, and as beta approaches 0 the density converges to a scaled indicator function of the object's interior.
2. Would an SDF be easier to train with volume rendering and low beta or high beta? Why?
💡
The SDF is easier to train with volume rendering with higher beta. However, from the experiment results, we can see that a higher beta results in blurrier images and geometry. Since higher beta means the density is less sensitive to the SDF, more points around the surface have non-zero density, resulting in a smoother and more stable optimization problem. When beta is too big (e.g., 0.5), the geometry becomes too coarse. I also noticed that higher values of alpha result in sharper image renderings, though the learned geometry is of lower quality.
3. Would you be more likely to learn an accurate surface with high beta or low beta? Why?
💡
We are more likely to learn an accurate surface with lower beta, since the lower beta is, the sharper the surface becomes. However, too low a beta might result in a much harder optimization problem where the network may not converge.

4. Neural Surface Extras (CHOOSE ONE! More than one is extra credit)

4.1. Render a Large Scene with Sphere Tracing (10 pts)

I defined a scene with a unit sphere in the middle and toruses, boxes, and spheres scattered around it, containing 24 primitives in total.
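The scene SDF itself is just the union of the primitive SDFs, which can be taken as a pointwise minimum over their distances. A minimal sketch with a few analytic primitives (the centers and sizes here are illustrative, not the exact layout of my 24-primitive scene):

import torch

def sphere_sdf(p, center, radius):
    return (p - center).norm(dim=-1) - radius

def box_sdf(p, center, half_sides):
    q = (p - center).abs() - half_sides
    return q.clamp(min=0.0).norm(dim=-1) + q.max(dim=-1).values.clamp(max=0.0)

def scene_sdf(p):
    # Union of primitives = pointwise minimum of their signed distances.
    d = sphere_sdf(p, torch.tensor([0.0, 0.0, 0.0]), 1.0)
    d = torch.minimum(d, box_sdf(p, torch.tensor([1.5, 0.0, 0.0]), torch.tensor([0.3, 0.3, 0.3])))
    d = torch.minimum(d, sphere_sdf(p, torch.tensor([-1.5, 0.0, 0.0]), 0.3))
    return d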

4.2 Fewer Training Views (10 pts)

I selected 20 and 10 frame indices for training both NeRF and the Neural SDF (a small subsampling sketch follows the index lists):

[81, 19, 46, 8, 75, 42, 12, 73, 79, 2, 72, 76, 20, 10, 28, 38, 54, 39, 65, 25] # 20 views
[81, 19, 46, 8, 75, 42, 12, 73, 79, 2] # 10 views
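A minimal sketch of the subsampling (the full_dataset list of per-view entries is an assumption about how the data is stored, not the actual loader code):

view_indices_20 = [81, 19, 46, 8, 75, 42, 12, 73, 79, 2, 72, 76, 20, 10, 28, 38, 54, 39, 65, 25]
view_indices_10 = view_indices_20[:10]  # the 10-view set is a prefix of the 20-view set

def subsample(full_dataset, indices):
    # Keep only the selected views (images + cameras) for training.
    return [full_dataset[i] for i in indices]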

Neural SDF (20 views):

NeRF (20 views):

Neural SDF (10 views):

NeRF (10 views):

We can see that both NeRF and the Neural SDF were able to learn a reasonable rendering of the bulldozer with only 20 or 10 views. We can also observe a visible drop in quality from 20 to 10 views in both methods. With only 10 views in particular, the Neural SDF produces fewer artifacts when rendered into an image (e.g., NeRF shows more artifacts at the back of the bulldozer).

4.3 Alternate SDF to Density Conversions (10 pts)

I implemented the “naive” solution from the NeuS paper.

The density is defined as the S-density where:

$\phi_{s}(x) = s e^{-s x} / \left(1 + e^{-s x}\right)^{2}$

$x$ is the distance from the SDF network, and $s$ is a scale parameter controlling the sharpness of the density.
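A minimal sketch of this S-density conversion, written with the sigmoid for numerical stability (mathematically equal to the formula above; not necessarily my exact code):

import torch

def s_density(signed_distance, s=50.0):
    # phi_s(x) = s * exp(-s x) / (1 + exp(-s x))^2 = s * sigmoid(s x) * (1 - sigmoid(s x))
    sig = torch.sigmoid(s * signed_distance)
    return s * sig * (1.0 - sig)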

I tried different values of s to find the best configuration:

s = 1:

s = 10:

s = 50:

s = 100:

Network loss becomes NaN.

We can see that the “naive” solution from the NeuS paper yields worse results than Part 3. The network is also less stable and more sensitive to s. It is conceivable that the results could be better with more hyperparameter tuning, but the rendering with s = 50 is reasonably good.