Assignment 4: Neural Surfaces

Name: Edward Li
Andrew ID: edwardli
Late Days Used: One

1. Sphere Tracing (30 pts)

I implement the sphere tracing algorithm presented in lecture. Concretely, I follow this pseudocode for each point:

while f(p) > epsilon:
    t = t + f(p)
    p = x_0 + t * d

There are a few important implementation details. First, I vectorize the code by only updating points where the mask is 0 and the point distance does not exceed self.far. I also initialize the distances to self.near to save computation and to avoid evaluating the SDF in regions that are invalid.

The mask is set to 1 when a point reaches $f(p)\leq \epsilon$, with $\epsilon = 10^{-5}$ in my implementation. Finally, the number of iterations is capped at self.max_iters.
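The loop above can be sketched as a vectorized PyTorch function. This is a minimal sketch, not the course code: the function name sphere_trace and the sdf callable are assumed, and near/far/max_iters/eps stand in for self.near, self.far, self.max_iters, and $\epsilon$.

```python
import torch

def sphere_trace(sdf, origins, directions, near=0.0, far=10.0,
                 max_iters=64, eps=1e-5):
    """Vectorized sphere tracing sketch (assumed names, not the course API).
    origins/directions: (N, 3) tensors; sdf maps (M, 3) -> (M,) distances."""
    n = origins.shape[0]
    t = torch.full((n,), near)                    # start at the near plane
    mask = torch.zeros(n, dtype=torch.bool)       # 1 = converged to surface
    for _ in range(max_iters):
        # only march points that have not converged and are still in range
        active = ~mask & (t < far)
        if not active.any():
            break
        p = origins[active] + t[active, None] * directions[active]
        d = sdf(p)
        mask[active] = d <= eps                   # reached the surface
        t[active] = t[active] + d                 # step by the safe distance
    points = origins + t[:, None] * directions
    return points, mask
```

Points whose mask stays 0 either exited past far or ran out of iterations.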

This produces the image:

part_1

Run the code for this part with python -m a4.main --config-name=torus. A folder named images should be created before running this command.

2. Optimizing a Neural SDF (30 pts)

This part has two interesting components: the model definition and the loss function.

First, the model I use is similar to the density estimation network for NeRF. I use MLPWithInputSkips for ease of implementation. I use the hyperparameters given in the VolSDF paper: an 8-layer network with 256 hidden units per layer and a skip connection at layer 4. We use 4 harmonic embedding functions. This creates a larger network than originally specified in the points.yaml file.

Importantly, we use no activation after the final layer, as we would like our SDF to predict arbitrary distance values.
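A minimal PyTorch sketch of this architecture, as a stand-in for the MLPWithInputSkips-based implementation (the class and function names here are hypothetical, and the harmonic embedding is a simplified version without the identity term):

```python
import torch
import torch.nn as nn

def harmonic_embed(x, n_freqs=4):
    """Simplified positional encoding: [sin(2^k x), cos(2^k x)], k < n_freqs."""
    freqs = 2.0 ** torch.arange(n_freqs)
    xb = x[..., None] * freqs                       # (..., 3, n_freqs)
    return torch.cat([xb.sin(), xb.cos()], dim=-1).flatten(-2)

class NeuralSDF(nn.Module):
    """Hypothetical stand-in: 8 layers, 256 hidden units, skip at layer 4,
    and crucially no activation on the final distance output."""
    def __init__(self, n_freqs=4, hidden=256, n_layers=8, skip=4):
        super().__init__()
        in_dim = 3 * 2 * n_freqs
        self.skip = skip
        layers = []
        for i in range(n_layers):
            d_in = in_dim if i == 0 else hidden
            if i == skip:
                d_in += in_dim                      # re-inject the embedding
            layers.append(nn.Linear(d_in, hidden))
        self.layers = nn.ModuleList(layers)
        self.out = nn.Linear(hidden, 1)             # raw signed distance

    def forward(self, pts):
        e = harmonic_embed(pts)
        h = e
        for i, layer in enumerate(self.layers):
            if i == self.skip:
                h = torch.cat([h, e], dim=-1)
            h = torch.relu(layer(h))
        return self.out(h).squeeze(-1)              # no final activation
```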

The remaining hyperparameters are left at their defaults: 5000 epochs, a 1e-4 initial learning rate, a batch size of 4096, and 250 pretraining iterations.

I use mean L1 loss for the point loss ($\lvert f(p)\rvert$), as well as an L1 Eikonal loss, $\lvert\,\lVert \nabla f(p)\rVert - 1\,\rvert$. I use the default loss weights: an inter_weight of 0.1 and an eikonal_weight of 0.02. Intuitively, the Eikonal loss pushes the network to predict a true SDF, not just an arbitrary implicit surface.
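The two losses can be sketched as follows; the gradient for the Eikonal term comes from autograd. The function names and the model signature (points in, distances out) are assumptions, not the course API:

```python
import torch

def point_loss(model, surface_pts):
    """Mean L1 point loss |f(p)| on points sampled from the surface."""
    return model(surface_pts).abs().mean()

def eikonal_loss(model, pts):
    """L1 Eikonal loss: encourage ||grad f(p)|| = 1 at sampled points."""
    pts = pts.clone().requires_grad_(True)
    d = model(pts)
    # create_graph=True so the loss itself remains differentiable
    (grad,) = torch.autograd.grad(d.sum(), pts, create_graph=True)
    return (grad.norm(dim=-1) - 1.0).abs().mean()
```

For a true SDF such as the distance to a unit sphere, the Eikonal loss is (numerically) zero, since the gradient is a unit vector everywhere off the center.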

This produces the image:

part_2

Run the code for this part with python -m a4.main --config-name=points

3. VolSDF (30 pts)

First, we extend the NeuralSurface class to predict per-point color. I do this similarly to the VolSDF paper, but without passing the SDF normal to the color network, to save computation time and memory.

Concretely, I extend the distance network to output both a distance and a 256-dimensional "global geometry feature $z$" as presented in the VolSDF paper (see section 3.5). I pass $z$ through a ReLU activation, then input it to the color network along with the harmonic embedded points.

The color network is another MLPWithInputSkips, outputting a 3-dimensional color vector, which is then passed through sigmoid.

I make the network as large as fits into memory. This results in a 6-layer distance network with a skip connection at layer 3 and a hidden size of 256. The color network has 2 layers with a hidden size of 128 and no skip connection, as it is so shallow.
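A compact sketch of this two-headed design (the class name is hypothetical, the trunk is shortened to 3 layers without a skip for brevity, and the 24-dimensional harmonic embedding width assumes 4 frequencies with sin/cos):

```python
import torch
import torch.nn as nn

class SurfaceColorModel(nn.Module):
    """Sketch: the distance trunk emits a signed distance plus a 256-d
    geometry feature z; the color head consumes ReLU(z) together with
    the harmonically embedded point, then applies a sigmoid."""
    def __init__(self, embed_dim=24, hidden=256, feat=256, color_hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1 + feat),            # distance + feature z
        )
        self.color = nn.Sequential(
            nn.Linear(embed_dim + feat, color_hidden), nn.ReLU(),
            nn.Linear(color_hidden, 3),             # RGB
        )

    def forward(self, embedded_pts):
        out = self.trunk(embedded_pts)
        dist, z = out[..., :1], out[..., 1:]
        rgb_in = torch.cat([embedded_pts, torch.relu(z)], dim=-1)
        rgb = torch.sigmoid(self.color(rgb_in))     # colors in (0, 1)
        return dist.squeeze(-1), rgb
```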

We also implement the sdf_to_density function following the VolSDF paper, which results in:

$$\sigma(x) = \alpha\Psi_{\beta}(-d_{\Omega}(x))$$

$$\Psi_{\beta}(s)=\begin{cases} \frac{1}{2}\exp\left(\frac{s}{\beta}\right)&\text{if }s\leq 0 \\ 1-\frac{1}{2}\exp\left(-\frac{s}{\beta}\right)&\text{if }s>0 \end{cases}$$

Here, $\alpha$ is the density value inside the volume, and $\beta$ represents how smooth the transition between the inside/outside density values is. Intuitively, higher $\alpha$ means a more opaque volume. More importantly, higher $\beta$ means the falloff from $\alpha\to 0$ is more gradual/takes longer, while lower $\beta$ means the transition becomes more like a delta function.
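The two equations above translate directly into code; $\Psi_\beta$ is the CDF of a zero-mean Laplace distribution with scale $\beta$. This is a minimal sketch of the idea (the exact course signature may differ):

```python
import torch

def sdf_to_density(sdf_vals, alpha, beta):
    """VolSDF density: sigma(x) = alpha * Psi_beta(-d(x))."""
    s = -sdf_vals                       # positive inside the surface
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),      # outside: decays toward 0
        1.0 - 0.5 * torch.exp(-s / beta),  # inside: saturates toward 1
    )
    return alpha * psi
```

At the surface ($d = 0$) the density is exactly $\alpha/2$; deep inside it approaches $\alpha$, and far outside it approaches 0, with $\beta$ controlling how fast.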

Let's answer each of the questions:

  1. How does high $\beta$ bias your learned SDF? What about low $\beta$?

    Overall, high $\beta$ tends to smooth out the learned SDF. This is because small changes in the SDF don't result in major changes in the density, as the change is so gradual. Additionally, high $\beta$ tends to bias SDFs to be slightly shrunk, as we get positive density further away from the surface, which allows us to still generate reasonable-looking images with a smaller SDF.

  2. Would an SDF be easier to train with volume rendering and low $\beta$ or high $\beta$? Why?

    SDFs would be easier to train with high $\beta$. This is due to a few reasons. First, we get radiance contributions from more sampled points with a higher $\beta$, as our density takes longer to drop off. This means we get better gradients/the image can be formed more quickly, allowing the SDF to learn more quickly as well.

    Additionally, the VolSDF paper claims that the opacity approximation error is lower with high $\beta$ (see Lemma 2). In other words, our sampling error is lower with high $\beta$, resulting in less noisy renders and thus better supervision. Unfortunately, better supervision does not always mean better overall quality, as too high a $\beta$ simply looks over-smoothed.

  3. Would you be more likely to learn an accurate surface with high $\beta$ or low $\beta$? Why?

    We are more likely to learn an accurate surface with low $\beta$ (but not too low!). High $\beta$ can result in renders that are blurry, due to density values dropping off too slowly. This means that our SDF can never learn to be very accurate, as the only source of supervision is an L2 loss on a blurry image.

To train, I use the default hyperparameters, but vary $\beta \in \{10^{-5}, 0.05, 0.3\}$. We get:

| $\beta$ | Geometry | Render |
| --- | --- | --- |
| 1e-5 | 1e-5 Geo | 1e-5 Render |
| 0.05 | 0.05 Geo | 0.05 Render |
| 0.3 | 0.3 Geo | 0.3 Render |

I found $\beta=0.05$ to work the best. In general, it seems like a good balance between too low and too high a $\beta$. Intuitively, $\beta=10^{-5}$ is too low to provide meaningful gradients, and we also run into high sampling error: since we sample uniformly, samples that are not actually very close to the surface can end up weighted highly. Because we don't have a mask loss, this fails to train entirely. On the other hand, $\beta=0.3$ is too high: while sampling error is lower, this is outweighed by the blurriness.

Run the code for this part with python -m a4.main --config-name=volsdf. Vary the beta parameter in this to reproduce the experiments shown above.

4. Neural Surface Extras

4.1. Render a Large Scene with Sphere Tracing (10 pts)

I implemented a Box Frame SDF, as well as a ComposedSDF class that takes the pointwise minimum of all component SDF values. Using these, I create a transparent box containing 19 donuts:
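The composition idea can be sketched as follows. The box-frame distance follows Inigo Quilez's well-known formula; the function names here are assumptions, not the classes in my code:

```python
import torch

def box_frame_sdf(p, b=1.0, e=0.1):
    """Box-frame SDF (Inigo Quilez's formula): a hollow wireframe cube
    with half-extent b and frame thickness e. p: (N, 3) points."""
    p = p.abs() - b
    q = (p + e).abs() - e

    def seg(a, b_, c):
        # distance to one axis-aligned bar of the frame
        m = torch.stack([a, b_, c], dim=-1)
        return m.clamp(min=0.0).norm(dim=-1) + m.max(dim=-1).values.clamp(max=0.0)

    return torch.minimum(
        torch.minimum(seg(p[..., 0], q[..., 1], q[..., 2]),
                      seg(q[..., 0], p[..., 1], q[..., 2])),
        seg(q[..., 0], q[..., 1], p[..., 2]),
    )

def composed_sdf(p, sdfs):
    """Union of shapes: the pointwise minimum over all component SDFs."""
    return torch.stack([f(p) for f in sdfs], dim=0).min(dim=0).values
```

Taking the minimum yields the exact union of the component shapes, which is why a single sphere-traced ray can hit whichever of the 19 donuts (or the frame) is closest.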

donuts

Run this code with python -m a4.main --config-name=bonus1.

4.2. Fewer Training Views (10 pts)

I tried training with only 20 views on both VolSDF and NeRF. We get these results:

| Views | VolSDF | NeRF |
| --- | --- | --- |
| 20 | 20 VolSDF | 20 NeRF |
| 100 | 100 VolSDF | 100 NeRF |

We can draw a few conclusions here. First, we find that both VolSDF and NeRF suffer a significant quality loss with only 20 training images. In particular, images appear much blurrier, which is expected, as we are training for 5 times fewer iterations (our epochs are 5 times shorter).

More interestingly, we'd like to look for differences between the low-image geometry inference of VolSDF and NeRF. Because of VolSDF's stronger geometric biases, we expect it to perform better in the 20 image case. Surprisingly, NeRF performs quite strongly even in the 20 image case. However, I think there are a few main places where we can see VolSDF perform better than NeRF.

First, the red Lego piece on top of the model is represented poorly by NeRF with only 20 images, while it is much more distinct with VolSDF. This is likely because VolSDF requires a surface to be present in order to predict color. This extra constraint results in better geometry than the low-density attempt by NeRF.

Additionally, we see that the NeRF model has some artifacting in between the black treads. However, VolSDF predicts the yellow color much more reliably, because it is forced to commit to a specific surface.

Train this by uncommenting lines 126/127 in dataset.py, then running python -m a4.main --config-name=volsdf.