16-889 Assignment 4
Name: Adithya Sampath
Andrew ID: adithyas
Late days used:

1. Sphere Tracing (30pts)
Visualization
You can run the code for part 1 with:
# mkdir images (uncomment when running for the first time)
python -m a4.main --config-name=torus
By default, the results will be written out to part_1.gif in the images folder.
Results
Feature | My Output |
---|---|
Sphere Tracing | ![]() |
Implementation
We are given ray origins and directions as input, so I initialise the points with the origins. Then, as described in the lecture, I march along each direction to find the point that is closest to a surface. The implicit function returns the signed distances for a set of input points. The steps are as follows:
- Define a (small enough) threshold epsilon.
- Define initial points as the origins
- Iterate as long as both (a) any of the distance outputs from the implicit function are > epsilon and (b) num_iters < max iterations
- As described in the slides, update t <- t + implicit_fn(points)
- Update points <- origins + t * directions
- num_iters++
- Compute the mask by (a) evaluating implicit_fn on the points obtained from the loop above and (b) checking which of these distances are less than epsilon.
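The steps above can be sketched as follows. This is a minimal illustration, not the assignment code: it uses NumPy arrays in place of PyTorch tensors, and `sphere_sdf`, `sphere_trace`, and the default values of `eps` and `max_iters` are assumptions chosen for the example.

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Hypothetical implicit function: signed distance to a sphere at the origin."""
    return np.linalg.norm(points, axis=-1, keepdims=True) - radius

def sphere_trace(implicit_fn, origins, directions, eps=1e-5, max_iters=64):
    """March each ray by the signed distance until it is within eps of a surface."""
    t = np.zeros((origins.shape[0], 1))      # distance travelled along each ray
    points = origins.copy()                  # initial points are the origins
    for _ in range(max_iters):
        dist = implicit_fn(points)
        if np.all(dist <= eps):              # every ray has reached a surface
            break
        t = t + dist                         # t <- t + implicit_fn(points)
        points = origins + t * directions    # points <- origins + t * directions
    mask = implicit_fn(points) < eps         # True where the ray hit a surface
    return points, mask

# One ray that hits the unit sphere and one that misses it.
origins = np.array([[0.0, 0.0, -3.0], [0.0, 0.0, -3.0]])
directions = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
points, mask = sphere_trace(sphere_sdf, origins, directions)
```

Rays that never come within `eps` of a surface simply run for `max_iters` steps and end up with a `False` mask entry.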
2. Optimizing a Neural SDF (30pts)
Visualization
You can run the code for part 2 with:
python -m a4.main --config-name=points
By default, the results will be written out to part_2_input.gif and part_2.gif in the images folder.
Results
Parameters | My Output |
---|---|
n_layers_distance=6 n_hidden_neurons_distance=128 append_distance=[3] | ![]() |
n_layers_distance=6 n_hidden_neurons_distance=192 append_distance=[3] | ![]() |
n_layers_distance=6 n_hidden_neurons_distance=256 append_distance=[3] | ![]() |
n_layers_distance=8 n_hidden_neurons_distance=128 append_distance=[4] | ![]() |
n_layers_distance=8 n_hidden_neurons_distance=192 append_distance=[4] | ![]() |
n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() |
Implementation
MLP
For the input, I use the HarmonicEmbedding class to get a higher-frequency positional encoding of the points. For the MLP that predicts the signed distances, I used an architecture very similar to the NeRF one from the previous assignment, built with the provided MLPWithInputSkips. I experimented with 6 layers (with an input skip at layer 3) and 8 layers (with an input skip at layer 4), and with 128, 192, and 256 neurons in the Linear layers. After the MLP with skips, I have 2 Linear layers to get the distance output. Unlike NeRF, we don't apply an activation to the final Linear layer output, since distances can be negative (unlike density). Empirically, I observed that increasing the number of layers and/or the number of neurons in each Linear layer improved results.
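To illustrate what a harmonic positional encoding does to the input points, here is a minimal NumPy sketch. It shows the general idea only: the function name `harmonic_embedding`, the power-of-two frequencies, and the output layout are assumptions for this example and are not claimed to match the HarmonicEmbedding class used in the assignment.

```python
import numpy as np

def harmonic_embedding(x, n_harmonic_functions=4, include_input=True):
    """Map each coordinate to sin/cos features at several frequencies.

    x has shape (..., D); the result has shape (..., 2 * D * L [+ D]),
    where L = n_harmonic_functions.
    """
    freqs = 2.0 ** np.arange(n_harmonic_functions)               # (L,) assumed frequencies
    scaled = (x[..., None] * freqs).reshape(*x.shape[:-1], -1)   # (..., D * L)
    parts = [np.sin(scaled), np.cos(scaled)]
    if include_input:
        parts.append(x)                                          # keep the raw coordinates too
    return np.concatenate(parts, axis=-1)

points = np.random.rand(5, 3)
embedded = harmonic_embedding(points)   # shape (5, 27) for D=3, L=4
```

The embedded points, rather than the raw coordinates, are what the MLP with input skips would consume.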
Eikonal Loss
The Eikonal constraint encourages the gradients of the signed distance with respect to the input points to have unit 2-norm. I minimise it as an L1-style loss: (a) I first take the absolute value of (2-norm of gradients - 1), and (b) then return the mean of that quantity over all points as the loss.
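The loss described above can be written in a few lines. This is a NumPy sketch under the assumption that the gradients of the SDF with respect to the points have already been computed (in the assignment they would come from autograd); the name `eikonal_loss` is used here for illustration.

```python
import numpy as np

def eikonal_loss(gradients):
    """Mean absolute deviation of the gradient norm from 1.

    gradients has shape (N, 3): one SDF gradient per sampled point.
    """
    norms = np.linalg.norm(gradients, axis=-1)   # 2-norm of each gradient
    return np.mean(np.abs(norms - 1.0))          # L1 penalty, averaged over points

# Unit-norm gradients incur no penalty; a gradient of norm 3 contributes |3 - 1| = 2.
grads = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 3.0]])
loss = eikonal_loss(grads)   # mean of (0, 2) = 1.0
```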
3. VolSDF (30 pts)
Visualization
You can run the code for part 3 with:
python -m a4.main --config-name=volsdf
By default, the results will be written out to part_3_geometry.gif and part_3.gif in the images folder.
Results
Best results
Parameters | Output RGB | Output Geometry |
---|---|---|
beta=0.05 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
Experiments with changing Beta and keeping Alpha=10:
Parameters | Output RGB | Output Geometry |
---|---|---|
beta=0.0001 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.0005 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.001 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.005 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.01 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.05 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.1 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.5 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=1.0 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=2.0 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=5.0 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
Experiments with changing Alpha and keeping Beta=0.05:
Parameters | Output RGB | Output Geometry |
---|---|---|
beta=0.05 alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.05 alpha=50 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
beta=0.05 alpha=100 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
Comment on the settings you chose, and why they seem to work well.
Answer: Using positional encodings on the input points and 8 Linear layers with 256 neurons each in the MLP works well, as shown in the previous question. For alpha and beta, the experiments above show empirically that alpha=10 and beta=0.05 give good results. This can be explained with the VolSDF SDF-to-density equation: the density should fall off at the surface (where the SDF is 0). Intuitively, the density models a homogeneous object with a constant density alpha that smoothly decreases near the object's boundary, where the amount of smoothing is controlled by beta.
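For reference, the SDF-to-density conversion from the VolSDF paper that this answer relies on can be written as below. Here $d(x)$ is the signed distance at point $x$ (negative inside the object) and $\Psi_\beta$ is the CDF of a zero-mean Laplace distribution with scale $\beta$; the symbol names follow the paper and are otherwise assumptions about notation.

$$
\sigma(x) = \alpha \, \Psi_\beta\big(-d(x)\big),
\qquad
\Psi_\beta(s) =
\begin{cases}
\dfrac{1}{2} \exp\!\left(\dfrac{s}{\beta}\right) & \text{if } s \le 0, \\[2ex]
1 - \dfrac{1}{2} \exp\!\left(-\dfrac{s}{\beta}\right) & \text{if } s > 0.
\end{cases}
$$

Deep inside the object ($d \to -\infty$) the density approaches $\alpha$, far outside it approaches $0$, and exactly on the surface it equals $\alpha / 2$.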
In your write-up, give an intuitive explanation of what the parameters alpha and beta are doing here. Also, answer the following questions:
1. How does high beta bias your learned SDF? What about low beta?
Answer: Beta controls the smoothness of the density transition at the surface boundary. A higher beta makes the transition smoother, whereas a smaller beta makes it steeper. This is especially clear in the geometry outputs: for high beta (e.g. beta=5), there is noticeable density even in regions outside the object (since the density transition is significantly smoothed out), and the outputs are extremely blurry. For lower beta, the features are sharper and the model captures the finer details of the object.
2. Would an SDF be easier to train with volume rendering and low beta or high beta? Why?
Answer: It would be easier to train with a higher beta. Although lower beta values give sharper renderings, very small values can make the exponential term blow up numerically (beta is in the denominator of the exponent in both the s<=0 and s>0 cases). Since beta controls the smoothness of the transition at surfaces, a higher beta also gives smoother gradients, so the model can learn well. However, very high beta values are not good either: the regions around the surface are treated as nearly the same, and we end up with overly smoothed, blurry outputs.
3. Would you be more likely to learn an accurate surface with high beta or low beta? Why?
Answer: We would learn more accurate surfaces with a lower beta, as explained above. At the surface, where the SDF is 0, the density changes sharply between the outside and the inside of the object, so the model can capture finer details more accurately. However, one must be careful not to use very low values of beta, since the exponential term might explode, resulting in NaN values.
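The effect of beta discussed in the answers above can be checked numerically with a small sketch of the VolSDF conversion. This is an illustrative NumPy version, not the assignment code; the function name `sdf_to_density` is an assumption. Writing the exponent as -|d| / beta is one way to keep it non-positive, which avoids the overflow that can occur when both branches of the piecewise formula are evaluated for every sample.

```python
import numpy as np

def sdf_to_density(signed_distance, alpha, beta):
    """VolSDF-style density: alpha times the Laplace CDF of the negated SDF."""
    # exp(-|d| / beta) always lies in [0, 1], so it cannot overflow for small beta.
    e = 0.5 * np.exp(-np.abs(signed_distance) / beta)
    # Outside (d >= 0) the density decays to 0; inside (d < 0) it saturates at alpha.
    psi = np.where(signed_distance >= 0, e, 1.0 - e)
    return alpha * psi

d = np.array([-1.0, 0.0, 0.1, 1.0])          # inside, on the surface, just outside, far outside
sharp = sdf_to_density(d, alpha=10.0, beta=0.05)
smooth = sdf_to_density(d, alpha=10.0, beta=5.0)
```

With beta=0.05 the density drops from about alpha to about 0 within a thin shell around the surface, while with beta=5 a substantial density remains one unit outside the object, matching the blurry geometry seen for high beta.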
4. Neural Surface Extras (CHOOSE ONE! More than one is extra credit)
4.1. Render a Large Scene with Sphere Tracing (10 pts)
Implementation
I used the following shapes:
a. 1 large torus for the outline of the face
b. 2 tori for the eyes
c. 2 small tori for the outline of the eyeballs (may not be distinctly visible because their color is similar to the eyeballs)
d. 2 spheres for the eyeballs
e. 2 spheres for the cheek blush
f. 5 cube boxes for the nose
g. 16 spheres for the smile on the lips (totally worth it)
I created a custom config file called large_scene.yaml, which has the config for all the above shapes. I also created a new class called LargeSceneSDF in implicit.py to render all the above shapes. I assign the color based on the index of the object, and I have defined a color for each of the 30 objects.
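One common way to combine many primitives into a single scene, and to pick a color by object index, is to take the minimum over the per-object signed distances. The sketch below illustrates that idea with spheres only; it assumes the shapes are combined by union and is not the actual LargeSceneSDF class. The names `sphere_sdf` and `scene_sdf_and_color` are hypothetical.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance from each point to one sphere."""
    return np.linalg.norm(points - center, axis=-1) - radius

def scene_sdf_and_color(points, centers, radii, colors):
    """Union of several spheres: min distance, plus the color of the nearest object."""
    # distances[i, j] = signed distance from point i to primitive j
    distances = np.stack(
        [sphere_sdf(points, c, r) for c, r in zip(centers, radii)], axis=-1
    )
    nearest = np.argmin(distances, axis=-1)       # index of the closest primitive
    return distances.min(axis=-1), colors[nearest]

centers = np.array([[-2.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
radii = np.array([1.0, 1.0])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])   # one RGB color per object
points = np.array([[-2.0, 0.0, 0.0], [3.5, 0.0, 0.0]])
sdf, rgb = scene_sdf_and_color(points, centers, radii, colors)
```

The same pattern extends to tori and boxes by adding their distance functions to the stacked list.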
Results
Keep smiling! Don't worry be happy :)
Experiment | My Output |
---|---|
v0 | ![]() |
v1 | ![]() |
v2 | ![]() |
v3 | ![]() |
v4 | ![]() |
v5 | ![]() |
4.2 Fewer Training Views (10 pts)
Implementation
Answer: I decreased the number of views in dataset.py and obtained the results below.
Results
Number of views | SDF | NeRF | SDF Geometry |
---|---|---|---|
100 | ![]() | ![]() | ![]() |
50 | ![]() | ![]() | ![]() |
20 | ![]() | ![]() | ![]() |
10 | ![]() | ![]() | ![]() |
5 | ![]() | ![]() | ![]() |
2 | ![]() | ![]() | ![]() |
Discussion
Answer: VolSDF and NeRF have comparable results when sufficient views are provided. For num_views = 20, 50, or 100, the outputs are very similar. However, VolSDF performs clearly better when the number of views is reduced to 10 or 5: for unseen views, the NeRF outputs are heavily blurred and contain artifacts. In fact, when num_views=2, NeRF isn't able to produce a result at all, whereas the VolSDF model still captures the rough geometry of the object.
4.3 Alternate SDF to Density Conversions (10 pts)
Implementation
Answer: I implemented the logistic density distribution equation described in the NeuS paper, where x is the signed distance and s is the scale factor (like the alpha value described above). I created a separate config file called volsdf_density.yaml with a flag called alternate_density; when it is set to True, this formula is used to estimate the density.
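The logistic density from the NeuS paper is phi_s(x) = s * exp(-s x) / (1 + exp(-s x))^2, i.e. the derivative of the sigmoid with scale s. A minimal NumPy sketch of evaluating it is given below; the function name `logistic_density` is an assumption for illustration, and the sketch does not reproduce how the assignment code wires it into the renderer.

```python
import numpy as np

def logistic_density(signed_distance, s):
    """Logistic density phi_s(x) = s * exp(-s x) / (1 + exp(-s x))^2.

    phi_s is symmetric in x, so exp(-|s x|) gives the same value as the
    textbook form while keeping the exponent non-positive (no overflow).
    """
    e = np.exp(-np.abs(s * signed_distance))
    return s * e / (1.0 + e) ** 2

x = np.array([-0.5, 0.0, 0.5])
narrow = logistic_density(x, s=10.0)   # peaked around the surface
wide = logistic_density(x, s=1.0)      # spread out over a wider band
```

The density peaks at s / 4 on the surface (x = 0), so a larger scale gives a taller and narrower bump around the zero level set, while a small scale spreads density far from the surface.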
Best results
Parameters | Output RGB | Output Geometry |
---|---|---|
alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
Experiments with changing Alpha:
Parameters | Output RGB | Output Geometry |
---|---|---|
alpha=200 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
alpha=100 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
alpha=50 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
alpha=20 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
alpha=10 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
alpha=5 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
alpha=2 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
alpha=1 n_layers_distance=8 n_hidden_neurons_distance=256 append_distance=[4] | ![]() | ![]() |
Discussion
Answer: Alpha clearly affects the output geometry. For large values, the geometry was sparse and had gaps. For small values of alpha (like alpha=1 or 2), the results were blurry and overly smooth, with lots of artifacts. Empirically, the best results were obtained for alpha=10.