Neeraj Basu (neerajb)
Zero late days used.
To implement sphere tracing, I followed the equation provided in the slides, where f(p) is the distance from a sampled point to the surface of the torus. Since no epsilon was provided, I used an epsilon of 1e-5 and got good results. The loop steps along each ray by f(p) until it either hits the surface of the torus or leaves the scene (when max_iters is reached).
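Below is a minimal sketch of that loop, assuming an `implicit_fn` that returns the signed distance at each point; the tensor shapes, `max_iters`, and epsilon value are illustrative and match what I described above.

```python
import torch

def sphere_trace(implicit_fn, origins, directions, max_iters=64, eps=1e-5):
    # origins, directions: (N, 3); implicit_fn(points) -> (N, 1) distances.
    points = origins.clone()
    t = torch.zeros(origins.shape[0], 1, device=origins.device)
    hit = torch.zeros(origins.shape[0], 1, dtype=torch.bool, device=origins.device)

    for _ in range(max_iters):
        dist = implicit_fn(points)            # f(p): distance to the surface
        hit = hit | (dist < eps)              # rays that have reached the surface
        t = t + dist * (~hit).float()         # step the remaining rays forward by f(p)
        points = origins + t * directions

    return points, hit
```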
My MLP looked very similar to my NeRF implementation, although in this case I used two separate networks for distance and color instead of sharing the initial 6-layer MLP. Both networks start by passing the points through a 6-layer MLP, whose output is fed to a 1-feature head for distance and a 3-feature head for color. Before the 6-layer MLP, the points go through the provided harmonic embedding function, which significantly improved results. For the loss, the goal is to push the predicted distances at surface points toward zero, so my loss measures the difference between those distances and a vector of zeros; the Eikonal term additionally takes the MSE between the gradient norms and one, which encourages a valid distance function for points that don't fall on the surface. This keeps the net from producing arbitrary values at non-surface points. The rendering above is the result of training for 3000 epochs, on a 6-layer MLP with hidden layers of size 128.
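A minimal sketch of this loss, assuming `distance_fn` is the distance network, `surface_points` are samples from the surface, and `eikonal_points` are off-surface samples (the names are illustrative):

```python
import torch

def sdf_loss(distance_fn, surface_points, eikonal_points):
    # Surface term: predicted distances at surface points should be zero.
    surface_dist = distance_fn(surface_points)
    surface_loss = (surface_dist - torch.zeros_like(surface_dist)).pow(2).mean()

    # Eikonal term: the norm of the SDF gradient should be close to one.
    eikonal_points = eikonal_points.requires_grad_(True)
    dist = distance_fn(eikonal_points)
    grads = torch.autograd.grad(
        dist, eikonal_points, torch.ones_like(dist), create_graph=True
    )[0]
    eikonal_loss = ((grads.norm(dim=-1) - 1.0) ** 2).mean()

    return surface_loss + eikonal_loss
```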
1. The alpha value represents the maximum value the density can take once inside the surface of the object; in other words, the density ranges between 0 and alpha. Beta determines the rate of smoothing near the boundary transition. A low beta results in a sharp increase in density across the outside-to-inside transition of a surface, similar to a step function, whereas a high beta results in a smoother transition and a blurrier rendering, since density is spread over a wider band around the surface. (A sketch of this SDF-to-density conversion follows question 3 below.)
2. Training with a higher beta makes the training process easier, but with a tradeoff. A higher beta also assigns density to points farther from the surface and leads to a blurrier rendering, whereas a lower beta forces the net to concentrate density at the surface, leading to a more refined rendering. Therefore a more forgiving (higher) beta makes training easier, whereas a lower beta requires more training data for accurate results. The ideal approach is to make beta a learned parameter, as proposed in the paper.
3. For a more accurate rendering, it's recommended to use a lower beta so density is only assigned to points close to the surface, but this comes with the tradeoffs discussed above.
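As referenced in question 1, here is a sketch of the VolSDF-style SDF-to-density conversion, assuming the signed distance is positive outside the surface; the default alpha and beta match the values reported below.

```python
import torch

def sdf_to_density(signed_distance, alpha=10.0, beta=0.08):
    # density = alpha * Psi_beta(-d), where Psi_beta is the CDF of a
    # zero-mean Laplace distribution with scale beta. Alpha caps the density
    # inside the surface; beta controls how sharply it rises at the boundary.
    s = -signed_distance
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi
```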
The results above were produced with an alpha of 10.0, a beta of 0.08, 128 n_pts_per_ray and 250 epochs.
In order to render a scene with multiple primitives, I relied on the union technique discussed in lecture 2: I took the minimum of the SDF values returned for each primitive, which gives the distance to the nearest primitive and is therefore the SDF of their union. I had to modify the rendering function and add extra parameters to the sphere class in order to define the center of each sphere. My rendering has 21 sphere primitives visualized.
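A sketch of that union, assuming all primitives are spheres with a shared radius (the centers tensor and radius are illustrative):

```python
import torch

def union_sdf(points, sphere_centers, radius=0.25):
    # points: (N, 3), sphere_centers: (K, 3).
    # Evaluate every sphere SDF and keep the minimum distance at each point,
    # which is the SDF of the union of all K spheres.
    distances = (points[:, None, :] - sphere_centers[None, :, :]).norm(dim=-1) - radius
    return distances.min(dim=-1).values
```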
To produce this rendering, I took a random sample of 30 indices from the train_idx variable before training. This significantly reduced the fidelity of the rendering, since many views were now unseen during training. As you can see, with a beta of 0.05 there are more artifacts and worse renderings from specific viewpoints, although the geometry rendering stayed relatively the same. It's also important to note that VolSDF failed to render with a subset of 20 images, which is why I chose 30 for both. I was able to get a decent rendering with 20 images by increasing beta, but the results were not great. I found the best results when I increased my beta value from 0.05 to 0.08 with the 30-image subset. Compared to VolSDF, NeRF trained with only 30 views seems relatively unaffected.
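For reference, the subsampling itself is just a random selection of view indices, along these lines (a sketch; `train_idx` is the dataset's list of training view indices):

```python
import torch

def subsample_views(train_idx, num_views=30):
    # Keep a random subset of the training view indices; all other views
    # are never seen during training.
    perm = torch.randperm(len(train_idx))[:num_views]
    return [train_idx[i] for i in perm]
```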
I implemented the naive solution from the NeuS paper, which uses the logistic density distribution. I had to experiment with the variable 's' proposed in the paper and got my best results with this value at 20. I also had to increase my learning rate from 0.0005 to 0.001 to get higher-fidelity results. It's clear this version of the SDF -> density conversion is not as effective as the one implemented in question 3: there are more artifacts, especially on the underside of the truck in the colored rendering and in the geometry rendering as a whole.
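A sketch of this naive NeuS conversion, with 's' fixed at the value I found to work best (in the paper 's' is a learned parameter):

```python
import torch

def sdf_to_density_neus(signed_distance, s=20.0):
    # Logistic density distribution: phi_s(d) = s * e^{-s d} / (1 + e^{-s d})^2,
    # i.e. the derivative of the sigmoid with sharpness s, evaluated at the SDF.
    sig = torch.sigmoid(s * signed_distance)
    return s * sig * (1.0 - sig)
```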