Number of late days used: 0
You can run the code for part 1 with:
python -m a4.main --config-name=torus
Output |
---|
![]() |
Implementation Description

Sphere tracing is a robust technique for rendering implicit surfaces using geometric distances, i.e. ray tracing implicit surfaces defined by a signed distance function. The core idea is to march along each ray in steps equal to the queried distance value, stopping once that distance falls below a small threshold (the ray has reached a surface) or a maximum number of iterations has been exhausted.
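A minimal sketch of this marching loop, assuming an `sdf` callable that maps points to distances (the function and variable names here are illustrative, not the exact ones in the starter code):

```python
import torch

def sphere_trace(sdf, origins, directions, max_iters=64, eps=1e-5, far=10.0):
    """March each ray forward by the local SDF value until it hits a surface.

    origins, directions: (N, 3) tensors; sdf maps (N, 3) points -> (N, 1) distances.
    Returns the final points along each ray and a boolean hit mask.
    """
    t = torch.zeros(origins.shape[0], 1, device=origins.device)      # distance travelled per ray
    points = origins.clone()
    hit = torch.zeros(origins.shape[0], dtype=torch.bool, device=origins.device)

    for _ in range(max_iters):
        dist = sdf(points)                        # distance to the nearest surface
        hit = hit | (dist.squeeze(-1) < eps)      # close enough: the ray has converged
        active = (~hit) & (t.squeeze(-1) < far)   # keep marching only where needed
        t = t + dist * active.unsqueeze(-1)       # the SDF value is a safe step size
        points = origins + t * directions

    return points, hit
```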
Run the following for part 2:
python -m a4.main --config-name=points
Input Point Cloud | Output |
---|---|
![]() | ![]() |
Implementation Description
I used an MLP encoder similar to NeRF, with the same settings as in Assignment 3, plus one additional linear layer that predicts the distance to the nearest surface. The MLP consists of 6 linear layers with 128 hidden units each. I used the default cfg provided.
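A rough sketch of this network, with layer sizes following the description above (class and argument names are my own; the positional encoding from Assignment 3 is abstracted as `embed_fn`):

```python
import torch.nn as nn

class NeuralSDF(nn.Module):
    """6-layer MLP (128 hidden units each) mapping an embedded 3D point to a signed distance."""

    def __init__(self, embed_fn, embed_dim, hidden_dim=128, num_layers=6):
        super().__init__()
        self.embed_fn = embed_fn                        # positional encoding from Assignment 3
        layers, in_dim = [], embed_dim
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        self.backbone = nn.Sequential(*layers)
        self.distance_head = nn.Linear(hidden_dim, 1)   # extra layer predicting the SDF value

    def forward(self, points):
        features = self.backbone(self.embed_fn(points))
        return self.distance_head(features)             # no activation: distances can be negative
```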
To implement the eikonal loss, for each point I computed the L2 norm of the SDF gradient at that point, took its absolute difference from 1, and finally applied a mean reduction across all points: `loss = torch.mean(torch.abs(torch.norm(gradients, dim=1) - 1))`. For the point cloud SDF loss, I used an MSE loss on the predicted distances.
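For reference, a self-contained sketch of how both losses can be computed, with the gradient obtained via autograd (variable and function names are illustrative):

```python
import torch

def sdf_losses(model, surface_points, query_points):
    """Point-cloud SDF loss plus the eikonal regularizer.

    surface_points: (N, 3) points from the input point cloud (target distance 0).
    query_points:   (M, 3) points at which the eikonal constraint is enforced.
    """
    # MSE between predicted distances and zero on the point cloud
    sdf_loss = (model(surface_points) ** 2).mean()

    # Eikonal loss: a true SDF has unit gradient norm everywhere
    query_points = query_points.clone().requires_grad_(True)
    distances = model(query_points)
    gradients = torch.autograd.grad(
        outputs=distances,
        inputs=query_points,
        grad_outputs=torch.ones_like(distances),
        create_graph=True,
    )[0]
    eikonal_loss = torch.mean(torch.abs(torch.norm(gradients, dim=1) - 1))

    return sdf_loss, eikonal_loss
```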
Hyperparameter tuning: The default parameters worked well for me. However, with more epochs (10k) the surface looked slightly tighter and better in some regions such as the ears, joints, and folded legs. The experiments with varied numbers of epochs and eikonal weight values are as follows:
Weight of Regularization | Output |
---|---|
eikonal_weight: 0.001 | ![]() |
eikonal_weight: 0.02 | ![]() |
eikonal_weight: 0.1 | ![]() |
eikonal_weight: 0.5 | ![]() |
Number of Epochs | Output |
---|---|
num_epochs: 1000 | ![]() |
num_epochs: 2500 | ![]() |
num_epochs: 5000 | ![]() |
num_epochs: 10000 | ![]() |
You can train the VolSDF model on the lego bulldozer dataset with
python -m a4.main --config-name=volsdf
We model the density as a transformation of a learnable Signed Distance Function (SDF), namely

σ(x) = α Ψ_β(−d_Ω(x))

where Ψ_β is the CDF of the Laplace distribution with zero mean and scale β.

In the original paper, α and β are learnable parameters; in our case, we fix their values. Intuitively, we are modeling a homogeneous object with constant density α (i.e. α scales the per-point density) that smoothly decreases at the boundary, and this smoothness is controlled by β. As β approaches 0, the density function becomes an α-scaled indicator function of Ω. A code sketch of this transform is given after the questions below. Given this understanding, let us answer the following questions:
1. How does high beta bias your learned SDF? What about low beta?
A high value of beta over-smooths the density transition, so the learned SDF is biased towards a blurrier, less sharp boundary; this can help with generalization. A lower beta, on the contrary, makes the transition sharper and hence the boundary sharper too.
2. Would an SDF be easier to train with volume rendering and low beta or high beta? Why?
I think training would be easier with a higher value of beta, where "easier" means converging to a somewhat reasonable geometry, even if a bit over-smoothed, rather than a totally invalid rendering. A higher beta does not overfit to the input and still gives a rather smooth output that maintains the geometry to a decent extent; however, the surface becomes less sharp, blobby, and blurry, since points that are not very close to the surface also contribute density. A lower beta value, on the other hand, gives a sharper and tighter rendering.
3. Would you be more likely to learn an accurate surface with high beta or low beta? Why?
We would be more likely to learn an accurate surface with a lower value of beta, since it reduces the spread and smoothing of the density around the surface, resulting in a sharper, more accurate surface. A high beta value, with its larger spread, instead produces a blurry surface.
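As a concrete reference, here is a small sketch of the Laplace-CDF transform described above, assuming fixed α and β (the function name and default values are illustrative only):

```python
import torch

def sdf_to_density(signed_distance, alpha=10.0, beta=0.05):
    """VolSDF-style density: sigma(x) = alpha * Psi_beta(-d(x)), where Psi_beta is the
    CDF of a zero-mean Laplace distribution with scale beta."""
    s = -signed_distance
    # Laplace CDF: 0.5 * exp(s / beta) for s <= 0, 1 - 0.5 * exp(-s / beta) for s > 0
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi
```

As β → 0 this approaches α inside the object and 0 outside, matching the indicator-function limit mentioned above.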
Hyperparameter tuning: I tried multiple sets of alpha, beta, and learning rate values. Here are the results for experiments carried out with the default alpha and lr values and varied beta values:
Beta | Output Geometry | Output Color |
---|---|---|
beta: 0.02 | ![]() | ![]() |
beta: 0.05 | ![]() | ![]() |
beta: 0.5 | ![]() | ![]() |
The architecture I implemented is very similar to the NeRF implementation (with no view dependence).

There are two heads: one for color and one for the SDF. The SDF head MLP is implemented as explained in the previous part. The color head takes the output of the first 6 linear layers (the same backbone network as for the SDF head) and passes it through 2 linear layers with 256 hidden units each, with a ReLU in between. After the last linear layer of the color head there is a sigmoid to output the (r, g, b) values.

There is no ReLU after the last linear layer of the SDF head, because it would clamp every negative signed distance to zero, meaning points inside the surface would also get a signed distance of 0.
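A compact sketch of this two-head design (names and dimensions follow the description above but are otherwise my own; `embed_fn` again stands in for the positional encoding):

```python
import torch.nn as nn

class VolSDFNetwork(nn.Module):
    """Shared 6-layer backbone with an SDF head and a sigmoid color head."""

    def __init__(self, embed_fn, embed_dim, backbone_dim=128, color_dim=256):
        super().__init__()
        self.embed_fn = embed_fn
        layers, in_dim = [], embed_dim
        for _ in range(6):                                   # shared backbone
            layers += [nn.Linear(in_dim, backbone_dim), nn.ReLU()]
            in_dim = backbone_dim
        self.backbone = nn.Sequential(*layers)

        self.sdf_head = nn.Linear(backbone_dim, 1)           # no activation: SDF can be negative
        self.color_head = nn.Sequential(                     # 2 layers, ReLU in between, sigmoid out
            nn.Linear(backbone_dim, color_dim), nn.ReLU(),
            nn.Linear(color_dim, 3), nn.Sigmoid(),
        )

    def forward(self, points):
        features = self.backbone(self.embed_fn(points))
        return self.sdf_head(features), self.color_head(features)
```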
Tuning beta confirms that the results are better for lower values, as discussed above. beta = 0.05 worked best among the experiments I carried out.
I implemented a custom SDF that composes box SDFs into a Jenga tower at different stages of the game (start, intermediate, and end) and colors the blocks accordingly; a sketch of the box primitive is shown after the command below.
Jenga Time!:
python -m a4.main --config-name=custom
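A minimal sketch of the axis-aligned box SDF used as the building block, combined by a union (minimum) over all blocks in the tower; the block layout per stage is what changes between the renders (names are illustrative):

```python
import torch

def box_sdf(points, center, half_sizes):
    """Signed distance from each (N, 3) point to an axis-aligned box given by
    its center and half extents (both (3,) tensors)."""
    q = torch.abs(points - center) - half_sizes
    outside = torch.norm(torch.clamp(q, min=0.0), dim=-1)        # distance when outside the box
    inside = torch.clamp(q.max(dim=-1).values, max=0.0)          # negative distance when inside
    return outside + inside

def tower_sdf(points, blocks):
    """Union of all Jenga blocks: the minimum distance over the individual box SDFs."""
    distances = torch.stack([box_sdf(points, c, h) for c, h in blocks], dim=0)
    return distances.min(dim=0).values
```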
Reference Image | Start | Steady | Fin |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
To vary the number of training views, set n='no_of_views' on line number 127 in dataset.py and then run:
python -m a4.main --config-name=volsdf
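The subsampling itself amounts to something like the following (a sketch only; the actual dataset code differs, and the seed and argument names are assumptions):

```python
import numpy as np

def subsample_views(images, cameras, n, seed=0):
    """Randomly keep n of the available training views, keeping images and cameras paired."""
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(images), size=n, replace=False)
    return [images[i] for i in keep], [cameras[i] for i in keep]
```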
Number of Views | VolSDF | NeRF |
---|---|---|
2 | ![]() | ![]() |
10 | ![]() | ![]() |
20 | ![]() | ![]() |
100 | ![]() | ![]() |
We can see that VolSDF performs really well even with a small number of views, in comparison to NeRF. The NeRF outputs are, of course, much sharper than the VolSDF outputs. I randomly sampled 2, 10, and 20 views from the 100 views in the train set (a better sampling strategy might improve the results further). We can see that the NeRF results are not great for unseen views, hence the dimness in a few of them.
python -m a4.main --config-name=volsdf
I used the SDF-to-density transform from the NeuS paper to convert SDF values to density: density = s * torch.exp(-s * signed_distance) / ((1 + torch.exp(-s * signed_distance)) ** 2), i.e. the logistic density with scale s applied to the signed distance.
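A self-contained sketch of this transform, written via the sigmoid for numerical stability and equivalent to the expression above (the default s is illustrative):

```python
import torch

def neus_sdf_to_density(signed_distance, s=50.0):
    """Logistic density of the signed distance: s * exp(-s*d) / (1 + exp(-s*d))**2.

    Larger s concentrates the density more tightly around the zero level set (the surface)."""
    sig = torch.sigmoid(s * signed_distance)
    return s * sig * (1.0 - sig)
```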
s value | Output Geometry | Output Color |
---|---|---|
10 | ![]() | ![]() |
50 | ![]() | ![]() |
200 | ![]() | ![]() |