Late days used: 2
Torus rendered using sphere tracing:
For each ray, the algorithm initializes a `t` value (based on the near point provided in the initialization of the `SphereTracingRenderer`), where `t` is a scalar in the `origin + t * direction` ray/point parameterization. At every iteration, the algorithm increments `t` by the SDF value at the point `origin + t * direction`. The algorithm also maintains a mask tensor which keeps track of which rays intersect a surface and which do not. When a point on a cast ray comes close enough to a surface (closeness defined by a threshold, here 0.005), the mask is updated to reflect the contact. Once the provided maximum number of iterations has been performed, the scene is rendered using the obtained points and masks.
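A minimal PyTorch sketch of this loop (function and argument names are illustrative, not the exact `SphereTracingRenderer` code):

```python
import torch

def sphere_trace(sdf, origins, directions, near=0.0, max_iters=64, eps=0.005):
    # t starts at the near point and is advanced by the SDF value each step.
    t = torch.full((origins.shape[0], 1), near, device=origins.device)
    mask = torch.zeros_like(t, dtype=torch.bool)
    for _ in range(max_iters):
        points = origins + t * directions
        dist = sdf(points)                # signed distance at the current points
        mask = mask | (dist.abs() < eps)  # flag rays that have reached a surface
        t = t + dist * (~mask)            # advance only rays still marching
    return origins + t * directions, mask
```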
Input point cloud used for training:
Rendered bunny using the defined neural SDF:
Reference rendering of the bunny:
The MLP consists of an initial set of 6 FC layers of size 128, each followed by a ReLU activation. The output of this initial set of layers is a 128-d vector which is fed into a separate `dist_layer` (see line 346 in `a4/implicit.py` under the `NeuralSurface` class) that outputs a single distance value for each point. This layer consists of a single FC layer (with no activation) which projects the input 128-d vector to a single value. I also use harmonic embeddings for the input points, as in the NeRF paper.
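A sketch of this architecture (class and argument names here are illustrative; see `NeuralSurface` in `a4/implicit.py` for the actual code):

```python
import torch.nn as nn

class DistanceMLP(nn.Module):
    def __init__(self, embed_dim, hidden=128, n_layers=6):
        super().__init__()
        layers, in_dim = [], embed_dim  # embed_dim: size of the harmonic embedding
        for _ in range(n_layers):
            layers += [nn.Linear(in_dim, hidden), nn.ReLU()]
            in_dim = hidden
        self.backbone = nn.Sequential(*layers)  # the "initial" 6-layer FC network
        self.dist_layer = nn.Linear(hidden, 1)  # single FC layer, no activation

    def forward(self, x_embedded):
        # x_embedded: harmonically embedded points, shape (N, embed_dim)
        return self.dist_layer(self.backbone(x_embedded))
```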
The eikonal loss computes the norm of the SDF gradients and enforces it to be as close to 1 as possible. I use an L2 loss on the difference between the norm of the gradients and the value 1.
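In other words (assuming `gradients` holds the (N, 3) SDF gradients at the sampled points):

```python
# L2 penalty pushing the gradient norm of the SDF toward 1
eikonal_loss = ((gradients.norm(dim=-1) - 1.0) ** 2).mean()
```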
The model was trained for 5k epochs with a batch size of 4096.
The above model and loss produce a better rendering of the bunny, in my opinion.
Generated geometry of the lego data:
Generated coloured rendering of the lego data:
For this part, I build on top of the MLP from the previous question. I use the 6-layer "initial" FC network to extract features shared between the distance and colour predictions, saving computation as suggested. The 128-d output of the "initial" FC network is fed into a couple of FC layers of size 128, with a Sigmoid activation just at the end, which produce a normalized RGB value for each point.
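One plausible sketch of this colour head (the exact layer sizes in my code may differ slightly; `color_head` is an illustrative name):

```python
import torch.nn as nn

# Maps the shared 128-d feature to an RGB value in [0, 1]
color_head = nn.Sequential(
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 3),
    nn.Sigmoid(),  # Sigmoid only at the end, to normalize the RGB output
)
```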
$\alpha$ controls the output range of the density (essentially acting as a scaling factor for the density values) whereas $\beta$ controls how the density drops off as we move away from the surface (see below for explanation).
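For reference, the VolSDF paper defines the density from the signed distance $d(x)$ via the CDF $\Psi_\beta$ of a zero-mean Laplace distribution with scale $\beta$:

$$\sigma(x) = \alpha \, \Psi_\beta\big(-d(x)\big), \qquad \Psi_\beta(s) = \begin{cases} \frac{1}{2} \exp\left(\frac{s}{\beta}\right) & s \le 0 \\ 1 - \frac{1}{2} \exp\left(-\frac{s}{\beta}\right) & s > 0 \end{cases}$$

so $\alpha$ sets the maximum density and $\beta$ sets the width of the transition band around the surface.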
The following curves show how the density distribution varies with the SDF values for different $\beta$ values. SDF values are plotted on the x-axis and density values on the y-axis.
For $\beta$=0.1:
For $\beta$=0.5:
For $\beta$=1:
Clearly, as the value of $\beta$ increases, the density distribution becomes smoother.
How does high $\beta$ bias your learned SDF? What about low $\beta$?
A high $\beta$ smears the density over a wide band around the surface, biasing the learned SDF toward smooth, blurry geometry; a low $\beta$ concentrates the density at the zero level set, favouring sharp surfaces but making optimization harder. In particular, the gradient of the `sdf_to_density` function w.r.t. the input signed distance has $\beta$ in the denominator, so very small $\beta$ values can lead to exploding gradients. I tried running the experiment with $\beta$ values in [0.005, 0.5].
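To make the exploding-gradient claim concrete: just outside the surface ($d(x) > 0$), the Laplace-CDF density above reduces to $\sigma = \frac{\alpha}{2} e^{-d(x)/\beta}$, whose derivative magnitude $\frac{\alpha}{2\beta} e^{-d(x)/\beta}$ grows without bound near the surface as $\beta \to 0$.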
For $\beta$ = 0.005, I obtain the following results:
Generated geometry of the lego data:
Generated coloured rendering of the lego data:
For $\beta$ = 0.5, I obtain the following results:
Generated geometry of the lego data:
Generated coloured rendering of the lego data:
$\beta$ = 0.05 seems to work the best; it strikes a balance between the two extremes. $\beta$ = 0.5 doesn't produce a sharp and accurate rendering, which is expected given the discussion above. $\beta$ = 0.005 produces a slightly better rendering of the scene; see the views below for comparison (left is $\beta$ = 0.005, right is $\beta$ = 0.05).
However, the views with $\beta$ = 0.005 contain some artifacts, so $\beta$ = 0.05 remains the best choice overall.
For this part, I tried rendering 27 small spheres placed on the edges/corners of a 3D cube. The rendering of this scene using sphere tracing looks like:
I also tried rendering round boxes (alone and mixed with spheres), but small boxes did not look visually appealing, so I stuck with spheres.
For this question, I modified the SDF surface class in `a4/implicit.py` (changes made in a duplicate class, `MultiSDFSurface`) to accept multiple SDFs during initialization. `MultiSDFSurface` returns the minimum of all the individual SDF values obtained from each component SDF, and the scene is rendered using the same sphere tracing algorithm used for the torus in Q1.
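A minimal sketch of this union-of-SDFs idea (the actual `MultiSDFSurface` class differs in its constructor and configuration details):

```python
import torch

class MultiSDF(torch.nn.Module):
    def __init__(self, sdfs):
        super().__init__()
        self.sdfs = torch.nn.ModuleList(sdfs)

    def forward(self, points):
        # Union of shapes: the distance to the composite scene is the
        # minimum over the distances to each component SDF.
        dists = torch.stack([sdf(points) for sdf in self.sdfs], dim=0)
        return dists.min(dim=0).values
```

Because the minimum of SDFs is itself a valid (conservative) distance bound for the union, the composite scene can be sphere traced like any single SDF.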
Note: The views have been randomly sampled for all the experiments stated below.
Training the VolSDF on lego data using 20 views yields the following rendering:
Using a 5-layer initial-network NeRF (which I used for assignment 3) and the same number of views, I obtain the following rendering:
Interestingly, the NeRF-based rendering is sharper than the SDF-based one (perhaps because of differences in hyperparameters and slight differences in network configuration). However, the NeRF-based representation shows a kind of reflection of the lego in the lower part of the GIF, which is not there in the SDF rendering.
Further reducing the views to 10 yields the following results:
VolSDF:
NeRF:
Also, reducing the views to 5 yields the following results:
VolSDF:
NeRF:
The quality of the rendered scenes does degrade as the number of views decreases, for both models (which is expected).
The geometry from NeRF is sharper for 10 and 5 views as well, but with 5 views NeRF is not able to generate consistent views from every direction (the GIF almost blanks out for some views), whereas VolSDF, despite a poorer geometric reconstruction, is able to generate consistent renderings from almost every direction.
Hence, for consistent renderings, VolSDF does seem better than NeRF given a small number of views.
I used the 'naive' solution from the NeuS paper as an alternative to the equation from the VolSDF paper for converting signed distance to volumetric density. Setting `s = 50`, the function yields the following graph:
Here, the signed distance is plotted on the x-axis and the density on the y-axis.
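Concretely, the naive solution sets the density directly to the logistic density (the derivative of the sigmoid with scale $s$) applied to the signed distance. A minimal sketch, with `sdf_to_density_naive` an illustrative name:

```python
import torch

def sdf_to_density_naive(signed_distance, s=50.0):
    # Logistic density: s * e^{-s d} / (1 + e^{-s d})^2, peaked at the
    # surface (d = 0); written via sigmoids for numerical stability.
    sd = s * signed_distance
    return s * torch.sigmoid(sd) * torch.sigmoid(-sd)
```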
Training the model using the above function and s value, I obtain the following result:
Generated geometry of the lego data:
Corresponding result from the VolSDF approach for comparison:
Generated coloured rendering of the lego data:
Corresponding result from the VolSDF approach for comparison:
The results from this NeuS alternative are comparable (if anything, slightly better) in rendering quality, but contain some artifacts that the VolSDF results don't have. The NeuS approach also produces a poorer geometry than the VolSDF-based approach.