1. Differentiable Volume Rendering

1.3. Ray sampling (10 points)

Figure layout:

In each pair below, the left figure is the TA output for the grid and rays visualizations, and the right figure is the output generated by our implementation.

Target

Optimized

Target

Optimized

Reproducibility:

Please execute ./run_q1.sh; the above figures are automatically saved as results/grid_ours.png and results/rays_ours.png.

1.4. Point sampling (10 points)

Figure layout:

Left figure is the TA output.
Right figure is our output.
Target
Optimized

Reproducibility:

Please execute ./run_q1.sh; the above figure is automatically saved as results/sample_points_ours.png.
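As a rough illustration of what point sampling along rays involves, here is a minimal NumPy sketch. It assumes evenly spaced depths between a near and a far plane; the function name, argument names, and toy values are hypothetical, and the sampler in our submission may differ in its details.

```python
import numpy as np

def sample_points_along_rays(origins, directions, near, far, n_pts_per_ray):
    """Return points of shape (n_rays, n_pts_per_ray, 3) and the shared depths."""
    # Evenly spaced depths between the near and far planes (assumed scheme).
    depths = np.linspace(near, far, n_pts_per_ray)
    # point = origin + depth * direction, broadcast over rays and depths.
    points = origins[:, None, :] + depths[None, :, None] * directions[:, None, :]
    return points, depths

# Two toy rays starting at the world origin.
origins = np.zeros((2, 3))
directions = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
points, depths = sample_points_along_rays(
    origins, directions, near=1.0, far=3.0, n_pts_per_ray=5
)
print(points.shape)
```

Each ray contributes n_pts_per_ray points, so the result has one row of samples per ray.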

1.5. Volume rendering (30 points)

Reproducibility:

When ./run_q1.sh is executed, the gif on the left is stored at images/part_1.gif. The depth visualization from our implementation is shown on the right.

Target
Optimized
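For reference, the color and depth shown above come from the standard emission-absorption compositing along each ray. The following is a minimal NumPy sketch of that computation, not the submitted code; the function name render_rays and the padding of the last segment with a large value are assumptions made for illustration.

```python
import numpy as np

def render_rays(density, color, depths):
    """density: (n_rays, n_pts), color: (n_rays, n_pts, 3), depths: (n_pts,)."""
    # Distance between consecutive samples; the last segment is padded
    # with a large value (an assumed convention).
    deltas = np.concatenate([np.diff(depths), [1e10]])
    # Opacity contributed by each segment.
    alpha = 1.0 - np.exp(-density * deltas)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    ones = np.ones_like(alpha[:, :1])
    trans = np.cumprod(np.concatenate([ones, 1.0 - alpha[:, :-1]], axis=1), axis=1)
    weights = trans * alpha
    # Weighted sums give the rendered color and the expected depth.
    rgb = (weights[..., None] * color).sum(axis=1)
    depth = (weights * depths).sum(axis=1)
    return rgb, depth, weights

# Toy ray: empty space except for one nearly opaque sample at depth 2.0.
depths = np.linspace(1.0, 3.0, 5)
density = np.array([[0.0, 0.0, 1e6, 0.0, 0.0]])
color = np.zeros((1, 5, 3))
color[0, 2] = [1.0, 0.0, 0.0]
rgb, depth, weights = render_rays(density, color, depths)
print(rgb, depth)
```

In this toy case all of the weight lands on the opaque sample, so the rendered depth equals that sample's depth.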


2. Optimizing a basic implicit volume


2.2. Loss and training (5 points)

The center of the box and its side lengths after training, rounded to two decimal places, are:
Target

2.3. Visualization

Figure layout:

Left figure is the TA output.
Right figure is our output.
Target
Optimized

Reproducibility:

Please execute ./run_q2.sh to print the box parameters and save the above gif at images/part_2.gif.

3. Optimizing a Neural Radiance Field (NeRF) (30 points)


The NeRF architecture implemented here is similar to the one proposed in the original paper by Mildenhall et al. (2020).
Target
At a high level, the architecture is as follows:
MLP --> Linear, ReLU --> Linear --> Linear, ReLU --> Linear, Sigmoid
The concatenation of the feature vector with the encoded viewing direction is what implements view dependence.
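One plausible reading of the layer chain above is sketched below as a NumPy forward pass: an MLP trunk on the encoded position, a Linear + ReLU density head, a Linear feature layer, and a Linear + ReLU followed by Linear + Sigmoid color head that takes the feature concatenated with the encoded viewing direction. All layer sizes, names, and the random weights are hypothetical; the sketch only shows how the tensors flow and is not the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(n_in, n_out):
    # Randomly initialised weights; a real model would learn these.
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes, chosen only for illustration.
XYZ_EMBED, DIR_EMBED, HIDDEN = 39, 15, 128

trunk = [linear(XYZ_EMBED, HIDDEN), linear(HIDDEN, HIDDEN)]
density_head = linear(HIDDEN, 1)
feature_layer = linear(HIDDEN, HIDDEN)
color_hidden = linear(HIDDEN + DIR_EMBED, HIDDEN // 2)
color_out = linear(HIDDEN // 2, 3)

def nerf_forward(xyz_embed, dir_embed):
    h = xyz_embed
    for w, b in trunk:                       # MLP trunk on encoded position
        h = relu(h @ w + b)
    density = relu(h @ density_head[0] + density_head[1])     # Linear, ReLU
    feature = h @ feature_layer[0] + feature_layer[1]         # Linear
    h = np.concatenate([feature, dir_embed], axis=-1)         # view dependence
    h = relu(h @ color_hidden[0] + color_hidden[1])           # Linear, ReLU
    rgb = sigmoid(h @ color_out[0] + color_out[1])            # Linear, Sigmoid
    return density, rgb

density, rgb = nerf_forward(rng.normal(size=(4, XYZ_EMBED)),
                            rng.normal(size=(4, DIR_EMBED)))
print(density.shape, rgb.shape)
```

Note that in this reading the density depends only on position, while the color depends on both position and viewing direction.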

Lego visualization:

The spiral renderings of the lego bulldozer early in training (epoch #10) and after training (epoch #250) are shown below on the left and right, respectively.
Target
Optimized

Reproducibility:

Please execute ./run_q3.sh to generate the lego bulldozer gif shown above and save the gif at images/part_3.gif.

4. NeRF Extras


4.1 View Dependence (10 pts)

View dependence is implemented following the architecture proposed by Mildenhall et al. (2020): we concatenate the feature vector with the positional encoding of the input viewing direction. To enable it, set view_dependence: True under the implicit field of the config file. In the current submission, this argument is already set in the nerf_lego config file, so this part can be run by executing ./run_q3.sh.
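To make the concatenation step concrete, here is a minimal NumPy sketch of a sin/cos positional encoding applied to the viewing direction and appended to a feature vector. The function name harmonic_embedding, the number of harmonics, the feature width, and the choice to append the raw direction are all assumptions for illustration, not values taken from our config.

```python
import numpy as np

def harmonic_embedding(x, n_harmonics):
    """Map (..., D) to (..., D * 2 * n_harmonics + D) using frequencies 2^k."""
    freqs = 2.0 ** np.arange(n_harmonics)
    scaled = x[..., None] * freqs                  # (..., D, n_harmonics)
    scaled = scaled.reshape(*x.shape[:-1], -1)     # (..., D * n_harmonics)
    # Appending the raw input is an assumed convention.
    return np.concatenate([np.sin(scaled), np.cos(scaled), x], axis=-1)

rng = np.random.default_rng(0)

# Four unit-length viewing directions.
dirs = rng.normal(size=(4, 3))
dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

# Stand-in for the feature vector produced by the xyz branch.
feature = rng.normal(size=(4, 128))

dir_embed = harmonic_embedding(dirs, n_harmonics=2)
color_input = np.concatenate([feature, dir_embed], axis=-1)
print(dir_embed.shape, color_input.shape)
```

Keeping n_harmonics small for the direction keeps the view-dependent input narrow relative to the feature vector.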


View dependence visualization:

The spiral renderings of the lego bulldozer without and with view dependence are shown below on the left and right, respectively.
Target
Optimized


Trade-offs between increased view dependence and generalization quality:

Benefits: With exactly the same config as in Question 3 (fewer parameters for the viewing direction and more parameters for the volumetric xyz branch), and the only change being the added view-dependence architecture, the view-dependent lego bulldozer appeared visually crisper and slightly clearer. This supports the efficacy of the added architecture, in which the concatenated feature vector also relies on the input viewing direction. However, there is a tipping point beyond which the shortcomings of view dependence start to show.

Shortcomings: I noticed that when I increased the number of model parameters associated with view dependence, the generalization quality of the NeRF appeared to degrade. With more view-dependent capacity, the model seems to overfit to the view-dependent effects in the training images. From this experimentation, I concluded that, to get the most benefit from view dependence, the parameters devoted to it should be kept to a minimum. The reasoning is that the volumetric geometry carries far richer detail than the view-dependent effects, and the parameter allocation of the NeRF model should reflect this.


4.3 High Resolution Imagery (10 pts)

The following GIFs were obtained with the changed configurations. In each pair, the left GIF is the baseline lego model (without the high-resolution configuration) and the right GIF is the high-resolution lego model with the configuration change listed for that pair:

Config 1: n_pts_per_ray: 256
Target
Optimized

Config 2: n_layers_xyz: 9
Target
Optimized