1. Differentiable Volume Rendering
1.3. Ray sampling (10 points)
Figure layout:
For the following, the left figures are the TA outputs for the grid and ray visualizations. The figures on the right show the outputs generated by our implementation.
Reproducibility:
Please execute ./run_q1.sh and the above figures will be saved automatically as results/grid_ours.png and results/rays_ours.png.
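The grid and ray visualizations above come from two steps: building a normalized pixel grid and turning each grid point into a ray. The following is a minimal sketch of that idea, not the submitted implementation. It assumes a simple pinhole camera at the origin looking down +z, and the names pixel_grid_ndc, rays_from_grid, and focal are hypothetical.

```python
import math

def pixel_grid_ndc(width, height):
    """Return normalized (x, y) pixel-center coordinates in [-1, 1], row-major."""
    xs = [-1.0 + 2.0 * (i + 0.5) / width for i in range(width)]
    ys = [-1.0 + 2.0 * (j + 0.5) / height for j in range(height)]
    return [(x, y) for y in ys for x in xs]

def rays_from_grid(grid, focal=1.0):
    """Map each grid point to (origin, unit direction).

    Assumption: pinhole camera at the origin looking down +z, with the
    image plane at distance `focal`.
    """
    origin = (0.0, 0.0, 0.0)
    rays = []
    for x, y in grid:
        norm = math.sqrt(x * x + y * y + focal * focal)
        rays.append((origin, (x / norm, y / norm, focal / norm)))
    return rays

grid = pixel_grid_ndc(4, 4)   # 16 pixel centers
rays = rays_from_grid(grid)   # one ray per pixel
```

Plotting the grid coordinates and the normalized ray directions as color channels gives images in the spirit of the grid and rays figures.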
1.4. Point sampling (10 points)
Figure layout:
The left figure is the TA output. The right figure is our output.
Reproducibility:
Please execute ./run_q1.sh and the above figure will be saved automatically as results/sample_points_ours.png.
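Point sampling places a fixed number of points along each ray between a near and a far bound. The sketch below illustrates one common way to do this (evenly spaced bins, with optional random jitter inside each bin). It is an assumption about the general technique rather than the submitted sampler, and sample_depths and points_along_ray are hypothetical names.

```python
import random

def sample_depths(near, far, n_pts, jitter=False, rng=None):
    """Return n_pts depths in [near, far], one per equal-width bin.

    With jitter=False each depth is the bin center; with jitter=True it is
    drawn uniformly inside the bin (stratified sampling).
    """
    rng = rng or random.Random(0)
    bin_width = (far - near) / n_pts
    depths = []
    for i in range(n_pts):
        offset = rng.random() if jitter else 0.5
        depths.append(near + (i + offset) * bin_width)
    return depths

def points_along_ray(origin, direction, depths):
    """Evaluate origin + t * direction for every depth t."""
    return [tuple(o + t * d for o, d in zip(origin, direction)) for t in depths]

depths = sample_depths(near=0.0, far=4.0, n_pts=4)
points = points_along_ray((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), depths)
```

Here the four sample points lie on the z-axis at the bin centers of the interval from 0 to 4.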
1.5. Volume rendering (30 points)
Reproducibility:
The gif on the left is stored in images/part_1.gif when ./run_q1.sh is executed. The depth visualization from our implementation is shown on the right.
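Both the color gif and the depth map are weighted sums over the samples along each ray. The sketch below shows the standard emission-absorption weights for a single ray, assuming per-sample densities and segment lengths are already available. It is an illustration only, and render_weights and composite are hypothetical names, not functions from the submission.

```python
import math

def render_weights(sigmas, deltas):
    """Per-sample weights w_i = T_i * (1 - exp(-sigma_i * delta_i)).

    T_i is the transmittance accumulated before sample i, i.e. the
    fraction of light that has not yet been absorbed.
    """
    weights = []
    transmittance = 1.0
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

def composite(weights, values):
    """Weighted sum along the ray; values can be depths or one color channel."""
    return sum(w * v for w, v in zip(weights, values))

# Toy ray: empty space first, then a dense surface at the second sample.
sigmas = [0.0, 50.0, 50.0]
deltas = [1.0, 1.0, 1.0]
depths = [0.5, 1.5, 2.5]
weights = render_weights(sigmas, deltas)
depth = composite(weights, depths)
```

In this toy ray nearly all the weight falls on the first dense sample, so the rendered depth sits at that sample's distance.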
2. Optimizing a basic implicit volume
2.2. Loss and training (5 points)
The center and side lengths of the box after training, rounded to the nearest 1/100, are:
2.3. Visualization
Figure layout:
The left figure is the TA output. The right figure is our output.
Reproducibility:
Please execute ./run_q2.sh to print the box parameters and save the above gif at images/part_2.gif.
3. Optimizing a Neural Radiance Field (NeRF) (30 points)
The NeRF architecture implemented here is similar to the one proposed in the original paper by Mildenhall et al. (2020).
At a high level, the architecture is as follows:
MLP --> Linear, ReLU --> Linear --> Linear, ReLU --> Linear, Sigmoid
The concatenation of the vector is where view dependence is implemented.
Lego visualization:
The spiral renderings of the lego bulldozer before (epoch #10) and after training (epoch #250) are shown below on the left and right, respectively.
Reproducibility:
Please execute ./run_q3.sh to generate the lego bulldozer gif shown above and save the gif at images/part_3.gif.
4. NeRF Extras
4.1 View Dependence (10 pts)
The view dependence is implemented according to the architecture proposed by Mildenhall et al. (2020). For view_dependence, we concatenate the positional encoding of the input viewing direction to the feature vector. To run this part, enable view dependence by setting view_dependence: True in the config files under the implicit field. In the current submission, the view_dependence argument is added in the nerf_lego config file. This part can be run by executing ./run_q3.sh with the above argument.
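The concatenation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the submitted network code: the encoding keeps the raw input and uses frequencies 2^k without a factor of pi (implementations differ on both points), and positional_encoding, color_head_input, and n_freqs are hypothetical names.

```python
import math

def positional_encoding(vec, n_freqs):
    """Encode vec as [vec, sin(2^k * vec), cos(2^k * vec)] for k < n_freqs.

    Assumption: the raw input is kept and no factor of pi is applied.
    """
    encoded = list(vec)
    for k in range(n_freqs):
        freq = 2.0 ** k
        encoded.extend(math.sin(freq * v) for v in vec)
        encoded.extend(math.cos(freq * v) for v in vec)
    return encoded

def color_head_input(features, view_dir, n_freqs=2):
    """Concatenate the xyz feature vector with the encoded viewing direction."""
    return list(features) + positional_encoding(view_dir, n_freqs)

features = [0.1] * 8              # placeholder output of the xyz branch
view_dir = (0.0, 0.0, 1.0)        # unit viewing direction
x = color_head_input(features, view_dir)
```

With an 8-dimensional feature vector and two frequencies, the color branch receives 8 + 3 + 3 * 2 * 2 = 23 inputs, while the density prediction stays independent of the viewing direction.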
View dependence visualization:
The spiral renderings of the lego bulldozer without and with the view dependence feature are shown below on the left and right, respectively.
Trade-offs between increased view dependence and generalization quality.
Benefits: With exactly the same config as in Question 3 (fewer parameters for the viewing direction and more for the volumetric xyz branch), and with the view-dependence architecture as the only addition, the view-dependent lego bulldozer looked visibly crisper and a bit clearer. This supports the efficacy of the added architecture, in which the concatenated feature vector also relies on the input viewing direction. However, there is a tipping point beyond which the shortcomings of view dependence start to show.
Shortcomings: I noticed that as I increase the model parameters associated with view dependence, the generalization quality of NeRF appears to decrease. With more capacity for view dependence, the model seems to overfit to the view-dependent effects in the training images. From this experimentation, I concluded that to get the most out of view dependence, its parameters should be kept to a minimum. The reason is that the volumetric geometry carries far richer detail than the view-directional effects, and the parameter budget of the NeRF model should reflect that.
4.3 High Resolution Imagery (10 pts)
The following GIFs were obtained for the changed configurations. The left GIFs show the baseline lego model (without the high-resolution configurations) and the right GIFs show the high-resolution lego models with the changed configurations listed below:
Config 1: n_pts_per_ray: 256
Config 2: n_layers_xyz: 9