Assignment 3

Number of late days used: 0

1.3, 1.4, 1.5 Differentiable Volume Rendering (10 points), Point sampling (10 points) & Volume rendering (30 points)

You can run the code for part 1 with:

python main.py --config-name=box

Visualization

The visualization outputs are, in order: the grid, the rays, the sampled points along the rays, the rendered color (part_1 gif), and the rendered depth.
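For context on what the part 1 renderer computes, here is a minimal sketch of the emission-absorption compositing step, assuming per-ray tensors of densities, colors, and inter-sample distances (the function and argument names are mine, not the starter code's):

```python
import torch

def composite_rays(densities, colors, deltas):
    """Emission-absorption volume rendering along rays.

    densities: (N_rays, N_samples, 1) non-negative sigma values
    colors:    (N_rays, N_samples, 3) per-sample RGB
    deltas:    (N_rays, N_samples, 1) distances between adjacent samples
    """
    # Opacity of each segment: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - torch.exp(-densities * deltas)
    # Transmittance: probability the ray reaches sample i unoccluded.
    # Shift so T_0 = 1 (cumprod of (1 - alpha) over preceding samples).
    trans = torch.cumprod(1.0 - alphas + 1e-10, dim=1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=1)
    weights = alphas * trans                      # (N_rays, N_samples, 1)
    rgb = (weights * colors).sum(dim=1)           # (N_rays, 3)
    return rgb, weights
```

The depth map shown above is produced the same way, by compositing per-sample depths with the same weights instead of colors.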

2.1, 2.2, 2.3 Random ray sampling (5 points), Loss and training (5 points), Visualization

Run the following for part 2:

python main.py --config-name=train_box

Spiral Rendering of Part 2

The center and side lengths of the box after training, rounded to the nearest hundredth:

Box center: (0.25, 0.25, 0.00)
Box side lengths: (2.00, 1.50, 1.50)
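The training in this part boils down to sampling a random batch of rays each iteration and minimizing an MSE photometric loss against the ground-truth pixels. A rough sketch, with illustrative names (sample_random_rays, renderer, and the shapes are my assumptions, not the starter-code API):

```python
import torch

def sample_random_rays(origins, directions, gt_rgb, n_rays):
    """Pick a random batch of rays (and the matching pixels) from one image.

    origins, directions: (H*W, 3) per-pixel ray origins / directions
    gt_rgb:              (H*W, 3) ground-truth pixel colors
    """
    idx = torch.randperm(origins.shape[0])[:n_rays]
    return origins[idx], directions[idx], gt_rgb[idx]

# One training step (renderer and optimizer are placeholders):
# o, d, target = sample_random_rays(origins, directions, pixels, n_rays=1024)
# pred_rgb = renderer(o, d)                    # volume-rendered colors
# loss = torch.mean((pred_rgb - target) ** 2)  # MSE photometric loss
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```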

3. Optimizing a Neural Radiance Field (NeRF) (30 points)

You can train a NeRF on the lego bulldozer dataset with

python main.py --config-name=nerf_lego

and

python main.py --config-name=nerf_lego_highres

for low and high resolution, respectively.

NOTE: I have added an additional parameter, "use_direction", to the config file within the implicit_function block. It is set to False for this question since we are not using ray direction embeddings.
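For reference, a sketch of where the flag sits in the config (the surrounding keys are abbreviated, and the exact layout of the starter config may differ):

```yaml
implicit_function:
  # ... other NeRF MLP settings ...
  use_direction: False  # added flag; no ray direction embedding for part 3
```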

Result:

Spiral Rendering of Part 3

Spiral Rendering of Part 3 High res

4.1 View Dependence (10 pts)

Add view dependence to your NeRF model.

You can train a NeRF on the lego bulldozer dataset with "use_direction" set to True in the config files.

python main.py --config-name=nerf_lego

and

python main.py --config-name=nerf_lego_highres

for low and high resolution, respectively.

Result:

Spiral Rendering of Part 4.1

Spiral Rendering of Part 4.1 High res

Though perhaps not very evident for this example at this resolution and with the number of epochs trained, adding view dependence results in a more realistic rendering, e.g. the specular details on the bulldozer's tyre tread, siren, etc., as shown in these comparison frames (L: no view dependence, R: with view dependence).

Comparison frames, low res

Comparison frames, high res

Adding view dependence makes the output look more realistic by accounting for the view-dependent appearance of the object being rendered. This is done by including ray direction embeddings in the color prediction flow, as explained in the base paper: the network is restricted to predict the volume density as a function of position alone, while the color is predicted as a function of both position and viewing direction. That said, had we increased the ray direction dependence by a large factor, it would have produced strange artefacts in the rendered output, especially in views not seen at training time, because of overfitting. So while view dependence to a certain extent is a clear plus, increasing its weight beyond a point decreases the model's generalisation capabilities.
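Concretely, the split described above can look like the following head on top of the positional MLP features (a sketch with illustrative names; dir_embed_dim=24 assumes 4 harmonic functions on the 3D direction, i.e. 3 × 2 × 4):

```python
import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    """NeRF-style split: density from position features only,
    color from position features plus the embedded view direction."""

    def __init__(self, feat_dim=256, dir_embed_dim=24, hidden_dir=128):
        super().__init__()
        self.density = nn.Linear(feat_dim, 1)          # sigma(x)
        self.color = nn.Sequential(                    # c(x, d)
            nn.Linear(feat_dim + dir_embed_dim, hidden_dir),
            nn.ReLU(),
            nn.Linear(hidden_dir, 3),
            nn.Sigmoid(),
        )

    def forward(self, xyz_features, dir_embedding):
        sigma = torch.relu(self.density(xyz_features))
        rgb = self.color(torch.cat([xyz_features, dir_embedding], dim=-1))
        return sigma, rgb
```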

4.2 Hierarchical Sampling (10 pts)

Implemented the hierarchical sampling sampler class in sampler.py.
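The core of the sampler is inverse-transform sampling: the coarse pass's weights define a piecewise-constant PDF over depth bins, and fine-pass depths are drawn from it. A sketch of that step (sample_pdf and its shapes are my naming, not the exact class interface):

```python
import torch

def sample_pdf(bins, weights, n_fine):
    """Draw n_fine depths per ray from the piecewise-constant PDF
    defined by coarse weights over depth bins.

    bins:    (N_rays, N_bins + 1) bin edges along each ray
    weights: (N_rays, N_bins) coarse rendering weights
    """
    pdf = weights / (weights.sum(dim=-1, keepdim=True) + 1e-10)
    cdf = torch.cumsum(pdf, dim=-1)
    cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], dim=-1)

    u = torch.rand(cdf.shape[0], n_fine, device=cdf.device)
    # For each u, find the bin whose CDF interval contains it ...
    idx = torch.searchsorted(cdf, u, right=True).clamp(1, cdf.shape[-1] - 1)
    cdf_lo = torch.gather(cdf, -1, idx - 1)
    cdf_hi = torch.gather(cdf, -1, idx)
    bin_lo = torch.gather(bins, -1, idx - 1)
    bin_hi = torch.gather(bins, -1, idx)
    # ... and invert the CDF linearly within that bin.
    t = (u - cdf_lo) / (cdf_hi - cdf_lo + 1e-10)
    return bin_lo + t * (bin_hi - bin_lo)
```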

4.3 High Resolution Imagery (10 pts)

Run NeRF on high-res imagery using the nerf_lego_highres.yaml config.

In addition to this, I also experimented with the architecture and found the following set of values to give the best results for me (given the limited number of experiments carried out and the compute resources available to me):

n_harmonic_functions_xyz: 10
n_harmonic_functions_dir: 4
n_hidden_neurons_xyz: 256
n_hidden_neurons_dir: 128
density_noise_std: 0.0
n_layers_xyz: 8
append_xyz: [4]
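For reference, append_xyz: [4] corresponds to the skip connection in the original NeRF MLP: the positional embedding is re-concatenated to the hidden features at layer 4 of the 8-layer trunk. A minimal sketch of that trunk (class and argument names are illustrative):

```python
import torch
import torch.nn as nn

class MLPWithSkips(nn.Module):
    """n_layers linear layers of a given width, with the positional
    embedding re-concatenated at the layers listed in skips."""

    def __init__(self, in_dim, hidden=256, n_layers=8, skips=(4,)):
        super().__init__()
        self.skips = set(skips)
        layers = []
        for i in range(n_layers):
            d_in = in_dim if i == 0 else hidden
            if i in self.skips and i > 0:
                d_in += in_dim  # skip connection re-injects the embedding
            layers.append(nn.Linear(d_in, hidden))
        self.layers = nn.ModuleList(layers)

    def forward(self, x_embed):
        h = x_embed
        for i, layer in enumerate(self.layers):
            if i in self.skips and i > 0:
                h = torch.cat([h, x_embed], dim=-1)
            h = torch.relu(layer(h))
        return h

# e.g. in_dim = 3 * 2 * 10 = 60 for n_harmonic_functions_xyz = 10
```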