Late days used: 1
vis_grid output for xy_grid:
vis_rays output for ray_bundle:
render_points output for point_samples:
Depth output:
Colour output:
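For reference, below is a minimal sketch of the emission-absorption compositing that produces depth and colour maps like the ones above. The function and variable names are my own illustration, not the starter code's API; it assumes densities, colours, inter-sample distances, and per-sample depths have already been computed along each ray.

```python
import torch

def composite(sigmas, rgbs, deltas, depths):
    # sigmas: (n_rays, n_pts, 1) densities at the sampled points
    # rgbs:   (n_rays, n_pts, 3) colours at the sampled points
    # deltas: (n_rays, n_pts, 1) distances between adjacent samples
    # depths: (n_rays, n_pts, 1) depth of each sample along the ray
    alphas = 1.0 - torch.exp(-sigmas * deltas)          # per-sample opacity
    trans = torch.cumprod(1.0 - alphas + 1e-10, dim=1)  # transmittance
    # Shift so trans[i] is the product over samples strictly before i.
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=1)
    weights = alphas * trans                            # (n_rays, n_pts, 1)
    colour = (weights * rgbs).sum(dim=1)                # (n_rays, 3)
    depth = (weights * depths).sum(dim=1)               # (n_rays, 1)
    return colour, depth
```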
Please see the code for the get_random_pixels_from_image method in ray_utils.py for this part (no output figure/visualization was asked for in this part).
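The idea of the method is roughly the following (a hedged sketch; the actual signature and return convention in ray_utils.py may differ):

```python
import torch

def get_random_pixels_from_image(n_pixels, image_size, image):
    # Sample n_pixels random pixel locations in normalized device
    # coordinates [-1, 1]^2, along with the RGB values at those pixels.
    H, W = image_size
    xy = torch.rand(n_pixels, 2) * 2.0 - 1.0             # uniform in [-1, 1]^2
    # Map back to integer pixel indices to look up ground-truth colours.
    cols = ((xy[:, 0] + 1.0) / 2.0 * (W - 1)).long()
    rows = ((xy[:, 1] + 1.0) / 2.0 * (H - 1)).long()
    rgb = image[rows, cols]                              # (n_pixels, 3)
    return xy, rgb
```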
Centre of box: (0.25, 0.25, -0.00)
Side lengths of box: (2.00, 1.50, 1.50)
(rounded to two decimal places)
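The reported numbers were read off the optimized parameters roughly as follows (the attribute names center and side_lengths are my assumption, not necessarily the starter code's):

```python
center = box.center.detach().cpu().flatten().tolist()
sides = box.side_lengths.detach().cpu().flatten().tolist()
print("Centre of box: ({:.2f}, {:.2f}, {:.2f})".format(*center))
print("Side lengths of box: ({:.2f}, {:.2f}, {:.2f})".format(*sides))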
Generated gif:
Original gif provided in the assignment:
Generated gif:
Original gif provided in the assignment:
For the above, I have used the same network architecture as in the NeRF paper with a few changes; please refer to the code for full details of the architecture.
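As a rough picture of the architecture, here is a sketch of a positional-encoding MLP in the spirit of the NeRF paper. The hidden width, frequency count, and head layout are my own assumptions for illustration; the code has the real details.

```python
import torch
import torch.nn as nn

class HarmonicEmbedding(nn.Module):
    # sin/cos positional encoding, as in the NeRF paper
    def __init__(self, n_freqs=6):
        super().__init__()
        self.register_buffer("freqs", 2.0 ** torch.arange(n_freqs) * torch.pi)

    def forward(self, x):
        emb = (x[..., None] * self.freqs).flatten(-2)    # (..., 3 * n_freqs)
        return torch.cat([emb.sin(), emb.cos()], dim=-1)

class NeRFMLP(nn.Module):
    def __init__(self, n_layers=5, hidden=128, n_freqs=6):
        super().__init__()
        self.embed = HarmonicEmbedding(n_freqs)
        in_dim = 2 * 3 * n_freqs
        layers = [nn.Linear(in_dim, hidden), nn.ReLU()]
        for _ in range(n_layers - 1):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        self.trunk = nn.Sequential(*layers)
        self.density_head = nn.Linear(hidden, 1)
        self.color_head = nn.Sequential(nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, points):
        feat = self.trunk(self.embed(points))
        sigma = torch.relu(self.density_head(feat))      # non-negative density
        rgb = self.color_head(feat)                      # colours in [0, 1]
        return sigma, rgb
```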
For this part, I've used a 3-layer initial network (as compared to the 5-layer initial network in part 3 of the question).
Using a 5-layer initial network, I obtain the following results:
There doesn't seem to be a significant difference between the view-dependent and view-independent predictions with my architecture. There are some subtle differences: the view-dependent predictions seem to capture specularity better, while the view-independent predictions appear slightly sharper, but I wouldn't call these observations conclusive. In my opinion, a more detailed analysis with more sample points and other objects is needed before commenting conclusively. (The authors of the paper report a difference in quantitative metrics between the view-dependent and view-independent models; that seems like a good direction.)
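For concreteness, this is roughly how the view-dependent variant conditions colour on the viewing direction while keeping density view-independent, following the NeRF paper's design. It reuses the HarmonicEmbedding sketch above, and the layer sizes are again my assumptions:

```python
import torch
import torch.nn as nn

class ViewDependentColorHead(nn.Module):
    # Density comes from the trunk features alone; only the colour
    # head sees the (embedded) view direction.
    def __init__(self, hidden=128, n_dir_freqs=2):
        super().__init__()
        self.dir_embed = HarmonicEmbedding(n_dir_freqs)  # from the sketch above
        dir_dim = 2 * 3 * n_dir_freqs
        self.net = nn.Sequential(
            nn.Linear(hidden + dir_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, feat, directions):
        d = self.dir_embed(directions)                   # (..., dir_dim)
        return self.net(torch.cat([feat, d], dim=-1))    # (..., 3)
```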
Running the NeRF model with the default nerf_lego_highres.yaml configuration and the same architecture as part 3 yields the following visualization:
For hyperparameter tuning, I tried changing the n_pts_per_ray config parameter from 128 to 64, 256 and 512. With 512 points per ray, the model uses a lot of GPU memory and trains very slowly, so I could train it for only 25 epochs. It yields the following under-trained result:
For 64 n_pts_per_ray, I get the following result:
For 256 n_pts_per_ray, I get the following result:
Note that the visualization is considerably sharper for 256 points per ray than for 64 points per ray, and marginally sharper than the model with 128 points per ray.
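To make the trade-off concrete: n_pts_per_ray controls how many depth samples the stratified sampler draws along each ray, so memory and compute scale roughly linearly with it, while more samples reduce quadrature error in the compositing. A minimal sketch (function name and near/far values are illustrative, not the starter code's):

```python
import torch

def sample_points_along_rays(origins, directions, n_pts_per_ray,
                             near=0.1, far=3.0):
    # origins, directions: (n_rays, 3)
    # Evenly spaced depths, jittered within each bin (stratified sampling).
    t = torch.linspace(near, far, n_pts_per_ray, device=origins.device)
    t = t + (far - near) / n_pts_per_ray * torch.rand_like(t)
    # (n_rays, n_pts_per_ray, 3) sample locations along each ray
    return origins[:, None, :] + t[None, :, None] * directions[:, None, :]
```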
I also tried changing the number of layers in the initial part of the network. Specifically, I changed the number of layers from 5 to 3 and 7.
For 3 layers, I obtain the following result:
For 7 layers, I obtain the following result:
Here too, the results are sharper for the 7-layer network than for the 3-layer network, suggesting that the 7-layer network has more expressive power.
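In terms of the NeRFMLP sketch above, this depth sweep just amounts to instantiating the trunk with different n_layers:

```python
shallow = NeRFMLP(n_layers=3)   # blurrier renderings in my runs
deep = NeRFMLP(n_layers=7)      # sharper, more expressive
```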