The grid/ray visualization is shown below:
The point samples from the first camera are visualized below:
The rendered output (part_1.gif) is shown below:
and the corresponding depth map is shown below:
The code can be found in ray_utils.py.
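For reference, here is a minimal sketch of the ray generation, stratified point sampling, and color/depth compositing steps. It assumes a pinhole camera with intrinsics K and a camera-to-world matrix c2w in an OpenGL-style convention; the function names are illustrative and not necessarily the ones used in ray_utils.py.

```python
import torch

def get_rays(H, W, K, c2w):
    # Pixel grid: j indexes rows (y), i indexes columns (x).
    j, i = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    # Unproject pixel centers to camera-space directions (pinhole model, -z forward).
    dirs = torch.stack([(i - K[0, 2]) / K[0, 0],
                        -(j - K[1, 2]) / K[1, 1],
                        -torch.ones_like(i)], dim=-1)                   # (H, W, 3)
    # Rotate directions into world space; every ray starts at the camera center.
    rays_d = torch.einsum("hwc,dc->hwd", dirs, c2w[:3, :3]).reshape(-1, 3)
    rays_o = c2w[:3, 3].expand_as(rays_d)
    return rays_o, rays_d

def sample_points_along_rays(rays_o, rays_d, near, far, n_samples):
    # Stratified sampling: one jittered sample inside each of n_samples depth bins.
    edges = torch.linspace(near, far, n_samples + 1)
    lower, upper = edges[:-1], edges[1:]
    z_vals = lower + (upper - lower) * torch.rand(rays_o.shape[0], n_samples)
    pts = rays_o[:, None, :] + z_vals[..., None] * rays_d[:, None, :]   # (n_rays, n_samples, 3)
    return pts, z_vals

def composite(sigma, rgb, z_vals):
    # sigma: (n_rays, n_samples, 1); rgb: (n_rays, n_samples, 3); z_vals: (n_rays, n_samples).
    deltas = z_vals[..., 1:] - z_vals[..., :-1]
    deltas = torch.cat([deltas, 1e10 * torch.ones_like(deltas[..., :1])], dim=-1)
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * deltas)                # per-sample opacity
    # Transmittance: probability that the ray reaches each sample unoccluded.
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[..., :1]),
                                     1.0 - alpha + 1e-10], dim=-1), dim=-1)[..., :-1]
    weights = alpha * trans
    color = (weights[..., None] * rgb).sum(dim=-2)                      # rendered RGB
    depth = (weights * z_vals).sum(dim=-1)                              # expected depth
    return color, depth
```

The depth visualization above is simply the expected depth (the weighted sum of sample depths), rendered as a grayscale image.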
The center of the box is (0.25, 0.25, 0.00)
The side lengths of the box are (2.00, 1.50, 1.50)
The gif is shown below:
The result is:
I add view dependence to the NeRF model as described in the NeRF paper: the color depends on both position and viewing direction, while the density depends only on position. The result is shown below:
Adding view dependence increases training time (from ~150s to ~600s) and improves the quality of the output color at each position (although this is not obvious in the low-resolution gif). However, increased view dependence may cause the model to overfit to the view directions of the training images and hurt its ability to generalize to novel viewpoints.
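As a rough illustration of this design (not my exact architecture), the sketch below shows a network head where density is predicted from the position features alone while color also receives the encoded view direction. The layer sizes and encoding dimensions (63 for position with 10 frequencies, 27 for direction with 4 frequencies) are assumptions.

```python
import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    """Toy sketch: view-independent density, view-dependent color."""
    def __init__(self, pos_dim=63, view_dim=27, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(pos_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        # Density branch never sees the view direction.
        self.density = nn.Sequential(nn.Linear(hidden, 1), nn.ReLU())
        # Color branch concatenates position features with the encoded view direction.
        self.color = nn.Sequential(nn.Linear(hidden + view_dim, hidden // 2), nn.ReLU(),
                                   nn.Linear(hidden // 2, 3), nn.Sigmoid())

    def forward(self, pos_enc, view_enc):
        feat = self.trunk(pos_enc)                               # position features
        sigma = self.density(feat)                               # (N, 1), view-independent
        rgb = self.color(torch.cat([feat, view_enc], dim=-1))    # (N, 3), view-dependent
        return sigma, rgb
```

Keeping density independent of the view direction encodes the prior that geometry does not change with viewpoint, which helps limit the overfitting mentioned above.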
I implement a coarse network and a fine network with hierarchical sampling, as proposed in the NeRF paper. The result is shown below:
Hierarchical sampling can improve the rendering quality (although it is not obvious in this low-resolution gif), but it also takes more time to train and render: ~150s without hierarchical sampling versus ~1400s with a coarse and a fine network. So the quality improvement comes at the expense of speed.
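For reference, the fine samples are typically drawn by inverse-transform sampling of the coarse compositing weights along each ray. The sketch below is a simplified version of that step (the name sample_pdf and the exact bin bookkeeping are assumptions): bins holds n_coarse + 1 depth bin edges per ray and weights the coarse weight of each bin.

```python
import torch

def sample_pdf(bins, weights, n_fine, eps=1e-5):
    """Draw n_fine depths per ray by inverse-transform sampling of the coarse weights.
    bins: (n_rays, n_coarse + 1) depth bin edges; weights: (n_rays, n_coarse)."""
    pdf = (weights + eps) / (weights + eps).sum(dim=-1, keepdim=True)
    cdf = torch.cumsum(pdf, dim=-1)
    cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], dim=-1)      # (n_rays, n_coarse + 1)

    # Uniform samples in [0, 1), mapped through the inverse CDF.
    u = torch.rand(weights.shape[0], n_fine, device=weights.device)
    idx = torch.searchsorted(cdf, u, right=True)
    below = (idx - 1).clamp(min=0)
    above = idx.clamp(max=cdf.shape[-1] - 1)

    cdf_below, cdf_above = torch.gather(cdf, -1, below), torch.gather(cdf, -1, above)
    bins_below, bins_above = torch.gather(bins, -1, below), torch.gather(bins, -1, above)

    # Linear interpolation inside the selected bin.
    t = (u - cdf_below) / (cdf_above - cdf_below).clamp(min=eps)
    return bins_below + t * (bins_above - bins_below)                   # fine z-values
```

The fine z-values are then merged and sorted with the coarse ones before being passed through the fine network, so the extra samples concentrate where the coarse network placed high density.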
I render the lego scene with 128, 256, and 512 point samples per ray (top to bottom); the results are shown below. More point samples per ray give better results but require more time (1064s, 1478s, and 2607s, respectively).
I also render the lego scene with 128 and 256 hidden neurons (top to bottom) to test different network capacities; the results are shown below. Larger network capacity can give better results but requires more time (1064s and 1322s, respectively).