Assignment 2: Learning for 3D Vision

Abhinav Agarwalla (AndrewID: aa4)

Q1.3

xy grid rays

Q1.4

sampling visualization

Q1.5

Box rendering Depth

Q2.1

Code implementation only

Q2.2

Quantity            Measurement
Box Center          (0.25, 0.25, 0.00)
Box Side Lengths    (2.00, 1.50, 1.50)
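
As a quick sanity check on these numbers, the sketch below converts the reported center and side lengths into minimum and maximum corners. It assumes the box is axis-aligned (the report does not state this), and the helper name `box_corners` is hypothetical. Because the measurements are rounded to two decimals, the corners are approximate as well.

```python
# Measurements copied from the table above (rounded to two decimals).
center = (0.25, 0.25, 0.00)
side_lengths = (2.00, 1.50, 1.50)


def box_corners(center, side_lengths):
    """Return (min_corner, max_corner) of an axis-aligned box.

    Assumption: the box is axis-aligned, so each corner is center -/+ half
    the side length along every axis.
    """
    half = [s / 2.0 for s in side_lengths]
    min_corner = tuple(c - h for c, h in zip(center, half))
    max_corner = tuple(c + h for c, h in zip(center, half))
    return min_corner, max_corner


min_corner, max_corner = box_corners(center, side_lengths)
print(min_corner)  # (-0.75, -0.5, -0.75)
print(max_corner)  # (1.25, 1.0, 0.75)
```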

Q2.3

Q3 NeRF

NeRF rendering without view dependence

Q4.1 View Dependence

In this section, we add view dependence to the NeRF MLP, using the config named `nerf_lego_view_dir.yaml`.

[Figure panels: w/ view dir, w/o view dir]

We visualize the pixel-wise difference in the table below.

[Figure panels: w/ view dir, w/o view dir, diff]
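
One way such a difference image could be computed is sketched below. It assumes the per-channel absolute difference between the two renders; the report does not say which difference measure was used. Images are plain nested lists `[row][col][channel]` with values in [0, 1], and the function names are hypothetical.

```python
def pixelwise_abs_diff(img_a, img_b):
    """Per-pixel, per-channel absolute difference of two same-sized images."""
    same_height = len(img_a) == len(img_b)
    same_width = all(len(ra) == len(rb) for ra, rb in zip(img_a, img_b))
    if not (same_height and same_width):
        raise ValueError("images must have the same height and width")
    return [
        [[abs(ca - cb) for ca, cb in zip(pa, pb)] for pa, pb in zip(ra, rb)]
        for ra, rb in zip(img_a, img_b)
    ]


def mean_abs_diff(diff):
    """Average of all channel values in a difference image."""
    values = [c for row in diff for pixel in row for c in pixel]
    return sum(values) / len(values)


# Toy 1x2 RGB images standing in for the two renders.
with_view_dir = [[[0.5, 0.5, 0.5], [1.0, 0.0, 0.0]]]
without_view_dir = [[[0.5, 0.5, 0.5], [0.5, 0.0, 0.0]]]

diff = pixelwise_abs_diff(with_view_dir, without_view_dir)
score = mean_abs_diff(diff)
print(diff)  # only the red channel of the second pixel differs
```

A single scalar such as `score` is convenient for comparing configurations, while the full `diff` image shows where the two renders disagree.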

Trade-offs: Intuitively, stronger view dependence requires more training images. Without enough images from different viewpoints to regularize it, a strongly view-dependent network overfits to the captured views and generalizes poorly to novel views. For this reason, only a small additional 1-2 layer MLP is trained when view dependence is enabled.
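
The structure described above can be illustrated with a minimal sketch: density comes from a position-only trunk, and the viewing direction enters only a small color head. This is not the assignment's implementation. The class name, layer sizes, and random weights are assumptions, and positional encoding is omitted for brevity.

```python
import math
import random


def linear(n_in, n_out, rng):
    """Random weight matrix for an n_in -> n_out layer (no bias, for brevity)."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, x, activation):
    """Matrix-vector product followed by an element-wise activation."""
    return [activation(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class TinyViewDependentField:
    """Hypothetical toy field: position-only trunk, small view-dependent color head."""

    def __init__(self, hidden=16, seed=0):
        rng = random.Random(seed)
        self.trunk = [linear(3, hidden, rng), linear(hidden, hidden, rng)]
        self.density_head = linear(hidden, 1, rng)
        # The view direction is concatenated only here, in a 2-layer head.
        self.color_head = [linear(hidden + 3, hidden // 2, rng),
                           linear(hidden // 2, 3, rng)]

    def __call__(self, position, view_dir):
        h = list(position)
        for layer in self.trunk:
            h = apply_layer(layer, h, relu)
        density = apply_layer(self.density_head, h, relu)[0]
        c = apply_layer(self.color_head[0], h + list(view_dir), relu)
        rgb = apply_layer(self.color_head[1], c, sigmoid)
        return density, rgb


field = TinyViewDependentField()
point = (0.1, -0.2, 0.3)
sigma_a, rgb_a = field(point, (0.0, 0.0, 1.0))
sigma_b, rgb_b = field(point, (1.0, 0.0, 0.0))
print(sigma_a == sigma_b)  # True: density never sees the view direction
```

Keeping the direction out of the trunk means geometry (density) cannot change with the viewpoint, which limits how much the model can overfit to individual training views.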

Q4.2 Hierarchical Sampling

As expected, using a hierarchical sampling strategy increases the run time of NeRF. We report the speed as the number of iterations per second in the table below. We observe that hierarchical sampling decreases speed by roughly 2.6x (20.12 / 7.74).

Uniform Sampling    Hierarchical Sampling
20.12 it/s          7.74 it/s

In terms of quality, hierarchical sampling yields finer-grained features and more color variation. In certain views the edges are clearer, and there is more variation in the brown lines on the platform that the lego toy sits on.

[Figure panels: Uniform, Hierarchical]
[Figure panels: Uniform, Hierarchical, diff]
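
For reference, the fine-sampling step of a hierarchical strategy can be sketched as inverse transform sampling from the piecewise-constant distribution defined by the coarse pass weights. This is a minimal illustration, not the code used for the results above. The function name, the example weights, and the `eps` padding are assumptions; in NeRF the weights would come from the coarse network's volume rendering weights along the ray.

```python
import bisect
import random


def sample_fine_depths(bin_edges, weights, n_fine, rng, eps=1e-5):
    """Draw n_fine depths from the piecewise-constant PDF given by `weights`.

    bin_edges: increasing depths, one more entry than `weights`.
    weights:   non-negative coarse weights, one per bin.
    """
    if len(bin_edges) != len(weights) + 1:
        raise ValueError("need exactly one more bin edge than weights")
    padded = [w + eps for w in weights]  # keep every bin's probability positive
    total = sum(padded)
    cdf = [0.0]
    for w in padded:
        cdf.append(cdf[-1] + w / total)

    depths = []
    for _ in range(n_fine):
        u = rng.random()
        # Locate the bin whose CDF interval contains u, then interpolate inside it.
        i = min(max(bisect.bisect_right(cdf, u) - 1, 0), len(weights) - 1)
        frac = (u - cdf[i]) / (cdf[i + 1] - cdf[i])
        frac = min(max(frac, 0.0), 1.0)
        depths.append(bin_edges[i] + frac * (bin_edges[i + 1] - bin_edges[i]))
    return sorted(depths)


# Toy ray: most of the coarse weight sits in the bin covering depths 4 to 5.
bin_edges = [2.0, 3.0, 4.0, 5.0, 6.0]
weights = [0.0, 0.05, 0.9, 0.05]
fine = sample_fine_depths(bin_edges, weights, 1000, random.Random(0))
in_peak = sum(1 for d in fine if 4.0 <= d <= 5.0)
print(in_peak)  # the large majority of samples land in the high-weight bin
```

The extra samples concentrate where the coarse pass found density, which is consistent with the sharper edges noted above. The price is a second set of network evaluations per ray, consistent with the lower iteration rate reported in the table.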

Q4.3 High Resolution Imagery

For the high resolution imagery, a simplified model was used due to time constraints. High resolution generation takes a long time to train, with rendering being the most time-consuming component. Overall, we begin to see high frequency details being captured (such as the small squares on the lego board) as we use high resolution images with hierarchical sampling and increase the network capacity and the number of sampled points along a ray.

Parameters Obtained Result
Default, w/o view-dependence, uniform
Default, w/o view-dependence, hierarchical
Default, w/o view-dependence, hierarchical, 196 sampling points, hidden dim=256

Another interesting, if off-topic, observation is that a simplified model with 6 layers produces some reflection-like artifacts at the bottom of the image. These artifacts are not present with the 8-layer model or with hierarchical sampling.