Late Days

Assignment 3

Name: Ayush Pandey    Andrew ID: ayushp

1. Differentiable Volume Rendering

Ray sampling (10 points)

Take a look at the render_images function in main.py. It loops through a set of cameras, generates rays for each pixel on a camera, and renders these rays using a Model instance.

Implementation

Your first task is to implement:

  1. get_pixels_from_image in ray_utils.py and
  2. get_rays_from_pixels in ray_utils.py

which are used in render_images:

xy_grid = get_pixels_from_image(image_size, camera)
ray_bundle = get_rays_from_pixels(xy_grid, camera)
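
A minimal sketch of how these two helpers could look, assuming the camera behaves like a PyTorch3D CamerasBase (unproject_points, get_camera_center), pixel coordinates live in PyTorch3D's NDC space, and image_size is (W, H); packing the result into the assignment's RayBundle is left out:

    import torch

    def get_pixels_from_image(image_size, camera):
        # (H*W, 2) grid of pixel centers in NDC, one row per pixel.
        # PyTorch3D's NDC convention has +x pointing left and +y pointing up,
        # hence the 1 -> -1 ordering.
        W, H = image_size
        x = torch.linspace(1, -1, W)
        y = torch.linspace(1, -1, H)
        grid_y, grid_x = torch.meshgrid(y, x, indexing="ij")
        return torch.stack([grid_x, grid_y], dim=-1).reshape(-1, 2)

    def get_rays_from_pixels(xy_grid, camera):
        # Unproject every pixel at unit depth into world space, then form rays
        # from the camera center through each unprojected point.
        xy_depth = torch.cat([xy_grid, torch.ones_like(xy_grid[..., :1])], dim=-1)
        world_points = camera.unproject_points(xy_depth, world_coordinates=True).reshape(-1, 3)
        origins = camera.get_camera_center().expand(world_points.shape[0], -1)
        directions = torch.nn.functional.normalize(world_points - origins, dim=-1)
        # Wrap origins and directions into the assignment's RayBundle before returning.
        return origins, directions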

Visualization

You can run the code with:

python main.py --config-name=box

It will save two images, q1_3_1.png and q1_3_2.png, in the submission_outputs folder.

Grid Rays

Point sampling (10 points)

Implementation

Your next task is to fill out StratifiedSampler in sampler.py. Implement the forward method, which:

  1. Generates a set of distances between the near and far planes,
  2. Uses these distances to sample points offset from ray origins (RayBundle.origins) along ray directions (RayBundle.directions), and
  3. Stores the distances and sample points in RayBundle.sample_lengths and RayBundle.sample_points, respectively (a sketch follows this list).
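
A minimal sketch of this forward pass, assuming the constructor arguments shown here and a namedtuple-style _replace on RayBundle; the per-bin jitter that makes the sampling truly stratified is noted in a comment but omitted:

    import torch

    class StratifiedSampler(torch.nn.Module):
        def __init__(self, n_pts_per_ray, min_depth, max_depth):
            super().__init__()
            self.n_pts_per_ray = n_pts_per_ray
            self.min_depth = min_depth
            self.max_depth = max_depth

        def forward(self, ray_bundle):
            n_rays = ray_bundle.origins.shape[0]
            device = ray_bundle.origins.device

            # 1. Depths between near and far, shared across rays. True stratified
            #    sampling would add uniform jitter within each depth bin.
            z_vals = torch.linspace(
                self.min_depth, self.max_depth, self.n_pts_per_ray, device=device
            )
            sample_lengths = z_vals.view(1, -1, 1).expand(n_rays, -1, 1)

            # 2. Points along each ray: x = o + t * d, shape (n_rays, n_pts, 3).
            sample_points = (
                ray_bundle.origins[:, None, :]
                + sample_lengths * ray_bundle.directions[:, None, :]
            )

            # 3. Write both back into the bundle (assumes namedtuple-style _replace).
            return ray_bundle._replace(
                sample_points=sample_points,
                sample_lengths=sample_lengths,
            )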

Visualization

You can run the code with:

python main.py --config-name=box

It will save an image named q1_4.png in the submission_outputs folder.

Sample points

Volume rendering (30 points)

Finally, we can implement volume rendering! With the configs/box.yaml configuration, we provide you with an SDFVolume instance describing a box. You can check out the code for this function in implicit.py, which converts a signed distance function into a volume. If you want, you can even implement your own SDFVolume classes by creating a new signed distance function class and adding it to sdf_dict in implicit.py. Take a look at this great web page for formulas for some simple/complex SDFs.
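
For illustration only, a new signed distance function could look like the sphere below; the constructor interface and the registry key are assumptions, and implicit.py may already ship something equivalent:

    import torch

    class SphereSDF(torch.nn.Module):
        # Signed distance to a sphere: f(x) = ||x - c|| - r (negative inside).
        def __init__(self, center=(0.0, 0.0, 0.0), radius=1.0):
            super().__init__()
            self.register_buffer("center", torch.tensor(center))
            self.radius = radius

        def forward(self, points):
            return (
                torch.linalg.norm(points - self.center, dim=-1, keepdim=True)
                - self.radius
            )

    # Hypothetical registration alongside the provided SDFs:
    # sdf_dict["sphere"] = SphereSDF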

Implementation

You will implement

  1. VolumeRenderer._compute_weights and
  2. VolumeRenderer._aggregate.
  3. You will also modify the VolumeRenderer.forward method to render a depth map in addition to color from a volume (a sketch of the first two methods follows this list).
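
A sketch of the first two methods under the standard discrete volume-rendering model, where alpha_i = 1 - exp(-sigma_i * delta_i) and the transmittance T_i is the product of (1 - alpha_j) over the samples in front of sample i; argument names are illustrative and should be matched to the actual signatures:

    import torch

    def _compute_weights(self, deltas, rays_density, eps=1e-10):
        # Per-sample opacity: alpha_i = 1 - exp(-sigma_i * delta_i).
        alpha = 1.0 - torch.exp(-rays_density * deltas)           # (n_rays, n_pts, 1)

        # Transmittance T_i = prod_{j < i} (1 - alpha_j): cumulative product,
        # shifted by one so the first sample sees T_1 = 1.
        trans = torch.cumprod(1.0 - alpha + eps, dim=1)
        trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=1)

        return alpha * trans                                       # weights w_i

    def _aggregate(self, weights, rays_feature):
        # Weighted sum over the samples of each ray, e.g. color = sum_i w_i * c_i.
        # Calling this with the sample depths instead of colors gives the depth map.
        return torch.sum(weights * rays_feature, dim=1)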

Visualization

You can run the code with:

python main.py --config-name=box

It will generate part_1.gif and depth.png in the submission_outputs folder.

Spiral Rendering of Part 1

2. Optimizing a basic implicit volume

Box center: (0.25661855936050415, 0.25521400570869446, 0.19203484058380127)
Box side lengths: (1.9380567073822021, 1.4644564390182495, 1.9106501340866089)

Visualization

You can run the code with:

python main.py --config-name=train_box

The code renders a spiral sequence of the optimized volume in submission_outputs/part_2.gif.

Spiral Rendering of Part 2

3. Optimizing a Neural Radiance Field (NeRF) (30 points)

In this part, you will implement an implicit volume as a Multi-Layer Perceptron (MLP) in the NeuralRadianceField class in implicit.py. This MLP should map a 3D position to volume density and color. Specifically:

  1. Your MLP should take in a RayBundle object in its forward method, and produce color and density for each sample point in the RayBundle.
  2. You should also fill out the loss in train_nerf in the main.py file.
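
A minimal sketch of the idea: harmonically embed the sample positions, run them through an MLP, and read off a non-negative density and an RGB color per sample. The layer sizes, the hand-rolled embedding, and the returned dictionary keys are assumptions, not the required interface:

    import torch

    class NeuralRadianceField(torch.nn.Module):
        def __init__(self, n_harmonics=6, hidden_dim=128):
            super().__init__()
            self.n_harmonics = n_harmonics
            in_dim = 3 * 2 * n_harmonics                      # sin + cos per frequency
            self.mlp = torch.nn.Sequential(
                torch.nn.Linear(in_dim, hidden_dim), torch.nn.ReLU(),
                torch.nn.Linear(hidden_dim, hidden_dim), torch.nn.ReLU(),
            )
            self.density_head = torch.nn.Linear(hidden_dim, 1)
            self.color_head = torch.nn.Sequential(
                torch.nn.Linear(hidden_dim, 3), torch.nn.Sigmoid()
            )

        def _embed(self, x):
            # sin/cos features of xyz at octave frequencies.
            freqs = 2.0 ** torch.arange(self.n_harmonics, device=x.device)
            angles = x[..., None] * freqs                     # (..., 3, n_harmonics)
            return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1).flatten(-2)

        def forward(self, ray_bundle):
            feats = self.mlp(self._embed(ray_bundle.sample_points))
            density = torch.relu(self.density_head(feats))    # non-negative sigma
            color = self.color_head(feats)                     # RGB in [0, 1]
            return {"density": density, "color": color}

For the loss in train_nerf, the standard NeRF choice is a plain mean-squared error between the rendered ray colors and the corresponding ground-truth pixel colors, e.g. torch.nn.functional.mse_loss(rendered, gt).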

Visualization

You can train a NeRF on the lego bulldozer dataset with

python main.py --config-name=nerf_lego

It will train a NeRF for 250 epochs on 128x128 images. After training, a spiral rendering will be written to submission_outputs/part_3.gif.

Spiral Rendering of Part 3

4. NeRF Extras (Choose at least one! More than one is extra credit)

4.1 View Dependence (10 pts)

Add view dependence to your NeRF model! Specifically, make it so that emission can vary with viewing direction. You can refer to the NeRF paper or other papers for how to do this effectively; if you're not careful, your network may overfit to the training images. Discuss the trade-offs between increased view dependence and generalization quality.

You can run the code with:

python main.py --config-name=nerf_lego_view

It will save the view-dependent output to submission_outputs/part_4_1.gif.

Answer: As Professor Shubham mentioned in lecture, feeding omega (the viewing direction) into the network only late adds an inductive bias that color should vary only mildly with view. We cannot feed omega in before density is predicted, since density should not depend on viewing direction. Even setting density aside, we do not want the network to overfit to the viewing direction: because the color at a point varies only slightly with view, injecting omega early would let the network cheat (overfit) using it. By feeding omega in just a couple of layers before the color prediction, we ensure the network uses the view information only lightly and generalizes better across viewing directions.
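
A sketch of that late-injection pattern: the harmonically embedded viewing direction is concatenated with the geometry feature only inside a small color head, so density and most of the feature stay view-independent (names and sizes here are illustrative):

    import torch

    class ViewDependentColorHead(torch.nn.Module):
        # Color head that sees the viewing direction only at the very end.
        def __init__(self, feat_dim=128, dir_dim=24, hidden_dim=64):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(feat_dim + dir_dim, hidden_dim), torch.nn.ReLU(),
                torch.nn.Linear(hidden_dim, 3), torch.nn.Sigmoid(),
            )

        def forward(self, features, embedded_dirs):
            # features: (n_rays, n_pts, feat_dim); one embedded direction per ray,
            # broadcast to every sample along that ray.
            n_pts = features.shape[1]
            dirs = embedded_dirs[:, None, :].expand(-1, n_pts, -1)
            return self.net(torch.cat([features, dirs], dim=-1))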

Spiral Rendering of Part 4.1

4.2 Hierarchical Sampling (10 pts)

NeRF employs two networks: a coarse network and a fine network. During the coarse pass, it uses the coarse network to get an estimate of geometry, and during the fine pass it uses these geometry estimates for better point sampling with the fine network. Implement this hierarchical point-sampling strategy and discuss the trade-offs (speed / quality).
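
The core of the fine pass is re-sampling depths from the coarse weights by inverse-transform sampling, in the spirit of the sample_pdf routine from the original NeRF code. This sketch assumes per-ray bin centers and weights of shape (n_rays, n_bins):

    import torch

    def sample_pdf(bin_centers, weights, n_fine, eps=1e-5):
        # Normalize the coarse weights into a PDF over depth bins, then a CDF.
        pdf = (weights + eps) / torch.sum(weights + eps, dim=-1, keepdim=True)
        cdf = torch.cumsum(pdf, dim=-1)
        cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], dim=-1)  # (n_rays, n_bins + 1)

        # Uniform samples in [0, 1), inverted through the CDF.
        u = torch.rand(*cdf.shape[:-1], n_fine, device=cdf.device)
        idx = torch.searchsorted(cdf, u, right=True).clamp(1, cdf.shape[-1] - 1)

        # Linearly interpolate a depth within the selected bin.
        cdf_lo = torch.gather(cdf, -1, idx - 1)
        cdf_hi = torch.gather(cdf, -1, idx)
        bins_lo = torch.gather(bin_centers, -1, idx - 1)
        bins_hi = torch.gather(bin_centers, -1, idx.clamp(max=bin_centers.shape[-1] - 1))
        t = (u - cdf_lo) / (cdf_hi - cdf_lo + eps)
        return bins_lo + t * (bins_hi - bins_lo)

The fine network is then evaluated at the union of the coarse depths and these newly drawn depths. The trade-off is roughly double the rendering cost per iteration in exchange for samples concentrated near surfaces, which typically improves quality at a fixed total sample budget.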

4.3 High Resolution Imagery (10 pts)

Run NeRF on high-resolution imagery using the nerf_lego_highres.yaml config file. This will take a long time to train; play around with some hyperparameters (point samples per ray, network capacity) and report your results.

You can run the code with:

python main.py --config-name=nerf_lego_highres

It will save the high-resolution gif to submission_outputs/part_4_3.gif.

I increased the network depth to match the network described in the slides and increased the number of samples per ray to 256. I reduced the chunk size to 16388 to fit within GPU memory.

Spiral Rendering of Part 4.3