Assignment 3
Name: Ayush Pandey, Andrew ID: ayushp
1. Differentiable Volume Rendering
Ray sampling (10 points)
Take a look at the render_images function in main.py. It loops through a set of cameras, generates rays for each pixel on a camera, and renders these rays using a Model instance.
Implementation
Your first task is to implement:
- get_pixels_from_image in ray_utils.py, and
- get_rays_from_pixels in ray_utils.py,
which are used in render_images:
xy_grid = get_pixels_from_image(image_size, camera)
ray_bundle = get_rays_from_pixels(xy_grid, camera)
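A minimal sketch of the two functions, assuming a PyTorch3D-style camera (its unproject_points and get_camera_center methods are used; the exact NDC convention and the (W, H) ordering of image_size are assumptions, and a plain tuple stands in for the assignment's RayBundle):
```python
import torch

def get_pixels_from_image(image_size, camera):
    # Sketch only: pixel-center coordinates in PyTorch3D's NDC convention
    # (+X left, +Y up), assuming image_size = (W, H).
    W, H = image_size
    x = 1.0 - 2.0 * (torch.arange(W, dtype=torch.float32) + 0.5) / W
    y = 1.0 - 2.0 * (torch.arange(H, dtype=torch.float32) + 0.5) / H
    grid_y, grid_x = torch.meshgrid(y, x, indexing="ij")
    return torch.stack([grid_x, grid_y], dim=-1).reshape(-1, 2)  # (H*W, 2)

def get_rays_from_pixels(xy_grid, camera):
    # Unproject each NDC pixel at unit depth to world space, then form rays
    # from the camera center through the unprojected points.
    xy_depth = torch.cat([xy_grid, torch.ones_like(xy_grid[..., :1])], dim=-1)
    world_points = camera.unproject_points(xy_depth, world_coordinates=True)
    origins = camera.get_camera_center().expand(world_points.shape)
    directions = torch.nn.functional.normalize(world_points - origins, dim=-1)
    # The real function wraps these in the assignment's RayBundle; a tuple is
    # returned here to keep the sketch self-contained.
    return origins, directions
```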
Visualization
You can run the code with:
python main.py --config-name=box
It will save two images, q1_3_1.png and q1_3_2.png, in the submission_outputs folder.
Point sampling (10 points)
Implementation
Your next task is to fill out StratifiedSampler in sampler.py. Implement the forward method, which:
- Generates a set of distances between near and far,
- Uses these distances to sample points offset from ray origins (RayBundle.origins) along ray directions (RayBundle.directions), and
- Stores the distances and sample points in RayBundle.sample_lengths and RayBundle.sample_points (see the sketch below).
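A minimal sketch of the sampling math, assuming equally spaced depths between near and far (the function and argument names here are illustrative, not the assignment's):
```python
import torch

def stratified_sample(origins, directions, near, far, n_pts_per_ray):
    # Evenly spaced depths between near and far, shared across all rays.
    t = torch.linspace(near, far, n_pts_per_ray, device=origins.device)
    lengths = t.view(1, -1, 1).expand(origins.shape[0], -1, 1)         # (n_rays, n_pts, 1)
    # For true stratified sampling, jitter each depth uniformly within its bin:
    # lengths = lengths + torch.rand_like(lengths) * (far - near) / n_pts_per_ray
    # Points along each ray: x = o + t * d.
    points = origins.unsqueeze(1) + lengths * directions.unsqueeze(1)  # (n_rays, n_pts, 3)
    return lengths, points
```
The assignment's StratifiedSampler.forward would then write these results back into the RayBundle's sample_lengths and sample_points fields.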
Visualization
You can run the code with:
python main.py --config-name=box
It will save an image named q1_4.png in the submission_outputs folder.
Volume rendering (30 points)
Finally, we can implement volume rendering! With the configs/box.yaml configuration, we provide you with an SDFVolume instance describing a box. You can check out the code for this function in implicit.py, which converts a signed distance function into a volume. If you want, you can even implement your own SDFVolume classes by creating a new signed distance function class and adding it to sdf_dict in implicit.py. Take a look at this great web page for formulas for some simple/complex SDFs.
Implementation
You will implement:
- VolumeRenderer._compute_weights, and
- VolumeRenderer._aggregate.
- You will also modify the VolumeRenderer.forward method to render a depth map in addition to color from a volume (see the sketch below).
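For reference, a sketch of the standard emission-absorption math these methods implement (tensor shapes and function names are assumptions; the assignment's private methods may differ in signature):
```python
import torch

def compute_weights(deltas, densities, eps=1e-10):
    # w_i = T_i * (1 - exp(-sigma_i * delta_i)), with transmittance
    # T_i = prod_{j < i} exp(-sigma_j * delta_j). Shapes: (n_rays, n_pts, 1).
    alpha = 1.0 - torch.exp(-densities * deltas)
    transmittance = torch.cumprod(1.0 - alpha + eps, dim=-2)
    transmittance = torch.cat(
        [torch.ones_like(transmittance[..., :1, :]), transmittance[..., :-1, :]],
        dim=-2,
    )
    return alpha * transmittance

def aggregate(weights, features):
    # Weighted sum of per-point features along each ray: pass per-sample colors
    # to get the RGB image, or per-sample depths to get the depth map.
    return torch.sum(weights * features, dim=-2)
```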
Visualization
You can run the code with:
python main.py --config-name=box
It will generate part_1.gif and depth.png in the submission_outputs folder.
2. Optimizing a basic implicit volume
Box center: (0.25661855936050415, 0.25521400570869446, 0.19203484058380127)
Box side lengths: (1.9380567073822021, 1.4644564390182495, 1.9106501340866089)
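The values above come from optimizing the box parameters against the training images; a minimal sketch of a photometric loss that could drive this optimization (the use of an MSE here is an assumption, and all names are illustrative):
```python
import torch

def photometric_loss(rendered_colors, gt_colors):
    # Mean-squared error between the colors rendered for a batch of rays and
    # the corresponding ground-truth pixel colors; the box center and side
    # lengths are updated by backpropagating through the renderer.
    return torch.nn.functional.mse_loss(rendered_colors, gt_colors)
```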
Visualization
You can run the code with:
python main.py --config-name=train_box
The code renders a spiral sequence of the optimized volume in submission_outputs/part_2.gif.
3. Optimizing a Neural Radiance Field (NeRF) (30 points)
In this part, you will implement an implicit volume as a Multi-Layer Perceptron (MLP) in the NeuralRadianceField class in implicit.py. This MLP should map a 3D position to volume density and color. Specifically:
- Your MLP should take in a RayBundle object in its forward method and produce color and density for each sample point in the RayBundle (a rough architecture sketch follows below).
- You should also fill out the loss in train_nerf in the main.py file.
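A rough sketch of such an MLP with a sinusoidal positional encoding (layer widths, the number of harmonic frequencies, and the returned dictionary keys are assumptions, not the assignment's exact architecture or interface):
```python
import torch

class NeuralRadianceFieldSketch(torch.nn.Module):
    def __init__(self, n_harmonic_functions=6, hidden_dim=128):
        super().__init__()
        # Frequencies for a sinusoidal positional encoding of xyz.
        self.register_buffer("freqs", 2.0 ** torch.arange(n_harmonic_functions))
        in_dim = 3 * 2 * n_harmonic_functions
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(in_dim, hidden_dim), torch.nn.ReLU(),
            torch.nn.Linear(hidden_dim, hidden_dim), torch.nn.ReLU(),
        )
        self.density_head = torch.nn.Linear(hidden_dim, 1)
        self.color_head = torch.nn.Sequential(
            torch.nn.Linear(hidden_dim, 3), torch.nn.Sigmoid()
        )

    def positional_encoding(self, x):
        # (..., 3) -> (..., 3 * 2 * n_harmonic_functions) sinusoidal features.
        proj = x[..., None] * self.freqs
        return torch.cat([proj.sin(), proj.cos()], dim=-1).flatten(start_dim=-2)

    def forward(self, sample_points):
        # The assignment's forward takes a RayBundle; the (n_rays, n_pts, 3)
        # sample points are passed directly here to keep the sketch standalone.
        feats = self.mlp(self.positional_encoding(sample_points))
        density = torch.relu(self.density_head(feats))  # non-negative density
        color = self.color_head(feats)                  # RGB in [0, 1]
        return {"density": density, "feature": color}
```
The loss in train_nerf can then simply be a mean-squared error between the rendered ray colors and the ground-truth pixel colors.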
Visualization
You can train a NeRF on the lego bulldozer dataset with
python main.py --config-name=nerf_lego
It will train a NeRF for 250 epochs on 128x128 images. After training, a spiral rendering will be written to submission_outputs/part_3.gif.
4. NeRF Extras (Choose at least one! More than one is extra credit)
4.1 View Dependence (10 pts)
Add view dependence to your NeRF model! Specifically, make it so that emission can vary with the viewing direction. You can refer to NeRF or other papers for how to do this effectively; if you're not careful, your network may overfit to the training images. Discuss the trade-offs between increased view dependence and generalization quality.
You can run the code with:
python main.py --config-name=nerf_lego_view
It will save the view-dependent output to submission_outputs/part_4_1.gif.
Answer: As Professor Shubham mentioned in lecture, feeding the viewing direction (omega) into the network only in its later layers adds an inductive bias that color is not strongly affected by view. We cannot feed omega in before the density prediction, since density does not depend on viewing direction. Beyond that, we do not want the network to overfit to the viewing direction: because the color at a point varies only slightly with view, feeding omega in earlier would let the network cheat (i.e., overfit) using omega. By feeding it in just a couple of layers before the color prediction, we ensure the network uses the information in omega only sparingly and generalizes better across viewing directions. A sketch of this design follows below.
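A small sketch of this design (module and argument names are illustrative): the embedded viewing direction enters only after the density branch, a couple of layers before the color output.
```python
import torch

class ViewDependentHead(torch.nn.Module):
    # Density is predicted from position features alone; the embedded viewing
    # direction (omega) is concatenated in only for the color branch.
    def __init__(self, hidden_dim=128, dir_embed_dim=24):
        super().__init__()
        self.density_head = torch.nn.Linear(hidden_dim, 1)
        self.color_mlp = torch.nn.Sequential(
            torch.nn.Linear(hidden_dim + dir_embed_dim, hidden_dim // 2),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden_dim // 2, 3),
            torch.nn.Sigmoid(),
        )

    def forward(self, position_features, dir_embedding):
        density = torch.relu(self.density_head(position_features))
        color = self.color_mlp(torch.cat([position_features, dir_embedding], dim=-1))
        return density, color
```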
4.2 Hierarchical Sampling (10 pts)
NeRF employs two networks: a coarse network and a fine network. During the coarse pass, it uses the coarse network to get an estimate of the scene geometry, and during the fine pass it uses these geometry estimates for better point sampling for the fine network. Implement this hierarchical point-sampling strategy and discuss the trade-offs (speed / quality).
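A sketch of the resampling step used by the fine pass: new depths are drawn by inverse-transform sampling from the coarse pass's weights (a standard NeRF-style helper; the function and argument names here are illustrative):
```python
import torch

def sample_pdf(bins, weights, n_samples, eps=1e-5):
    # bins: (n_rays, n_bins + 1) depth bin edges from the coarse pass;
    # weights: (n_rays, n_bins) coarse rendering weights per bin.
    pdf = (weights + eps) / (weights + eps).sum(dim=-1, keepdim=True)
    cdf = torch.cumsum(pdf, dim=-1)
    cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], dim=-1)  # (n_rays, n_bins + 1)
    # Draw uniform samples and invert the CDF so that new depths concentrate
    # where the coarse weights are large.
    u = torch.rand(*cdf.shape[:-1], n_samples, device=cdf.device)
    idx = torch.searchsorted(cdf, u, right=True).clamp(1, cdf.shape[-1] - 1)
    cdf_lo, cdf_hi = torch.gather(cdf, -1, idx - 1), torch.gather(cdf, -1, idx)
    bin_lo, bin_hi = torch.gather(bins, -1, idx - 1), torch.gather(bins, -1, idx)
    t = (u - cdf_lo) / (cdf_hi - cdf_lo + eps)
    return bin_lo + t * (bin_hi - bin_lo)
```
The trade-off: the extra coarse pass roughly doubles the per-iteration cost, but it concentrates the fine network's samples near surfaces, which usually improves quality for the same total sample budget.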
4.3 High Resolution Imagery (10 pts)
Run NeRF on high-res imagery using the nerf_lego_highres.yaml config file. This will take a long time to train; play around with some hyperparameters (point samples per ray, network capacity) and report your results.
You can run the code with:
python main.py --config-name=nerf_lego_highres
It will save the high-res GIF to submission_outputs/part_4_3.gif.
I increased the network depth to match the network described in the slides and increased the number of samples per ray to 256. I reduced the chunk size to 16388 to fit on the GPU.