16-889 Assignment 2: Single View to 3D

Late Days used - 2

Goals: In this assignment, you will explore the loss functions and decoder architectures used to regress voxel, point cloud, and mesh representations from a single-view RGB input.

Note: The commands to run each part are listed in the corresponding section.

1. Exploring loss functions

1.1. Fitting a voxel grid (5 points)

python fit_data.py --type 'vox'

OR

python main.py -q 1.1

Visualization

Optimized Voxel | Ground Truth
(image) | (image)
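
The report does not state which loss was used for voxel fitting. A common choice for occupancy grids is binary cross-entropy on per-voxel logits, so the sketch below assumes that. It is a dependency-free illustration on a flattened 2x2x2 grid; the names `voxel_bce_loss`, `logits`, and `targets` are hypothetical and not taken from `fit_data.py`.

```python
import math

def voxel_bce_loss(logits, targets):
    """Mean binary cross-entropy between occupancy logits and 0/1 targets."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form of
        # -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))]
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Hypothetical flattened 2x2x2 grid: predicted logits and ground-truth occupancy.
logits = [4.0, -3.0, 0.0, 2.5, -1.0, 3.0, -4.0, 0.5]
targets = [1, 0, 1, 1, 0, 1, 0, 0]
loss = voxel_bce_loss(logits, targets)
print(f"voxel BCE loss: {loss:.4f}")
```

In an actual fitting loop the same quantity would typically be computed on tensors so that gradients can flow back to the voxel logits.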

1.2. Fitting a point cloud (10 points)

python fit_data.py --type 'point'

OR

python main.py -q 1.2

Visualization

Optimized Point Cloud | Ground Truth
(image) | (image)
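
Point cloud fitting is usually driven by a chamfer-style loss, and the sketch below assumes the symmetric form with squared distances; the report itself does not spell out the exact variant. It is a brute-force, dependency-free illustration, and `squared_dist`, `chamfer_distance`, `pred`, and `gt` are hypothetical names.

```python
def squared_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def chamfer_distance(src, tgt):
    """Symmetric chamfer distance: mean squared nearest-neighbour
    distance from src to tgt plus the same from tgt to src."""
    src_to_tgt = sum(min(squared_dist(p, q) for q in tgt) for p in src) / len(src)
    tgt_to_src = sum(min(squared_dist(q, p) for p in src) for q in tgt) / len(tgt)
    return src_to_tgt + tgt_to_src

# Two tiny hypothetical point clouds.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_distance(pred, gt))  # prints 1.0
```

The nested loops cost O(N*M) distance evaluations, which is fine for illustration but would normally be replaced by a batched nearest-neighbour routine.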

1.3. Fitting a mesh (5 points)

python fit_data.py --type 'mesh'

OR

python main.py -q 1.3

Visualization

Optimized Mesh | Ground Truth
(image) | (image)
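
Mesh fitting commonly combines a chamfer term on points sampled from the surface with a smoothness regularizer. As an assumption about what such a regularizer might look like, the sketch below computes a uniform Laplacian smoothness value: the mean distance between each vertex and the centroid of its neighbours. The function name and the toy mesh are hypothetical, and this is not claimed to be the loss used in `fit_data.py`.

```python
import math

def laplacian_smoothness(vertices, edges):
    """Mean distance between each vertex and the centroid of its neighbours."""
    neighbours = {i: set() for i in range(len(vertices))}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    total = 0.0
    for i, v in enumerate(vertices):
        if not neighbours[i]:
            continue  # isolated vertices contribute nothing
        n = len(neighbours[i])
        centroid = [sum(vertices[j][k] for j in neighbours[i]) / n for k in range(3)]
        total += math.dist(v, centroid)
    return total / len(vertices)

# Hypothetical toy mesh: a unit square with a centre vertex at height h.
def square_with_centre(h):
    return [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (0.5, 0.5, h)]

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (1, 4), (2, 4), (3, 4)]
flat = laplacian_smoothness(square_with_centre(0.0), edges)
spiky = laplacian_smoothness(square_with_centre(1.0), edges)
print(flat < spiky)  # the spiky mesh is penalized more
```

Lifting the centre vertex out of the plane raises the value, which is the behaviour a smoothness term is meant to discourage.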

2. Reconstructing 3D from single view

2.1. Image to voxel grid (15 points)

# For training
python train_model.py --type 'vox' --max_iter 10001 --save_freq 2000

# For evaluation
python eval_model.py --type 'vox' --load_checkpoint --load_step 10000 --vis_freq 20

OR

python main.py -q 2.1

Visualizing 3 examples

Ground Truth Image | Ground Truth Voxel | Predicted Voxel
(image) | (image) | (image)
(image) | (image) | (image)
(image) | (image) | (image)

2.2. Image to point cloud (15 points)

# For training
python train_model.py --type 'point' --max_iter 10001 --save_freq 2000

# For evaluation
python eval_model.py --type 'point' --load_checkpoint --load_step 10000 --vis_freq 20

OR

python main.py -q 2.2

Visualizing 3 examples

Ground Truth Image | Ground Truth Point Cloud | Predicted Point Cloud
(image) | (image) | (image)
(image) | (image) | (image)
(image) | (image) | (image)

2.3. Image to mesh (15 points)

# For training
python train_model.py --type 'mesh' --max_iter 10001 --save_freq 2000

# For evaluation
python eval_model.py --type 'mesh' --load_checkpoint --load_step 10000 --vis_freq 20

OR

python main.py -q 2.3

Visualizing 3 examples

Ground Truth Image | Ground Truth Mesh | Predicted Mesh
(image) | (image) | (image)
(image) | (image) | (image)
(image) | (image) | (image)

2.4. Quantitative comparisons (10 points)

Avg F1@0.05 Vox | Avg F1@0.05 Point | Avg F1@0.05 Mesh
74.439 | 90.849 | 87.206
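
For reference, an F1 score at a distance threshold is usually computed from point-set precision and recall. The sketch below assumes that definition, with a threshold of 0.05 and a 0-100 scale to match the numbers above; the evaluation script may sample points and compute distances differently. `f1_score`, `pred`, and `gt` are hypothetical names.

```python
import math

def f1_score(pred, gt, threshold=0.05):
    """F1 (scaled to 0-100) between two point sets at a distance threshold."""
    def fraction_within(src, tgt):
        # Fraction of src points whose nearest tgt point is closer than threshold.
        hits = sum(1 for p in src if min(math.dist(p, q) for q in tgt) < threshold)
        return hits / len(src)

    precision = fraction_within(pred, gt)  # predicted points near the ground truth
    recall = fraction_within(gt, pred)     # ground-truth points near the prediction
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)

# Hypothetical example: two good predictions and one outlier.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
gt = [(0.01, 0.0, 0.0), (1.0, 0.02, 0.0)]
score = f1_score(pred, gt)
print(f"F1@0.05: {score:.1f}")
```

Here precision is 2/3 (the outlier misses) and recall is 1, so the score is about 80.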

2.5. Analyse effects of hyperparameter variations (10 points)

I experimented with the different tunable hyperparameters and observed the following:

2.6. Interpret your model (15 points)

python interpret_model.py --load_step 10000 --index1 100 --index2 340

For this question, all my experiments and observations are based on the point cloud encoder-decoder model.

3. (Extra Credit) Exploring some recent architectures.

3.1. Implicit network (10 points)

python train_implicit.py --save_freq 2000 --max_iter 10001

OR

python main.py -q 3.1

Visualizing 3 examples

Ground Truth Image | Ground Truth Voxel | Predicted Voxel using Implicit Decoder
(image) | (image) | (image)
(image) | (image) | (image)
(image) | (image) | (image)

3.2. Parametric network (10 points)

Implement a parametric function that takes sampled 2D points as input and outputs the corresponding 3D points. Some papers for inspiration: [1, 2].
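
To make the 2D-to-3D mapping concrete, the sketch below uses a fixed analytic map from the unit square onto a sphere. This is only an illustration of the interface: a learned parametric network would replace `sphere_param` with a trained function, presumably conditioned on an image feature. All names here are hypothetical.

```python
import math
import random

def sphere_param(u, v, radius=1.0):
    """Map (u, v) in [0, 1]^2 to a point on a sphere of the given radius."""
    theta = 2.0 * math.pi * u       # azimuth angle
    phi = math.acos(1.0 - 2.0 * v)  # polar angle, uniform over surface area
    return (
        radius * math.sin(phi) * math.cos(theta),
        radius * math.sin(phi) * math.sin(theta),
        radius * math.cos(phi),
    )

# Sample 2D points uniformly and push them through the parametric map.
random.seed(0)
samples_2d = [(random.random(), random.random()) for _ in range(1000)]
points_3d = [sphere_param(u, v) for u, v in samples_2d]
print(len(points_3d), points_3d[0])
```

Because the output density follows the 2D sampling, drawing more 2D points yields a denser 3D point cloud without changing the function itself.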