16-889 Assignment 1: Rendering Basics with PyTorch3D

Goals: In this assignment, you will learn the basics of rendering with PyTorch3D, explore 3D representations, and practice constructing simple geometry.

Files

.
├── data
├── __init__.py
├── main.py
├── p0.py
├── p1.py
├── p2.py
├── p3.py
├── p4.py
├── p5.py
├── p6.py
├── p7.py
├── README.md
└── utils.py

The entire source code is packaged as a Python module. Each question in the assignment has a corresponding program pX.py, and main.py can be used to launch any of them.
The programs are written to be invoked as subcommands of main.py (like git).
Run any program with the --help option to see more details on how to run it. Ex: python3 main.py --help or python3 p1.py --help

$ ./main.py p1 --help
usage: ./main.py p1 [-h] [-d DEVICE] [--image_size IMAGE_SIZE] [--fps FPS] [--num_gif_samples num_gif_samples] [--gif_duration GIF_DURATION] [-v] [--version] COMMAND MODEL_FILE OUTPUT_FILE

16-889 HW1: Rendering Basics with PyTorch3D. P1 - practicing with cameras.

positional arguments:
  COMMAND               command to run. [render360 | dollyzoom]
  MODEL_FILE            path to .obj model file
  OUTPUT_FILE           path of the file to save the rendered image

optional arguments:
  -h, --help            show this help message and exit
  -d DEVICE, --device DEVICE
                        string representing the device. ex: cpu, cuda, cuda:0, etc
  --image_size IMAGE_SIZE
                        Size of the image to be rendered. Default=256
  --fps FPS             Fps to generate gif. This overries fps calculated using num_gif_samples/duration
  --num_gif_samples num_gif_samples
                        number of frames to sample. Default=10
  --gif_duration GIF_DURATION
                        Duration, in seconds, of the gif to generate. Default=3
  -v, --verbose         Verbose mode. Prints logs and debugging information based on loglevel
  --version             show program's version number and exit

This program performs the following operations: a 360-degree render and a dolly zoom.
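The git-style dispatch and the fps override described in the help text can be approximated with argparse sub-parsers. This is a minimal sketch, not the repository's main.py: the names `effective_fps`, `run_p1`, and `build_parser` are hypothetical, only a subset of the options is reproduced, and the handler returns a string instead of rendering anything.

```python
import argparse


def effective_fps(args):
    # An explicit --fps overrides the rate derived from samples / duration.
    if args.fps is not None:
        return args.fps
    return args.num_gif_samples / args.gif_duration


def run_p1(args):
    # Hypothetical handler: the real p1 renders a gif; this only reports the plan.
    return (f"{args.command}: {args.model_file} -> {args.output_file} "
            f"at {effective_fps(args):g} fps")


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="program", required=True)

    # One sub-parser per assignment question; only p1 is sketched here.
    p1 = subparsers.add_parser("p1", help="practicing with cameras")
    p1.add_argument("command", choices=["render360", "dollyzoom"])
    p1.add_argument("model_file")
    p1.add_argument("output_file")
    p1.add_argument("--fps", type=float, default=None)
    p1.add_argument("--num_gif_samples", type=int, default=10)
    p1.add_argument("--gif_duration", type=float, default=3.0)
    p1.set_defaults(func=run_p1)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.func(args)
```

With 60 samples over 3 seconds, this sketch reports 20 fps unless --fps is given.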

The following sections show how to run the script for each assignment question. They also contain the answers to the corresponding questions.

0. Test setup

./main.py p0 ./data/cow_mesh/cow.obj ./output/p0.jpg -v

This should print the debug details and generate ./output/p0.jpg as shown below

$ ./main.py p0 -v --visualize ./data/cow_mesh/cow.obj ./output/p0.jpg
2022-02-10 21:24:42,955 - P0 - DEBUG: model file: ./data/cow_mesh/cow.obj
2022-02-10 21:24:42,955 - P0 - DEBUG: output file: ./output/p0.jpg
2022-02-10 21:24:42,955 - P0 - DEBUG: requested device: None
2022-02-10 21:24:42,990 - P0 - DEBUG: using device: cuda:0

cow render

1. Practicing with Cameras

1.1 360-degree Renders (5 points)

./main.py p1 render360 ./data/cow_mesh/cow.obj ./output/p1_1.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P1_1 result
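A 360-degree render amounts to placing the camera at evenly spaced azimuth angles on a circle around the object and rendering one frame per position. The NumPy sketch below only computes the camera centres. The distance/elevation/azimuth parameterization is intended to mirror the convention of PyTorch3D's `look_at_view_transform`, but the function name and the default distance of 3 are assumptions, not taken from p1.py.

```python
import numpy as np


def orbit_camera_positions(num_views, dist=3.0, elev_deg=0.0):
    """Camera centres evenly spaced on a full circle around the origin."""
    # endpoint=False avoids duplicating the first frame at 360 degrees.
    azim = np.deg2rad(np.linspace(0.0, 360.0, num_views, endpoint=False))
    elev = np.deg2rad(elev_deg)
    x = dist * np.cos(elev) * np.sin(azim)
    y = np.full_like(azim, dist * np.sin(elev))
    z = dist * np.cos(elev) * np.cos(azim)
    return np.stack([x, y, z], axis=1)  # shape (num_views, 3)


positions = orbit_camera_positions(60)  # one position per gif frame
```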

1.2 Re-creating the Dolly Zoom (10 points)

./main.py p1 dollyzoom ./data/cow_mesh/cow_on_plane.obj ./output/p1_2.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P1_2 result
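The dolly zoom changes the field of view while moving the camera so that the subject keeps the same apparent size. For a subject of width w that should span the frame, the camera distance is d = w / (2 tan(fov / 2)). The sketch below illustrates that relation; the function name and the default subject width are assumptions.

```python
import math


def dolly_distance(fov_deg, subject_width=5.0):
    """Camera distance at which a subject of the given width spans the field of view."""
    half_fov = math.radians(fov_deg) / 2.0
    return subject_width / (2.0 * math.tan(half_fov))


# Widening the field of view requires moving the camera closer.
distances = [dolly_distance(fov) for fov in (5.0, 60.0, 120.0)]
```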

2. Practicing with Meshes

2.1 Constructing a Tetrahedron (5 points)

./main.py p2 tetra ./output/p2_1.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P2_1 result

Number of vertices = 4
Number of faces = 4
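One possible set of vertex and face arrays for a tetrahedron is sketched below in NumPy. The specific coordinates are an assumption (alternating corners of a cube), not necessarily the ones used in p2.py, and the winding order of the faces is not made consistent here.

```python
import numpy as np

# Four alternating corners of the cube [-1, 1]^3 form a regular tetrahedron.
tetra_vertices = np.array([
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
])

# Each face is a triple of vertex indices; every vertex is left out exactly once.
tetra_faces = np.array([
    [0, 1, 2],
    [0, 3, 1],
    [0, 2, 3],
    [1, 3, 2],
])


def unique_edges(faces):
    """Return the undirected edges of a triangle mesh, one row per edge."""
    edges = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    return np.unique(np.sort(edges, axis=1), axis=0)
```

As a sanity check, a closed surface like this one satisfies V - E + F = 2, here 4 - 6 + 4.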

2.2 Constructing a Cube (5 points)

./main.py p2 cube ./output/p2_2.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P2_2 result

Number of vertices = 8
Number of faces = 12
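The 12 faces come from splitting each of the 6 square sides into 2 triangles. The NumPy sketch below shows one such layout for a unit cube; the vertex ordering is an assumption and the winding order is not made consistent.

```python
import numpy as np

cube_vertices = np.array([
    [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 0.0],  # z = 0
    [0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0],  # z = 1
])

cube_faces = np.array([
    [0, 1, 2], [0, 2, 3],  # z = 0 side
    [4, 5, 6], [4, 6, 7],  # z = 1 side
    [0, 1, 5], [0, 5, 4],  # y = 0 side
    [3, 2, 6], [3, 6, 7],  # y = 1 side
    [0, 3, 7], [0, 7, 4],  # x = 0 side
    [1, 2, 6], [1, 6, 5],  # x = 1 side
])


def triangle_areas(vertices, faces):
    """Area of every triangle, from the cross product of two edge vectors."""
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    return 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
```

Summing `triangle_areas(cube_vertices, cube_faces)` should give the surface area of the unit cube, which is a quick way to catch a mistyped face index.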

3. Re-texturing a mesh (10 points)

./main.py p3 ./data/cow_mesh/cow.obj ./output/p3.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P3 result

I've chosen color1 as blue (0, 0, 1) and color2 as red (1, 0, 0). Blue is closer to the camera and red is farther away.
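A depth-based gradient like this can be sketched as a linear blend between the two colors along the z coordinate of each vertex. The function below is a NumPy illustration, not the code in p3.py; its name is hypothetical, and it assumes that the smallest z value is the one closest to the camera.

```python
import numpy as np


def depth_gradient_colors(vertices, color_near, color_far):
    """Linearly blend two RGB colors along the z coordinate of each vertex."""
    z = vertices[:, 2]
    z_min, z_max = z.min(), z.max()
    if z_max == z_min:
        raise ValueError("mesh has no extent along z")
    # alpha is 0 at the smallest z (assumed nearest) and 1 at the largest z.
    alpha = ((z - z_min) / (z_max - z_min))[:, None]
    return (1.0 - alpha) * np.asarray(color_near) + alpha * np.asarray(color_far)


# Three vertices at increasing depth: blue, then purple, then red.
demo_vertices = np.array([[0.0, 0.0, -1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
demo_colors = depth_gradient_colors(demo_vertices, [0.0, 0.0, 1.0], [1.0, 0.0, 0.0])
```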

4. Camera Transformations (20 points)

./main.py p4 ./data/cow_mesh/cow.obj ./output/ --image_size=256 -v

P4_1 result P4_2 result P4_3 result P4_4 result

The transformations here presume a right-handed coordinate system, where y is up, x is left, and z points into the screen.

Transformation 1: The object is rotated around the z axis by 90 degrees. There is no translation. This effect is obtained by rotating the camera by -90 degrees around the z axis.
```
import numpy as np
import torch
from pytorch3d.transforms import euler_angles_to_matrix

relative_r = euler_angles_to_matrix(
    torch.tensor([0.0, 0.0, -np.pi / 2]), "XYZ"
)
relative_t = torch.tensor([0.0, 0.0, 0.0])
```
Transformation 2: The object is moved away from its original position by some distance d. This effect is obtained by translating the camera by d units along the z axis (d = 2 in the code below). There is no rotation.
```
# No rotation; translate by 2 units along the z axis.
relative_r = torch.eye(3)
relative_t = torch.tensor([0.0, 0.0, 2.0])
```
Transformation 3: The object is translated in both the x and y directions, without any rotation. This effect is obtained by translating the camera accordingly along the x and y axes.
```
# No rotation; translate along the x and y axes only.
relative_r = torch.eye(3)
relative_t = torch.tensor([0.5, -0.5, 0.0])
```
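Transformation 4: The object is rotated around the y axis by 90 degrees, with no apparent translation. This effect is obtained by rotating the camera by +90 degrees around the y axis. The code below also sets a nonzero relative_t, which appears to compensate for the rotation so that the object stays centered in view.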
```
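# Rotate by +90 degrees about the y axis.
relative_r = euler_angles_to_matrix(
    torch.tensor([0.0, np.pi / 2, 0.0]), "XYZ"
)
# Relative translation applied together with the rotation.
relative_t = torch.tensor([-3.0, 0.0, 3.0])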
```
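To see how a relative rotation and translation might combine into the final camera pose, here is a NumPy sketch for Transformation 4. It rests on assumptions not stated above: that the base camera has identity rotation and translation (0, 0, 3), and that the pose is composed as R = R_rel R_0 and T = R_rel T_0 + T_rel. The helper names are hypothetical, and `rot_y` is intended to match what `euler_angles_to_matrix` returns for a pure y rotation.

```python
import numpy as np


def rot_y(angle):
    """Rotation by `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def compose_extrinsics(relative_r, relative_t, base_t=(0.0, 0.0, 3.0)):
    # ASSUMPTION (not stated in the text): the relative transform is applied on
    # top of a base camera with identity rotation and translation base_t.
    r = relative_r @ np.eye(3)
    t = relative_r @ np.asarray(base_t) + np.asarray(relative_t)
    return r, t


# Transformation 4: +90 degrees about y with relative_t = (-3, 0, 3).
r4, t4 = compose_extrinsics(rot_y(np.pi / 2), [-3.0, 0.0, 3.0])
```

Under these assumptions, the rotated base translation (3, 0, 0) is offset by relative_t, so `t4` ends up at (0, 0, 3) up to floating-point error and the object remains in front of the camera.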

5. Rendering Generic 3D Representations

5.1 Rendering Point Clouds from RGB-D Images (10 points)

./main.py p5 pointcloud  ./output/ --data_file=./data/rgbd_data.pkl --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P5_1_1 result P5_1_2 result P5_1_3 result
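Turning an RGB-D image into a point cloud means back-projecting every valid depth pixel into 3D. The sketch below uses a generic pinhole model with pixel-space intrinsics in NumPy. It is not the helper used in p5.py and does not follow PyTorch3D's NDC conventions; the function name and the parameters `fx`, `fy`, `cx`, `cy` are assumptions, and nothing is assumed about the layout of rgbd_data.pkl.

```python
import numpy as np


def unproject_depth(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels to 3D camera-space points (pinhole model)."""
    v, u = np.nonzero(mask)   # pixel rows (v) and columns (u) to keep
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # shape (num_valid_pixels, 3)


# Toy example: a 4x4 depth map at constant depth 2 with one valid pixel.
toy_depth = np.full((4, 4), 2.0)
toy_mask = np.zeros((4, 4), dtype=bool)
toy_mask[1, 2] = True
toy_points = unproject_depth(toy_depth, toy_mask, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
```

The colors of the same masked pixels would then be attached to the points as per-point features.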

5.2 Parametric Functions (10 points)

./main.py p5 parametric  ./output/p5_2_2.gif --num_samples=1000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P5_2_1 result P5_2_2 result
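A parametric surface is rendered as a point cloud by sampling its parameters on a grid and evaluating the surface equations. The sketch below uses a torus as an illustrative surface. The choice of surface, the radii, and the function name are assumptions, and whether --num_samples counts samples per axis is not stated above.

```python
import numpy as np


def sample_torus(samples_per_axis=100, major_radius=1.0, minor_radius=0.4):
    """Sample a torus on a samples_per_axis x samples_per_axis grid of (theta, phi)."""
    theta = np.linspace(0.0, 2.0 * np.pi, samples_per_axis, endpoint=False)
    phi = np.linspace(0.0, 2.0 * np.pi, samples_per_axis, endpoint=False)
    theta, phi = np.meshgrid(theta, phi)
    ring = major_radius + minor_radius * np.cos(theta)  # distance from the z axis
    x = ring * np.cos(phi)
    y = ring * np.sin(phi)
    z = minor_radius * np.sin(theta)
    return np.stack([x.ravel(), y.ravel(), z.ravel()], axis=1)


torus_points = sample_torus(50)  # 50 * 50 = 2500 points
```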

5.3 Implicit Surfaces (15 points)

./main.py p5 implicit  ./output/p5_3.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P5_3 result
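An implicit surface is defined as the zero level set of a function F(x, y, z). A common first step is to evaluate F on a regular voxel grid, after which a mesh is typically extracted with a marching cubes implementation. The sketch below covers only the grid evaluation; the torus function, the grid extent, and the resolution are assumptions chosen for illustration.

```python
import numpy as np


def torus_implicit(x, y, z, major_radius=1.0, minor_radius=0.4):
    """F(x, y, z) whose zero level set is a torus; negative inside, positive outside."""
    return (np.sqrt(x**2 + y**2) - major_radius) ** 2 + z**2 - minor_radius**2


def evaluate_on_grid(func, resolution=32, extent=1.6):
    """Evaluate an implicit function on a regular grid spanning [-extent, extent]^3."""
    axis = np.linspace(-extent, extent, resolution)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return func(x, y, z)


voxels = evaluate_on_grid(torus_implicit)
inside_fraction = float((voxels < 0).mean())  # share of grid points inside the torus
```

A higher resolution gives a smoother extracted surface at the cost of memory that grows cubically with the number of samples per axis.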

Each type of representation has its own pros and cons. Point clouds are easy to represent but, unlike meshes, are not continuous. For that reason, meshes are more visually pleasing.
Compared to meshes, point clouds need a significantly higher number of points to represent the same surface. However, point clouds are easier to process, as they are merely a set of points.
Rendering a parametric surface as a point cloud requires more points and, as a result, has a higher computational overhead than rendering a mesh.

6. Do Something Fun (10 points)

./main.py p6 ./output/p6.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P6 result

(Extra Credit) 7. Sampling Points on Meshes (10 points)

./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_1.gif --num_samples=10 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_2.gif --num_samples=100 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_3.gif --num_samples=1000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_4.gif --num_samples=10000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

P7_1 result P7_2 result P7_3 result P7_4 result
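One standard way to sample points uniformly on a mesh is to pick faces with probability proportional to their area and then draw uniform barycentric coordinates inside each chosen triangle. The NumPy sketch below follows that recipe. The method in p7.py is not spelled out above, so treat this as an illustration with a hypothetical function name.

```python
import numpy as np


def sample_points_on_mesh(vertices, faces, num_samples, rng=None):
    """Area-weighted uniform sampling of points on a triangle mesh."""
    rng = np.random.default_rng() if rng is None else rng
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))

    # 1. Pick faces with probability proportional to their area.
    areas = 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
    face_idx = rng.choice(len(faces), size=num_samples, p=areas / areas.sum())

    # 2. Draw uniform barycentric coordinates inside each chosen triangle.
    sqrt_u = np.sqrt(rng.random(num_samples))
    w = rng.random(num_samples)
    b0 = 1.0 - sqrt_u
    b1 = sqrt_u * (1.0 - w)
    b2 = sqrt_u * w

    return (b0[:, None] * v0[face_idx]
            + b1[:, None] * v1[face_idx]
            + b2[:, None] * v2[face_idx])
```

With only 10 samples the shape of the mesh is barely recognizable, while larger sample counts fill in the surface progressively.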