Goals: In this assignment, you will learn the basics of rendering with PyTorch3D, explore 3D representations, and practice constructing simple geometry.
.
├── data
├── __init__.py
├── main.py
├── p0.py
├── p1.py
├── p2.py
├── p3.py
├── p4.py
├── p5.py
├── p6.py
├── p7.py
├── README.md
└── utils.py
The entire source code is packaged as a Python module. Each question in the
assignment corresponds to its own program pX.py. main.py can be used to
launch each of the programs.
The programs are written to be invoked as subcommands of main.py (like git).
Run any program with the --help option to see more details on how to run
it. Ex: python3 main.py --help or python3 p1.py --help
$ ./main.py p1 --help
usage: ./main.py p1 [-h] [-d DEVICE] [--image_size IMAGE_SIZE] [--fps FPS] [--num_gif_samples num_gif_samples] [--gif_duration GIF_DURATION] [-v] [--version] COMMAND MODEL_FILE OUTPUT_FILE
16-889 HW1: Rendering Basics with PyTorch3D. P1 - practicing with cameras.
positional arguments:
COMMAND command to run. [render360 | dollyzoom]
MODEL_FILE path to .obj model file
OUTPUT_FILE path of the file to save the rendered image
optional arguments:
-h, --help show this help message and exit
-d DEVICE, --device DEVICE
string representing the device. ex: cpu, cuda, cuda:0, etc
--image_size IMAGE_SIZE
Size of the image to be rendered. Default=256
--fps FPS Fps to generate gif. This overries fps calculated using num_gif_samples/duration
--num_gif_samples num_gif_samples
number of frames to sample. Default=10
--gif_duration GIF_DURATION
Duration, in seconds, of the gif to generate. Default=3
-v, --verbose Verbose mode. Prints logs and debugging information based on loglevel
--version show program's version number and exit
This program performs the following operations - 360-degree render - Dolly zoom
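The subcommand layout and the fps override described in the help text can be sketched with argparse. This is a minimal illustration, not the actual contents of main.py: the function names `build_parser` and `resolve_fps` are hypothetical, and only a cut-down p1 subcommand is shown.

```python
import argparse


def resolve_fps(fps, num_gif_samples, gif_duration):
    """Return the explicit fps if given, else num_gif_samples / gif_duration."""
    if fps is not None:
        return fps
    return num_gif_samples / gif_duration


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    # Each program registers itself as a subcommand, similar to git.
    subparsers = parser.add_subparsers(dest="program", required=True)

    # Hypothetical, reduced version of the p1 subcommand.
    p1 = subparsers.add_parser("p1", help="practicing with cameras")
    p1.add_argument("command", choices=["render360", "dollyzoom"])
    p1.add_argument("model_file")
    p1.add_argument("output_file")
    p1.add_argument("--fps", type=float, default=None)
    p1.add_argument("--num_gif_samples", type=int, default=10)
    p1.add_argument("--gif_duration", type=float, default=3)
    return parser


args = build_parser().parse_args(
    ["p1", "render360", "cow.obj", "out.gif", "--num_gif_samples=60"]
)
fps = resolve_fps(args.fps, args.num_gif_samples, args.gif_duration)
print(args.program, args.command, fps)
```

With 60 samples and the default 3 second duration, the derived rate is 20 frames per second unless --fps is passed explicitly.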
The following sections show how to run the script for each assignment question. They also contain the answers to the corresponding questions.
./main.py p0 ./data/cow_mesh/cow.obj ./output/p0.jpg -v
This should print the debug details and generate ./output/p0.jpg as shown below
$ ./main.py p0 -v --visualize ./data/cow_mesh/cow.obj ./output/p0.jpg
2022-02-10 21:24:42,955 - P0 - DEBUG: model file: ./data/cow_mesh/cow.obj
2022-02-10 21:24:42,955 - P0 - DEBUG: output file: ./output/p0.jpg
2022-02-10 21:24:42,955 - P0 - DEBUG: requested device: None
2022-02-10 21:24:42,990 - P0 - DEBUG: using device: cuda:0

./main.py p1 render360 ./data/cow_mesh/cow.obj ./output/p1_1.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
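A 360-degree render places the camera at evenly spaced angles on a circle around the object and renders one frame per position. The sketch below computes such camera centres only. It assumes y is up and a hypothetical orbit distance of 3 units; `orbit_positions` is an illustrative name, not a function from p1.py.

```python
import math


def orbit_positions(num_views, distance=3.0, elevation_deg=0.0):
    """Camera centres evenly spaced on a circle around the origin (y is up)."""
    elev = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        azim = 2.0 * math.pi * i / num_views
        positions.append(
            (
                distance * math.cos(elev) * math.sin(azim),
                distance * math.sin(elev),
                distance * math.cos(elev) * math.cos(azim),
            )
        )
    return positions


# One camera centre per gif frame, matching --num_gif_samples=60.
positions = orbit_positions(60)
print(len(positions), positions[0])
```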

./main.py p1 dollyzoom ./data/cow_mesh/cow_on_plane.obj ./output/p1_2.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
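A dolly zoom changes the field of view while moving the camera so that the subject keeps the same apparent size. The relationship can be sketched as below, assuming a pinhole camera. The subject width of 5 units and the 120 to 5 degree sweep are hypothetical values, not taken from p1.py.

```python
import math


def dolly_distance(fov_degrees, subject_width):
    """Camera distance at which subject_width exactly spans the field of view."""
    half_fov = math.radians(fov_degrees) / 2.0
    return subject_width / (2.0 * math.tan(half_fov))


# Hypothetical sweep: narrow the field of view from 120 to 5 degrees.
num_frames = 5
fovs = [120 + (5 - 120) * i / (num_frames - 1) for i in range(num_frames)]
distances = [dolly_distance(fov, 5.0) for fov in fovs]
for fov, distance in zip(fovs, distances):
    print(f"fov={fov:6.2f} deg  distance={distance:7.2f}")
```

As the field of view narrows, the camera has to move farther away, which is what produces the characteristic background stretch.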

./main.py p2 tetra ./output/p2_1.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

./main.py p2 cube ./output/p2_2.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
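A cube mesh needs 8 vertices and 12 triangular faces (two per side). The sketch below is one possible construction and is not necessarily how p2.py builds it. It also checks that every edge is shared by exactly two faces, i.e. that the surface is closed.

```python
from collections import Counter
from itertools import product

# Eight corners of the unit cube; the vertex index is 4*x + 2*y + z.
vertices = [list(map(float, corner)) for corner in product((0, 1), repeat=3)]

# Two triangles per side, wound counter-clockwise when viewed from outside.
faces = [
    (0, 1, 3), (0, 3, 2),  # x = 0
    (4, 6, 7), (4, 7, 5),  # x = 1
    (0, 4, 5), (0, 5, 1),  # y = 0
    (2, 3, 7), (2, 7, 6),  # y = 1
    (0, 2, 6), (0, 6, 4),  # z = 0
    (1, 5, 7), (1, 7, 3),  # z = 1
]

# Count how many faces use each undirected edge.
edge_counts = Counter(
    frozenset((face[i], face[(i + 1) % 3])) for face in faces for i in range(3)
)
is_closed = all(count == 2 for count in edge_counts.values())
print(len(vertices), len(faces), len(edge_counts), is_closed)
```

The two lists can then be converted to tensors and wrapped in a PyTorch3D Meshes object for rendering.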

./main.py p3 ./data/cow_mesh/cow.obj ./output/p3.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

I've chosen color1 as blue (0,0,1) and color2 as red (1,0,0). Blue is
closer to the camera and red is farther.
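One way to produce such a gradient is to interpolate linearly between the two colors by normalized depth. The sketch below assumes that a smaller z value means closer to the camera and that not all z values are equal. `depth_gradient` is a hypothetical helper, not the function used in p3.py.

```python
import numpy as np


def depth_gradient(z, color_near, color_far):
    """Blend per-vertex colors from color_near (min z) to color_far (max z)."""
    z = np.asarray(z, dtype=float)
    # alpha is 0 at the nearest vertex and 1 at the farthest one.
    alpha = (z - z.min()) / (z.max() - z.min())
    near = np.asarray(color_near, dtype=float)
    far = np.asarray(color_far, dtype=float)
    return (1.0 - alpha)[:, None] * near + alpha[:, None] * far


# Three vertices at increasing depth: blue, then purple, then red.
colors = depth_gradient([-1.0, 0.0, 1.0], (0, 0, 1), (1, 0, 0))
print(colors)
```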
./main.py p4 ./data/cow_mesh/cow.obj ./output/ --image_size=256 -v

The transformations here presume a right-handed coordinate system, where y is
up, x is left, and z is in.
Transformation 1:
The object is rotated around the z axis by 90 degrees. There is no translation. This effect is obtained by rotating the camera by -90 degrees around the z axis.
import numpy as np
import torch
from pytorch3d.transforms import euler_angles_to_matrix

relative_r = euler_angles_to_matrix(
    torch.tensor([0, 0, -np.pi/2]), "XYZ"
)
relative_t = torch.tensor([0., 0., 0.])
Transformation 2:
The object is moved away from its original position by d units (d = 2 in the
code below). This effect is obtained by translating the camera by d units
along the z axis. There is no rotation.
relative_r = torch.eye(3)
relative_t = torch.tensor([0., 0., 2.])
Transformation 3:
The object is translated in both the x and y directions, without any
rotation. This effect is obtained by translating the camera accordingly
along the same x and y axes.
relative_r = torch.eye(3)
relative_t = torch.tensor([0.5, -0.5, 0.])
Transformation 4:
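The object is rotated around the y axis by 90 degrees and is not translated
in the rendered view. This effect is obtained by rotating the camera by +90
degrees around the y axis. The relative translation in the code is nonzero:
it appears to compensate for the offset that the rotation introduces, so
that the camera keeps facing the object.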
relative_r = euler_angles_to_matrix(
    torch.tensor([0, np.pi/2, 0]), "XYZ"
)
relative_t = torch.tensor([-3., 0., 3.])
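How the relative pose combines with the base camera pose is not spelled out above. The sketch below makes two assumptions: the final pose is R = relative_r @ R_0 and T = relative_r @ T_0 + relative_t, and the base camera has identity rotation with T_0 = (0, 0, 3). Under those assumptions, the translation of transformation 4 brings the camera back to 3 units in front of the object, and transformation 2 moves it to 5 units. NumPy is used in place of torch to keep the example dependency-light.

```python
import numpy as np


def rot_y(angle):
    """Standard right-handed rotation matrix about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def compose(relative_r, relative_t, base_r, base_t):
    # ASSUMPTION: the relative pose is applied on top of the base pose this way.
    return relative_r @ base_r, relative_r @ base_t + relative_t


# ASSUMPTION: base camera with identity rotation, offset 3 units along z.
base_r = np.eye(3)
base_t = np.array([0.0, 0.0, 3.0])

# Transformation 2: pure translation along z.
r2, t2 = compose(np.eye(3), np.array([0.0, 0.0, 2.0]), base_r, base_t)
# Transformation 4: +90 degree rotation about y plus a compensating translation.
r4, t4 = compose(rot_y(np.pi / 2), np.array([-3.0, 0.0, 3.0]), base_r, base_t)

print(t2)
print(np.round(t4, 6))
```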
./main.py p5 pointcloud ./output/ --data_file=./data/rgbd_data.pkl --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

./main.py p5 parametric ./output/p5_2_2.gif --num_samples=1000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
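A parametric render samples the surface parameters, evaluates the parametric function to get 3D points, and renders those points as a point cloud. The surface used by p5.py is not stated here, so the sketch below uses a torus purely as an illustration, with hypothetical radii of 1.0 and 0.5.

```python
import numpy as np


def sample_torus(num_samples, major_radius=1.0, minor_radius=0.5, seed=0):
    """Sample points on a torus by drawing its two angles uniformly."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, num_samples)  # angle around the tube
    phi = rng.uniform(0.0, 2.0 * np.pi, num_samples)  # angle around the axis
    ring = major_radius + minor_radius * np.cos(theta)
    x = ring * np.cos(phi)
    y = ring * np.sin(phi)
    z = minor_radius * np.sin(theta)
    return np.stack([x, y, z], axis=1)


# Matches --num_samples=1000 in the command above.
points = sample_torus(1000)
print(points.shape)
```

Sampling the angles uniformly is simple but does not give a uniform density over the surface: points cluster on the inner side of the ring.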

./main.py p5 implicit ./output/p5_3.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

Each type of representation has its own pros and cons. Point clouds are easy to represent but are not continuous, unlike meshes. For that reason, meshes are more visually pleasing.
Compared to meshes, point clouds need a significantly higher number of points to represent the same surface. However, point clouds are easier to process, as they are merely a set of points.
Parametric representations rendered as point clouds use more points and, as a result, incur a higher computational overhead to render than meshes.
./main.py p6 ./output/p6.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v

./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_1.gif --num_samples=10 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_2.gif --num_samples=100 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_3.gif --num_samples=1000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_4.gif --num_samples=10000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
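The four runs above differ only in how many points are sampled from the mesh surface. A common way to do this, assumed here and not confirmed to be what p7.py implements, is to pick faces with probability proportional to their area and then draw uniform barycentric coordinates inside each picked face. The function name and the two-triangle test mesh are hypothetical.

```python
import numpy as np


def sample_points_on_mesh(vertices, faces, num_samples, seed=0):
    """Area-weighted sampling of points on a triangle mesh."""
    rng = np.random.default_rng(seed)
    vertices = np.asarray(vertices, dtype=float)
    faces = np.asarray(faces, dtype=int)
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))

    # Pick faces with probability proportional to their area.
    areas = 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
    face_ids = rng.choice(len(faces), size=num_samples, p=areas / areas.sum())

    # Uniform barycentric coordinates inside each chosen triangle.
    sqrt_u = np.sqrt(rng.uniform(size=num_samples))
    v = rng.uniform(size=num_samples)
    w0 = 1.0 - sqrt_u
    w1 = sqrt_u * (1.0 - v)
    w2 = sqrt_u * v
    return (
        w0[:, None] * v0[face_ids]
        + w1[:, None] * v1[face_ids]
        + w2[:, None] * v2[face_ids]
    )


# Hypothetical test mesh: a unit square in the z = 0 plane, as two triangles.
square_vertices = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
square_faces = [[0, 1, 2], [0, 2, 3]]
points = sample_points_on_mesh(square_vertices, square_faces, 1000)
print(points.shape)
```

With only 10 samples the shape of the mesh is barely recognizable, while 10000 samples approximate the surface closely, at a proportionally higher rendering cost.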
