Goals: In this assignment, you will learn the basics of rendering with PyTorch3D, explore 3D representations, and practice constructing simple geometry.
.
├── data
├── __init__.py
├── main.py
├── p0.py
├── p1.py
├── p2.py
├── p3.py
├── p4.py
├── p5.py
├── p6.py
├── p7.py
├── README.md
└── utils.py
The entire source code is packaged as a Python module. Each question in the
assignment corresponds to its own program, pX.py, and main.py can be used to
launch each of the programs.
The programs are written to be invoked as subcommands from main (like git).
Run any of the programs with the --help option to see more details on how to
run it, e.g. python3 main.py --help or python3 p1.py --help
$ ./main.py p1 --help
usage: ./main.py p1 [-h] [-d DEVICE] [--image_size IMAGE_SIZE] [--fps FPS] [--num_gif_samples num_gif_samples] [--gif_duration GIF_DURATION] [-v] [--version] COMMAND MODEL_FILE OUTPUT_FILE
16-889 HW1: Rendering Basics with PyTorch3D. P1 - practicing with cameras.
positional arguments:
COMMAND command to run. [render360 | dollyzoom]
MODEL_FILE path to .obj model file
OUTPUT_FILE path of the file to save the rendered image
optional arguments:
-h, --help show this help message and exit
-d DEVICE, --device DEVICE
string representing the device. ex: cpu, cuda, cuda:0, etc
--image_size IMAGE_SIZE
Size of the image to be rendered. Default=256
--fps FPS Fps to generate gif. This overries fps calculated using num_gif_samples/duration
--num_gif_samples num_gif_samples
number of frames to sample. Default=10
--gif_duration GIF_DURATION
Duration, in seconds, of the gif to generate. Default=3
-v, --verbose Verbose mode. Prints logs and debugging information based on loglevel
--version show program's version number and exit
This program performs the following operations:
- 360-degree render
- Dolly zoom
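The git-style subcommand layout and the fps rule from the help text above can be sketched with the standard library's argparse. This is only an illustration: the option names are copied from the help output, but `build_parser` and `resolve_fps` are hypothetical helpers and not necessarily how main.py is written.

```python
import argparse


def build_parser():
    """Build a top-level parser with one subparser per program (git-style)."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="program", required=True)

    # Hypothetical subset of the p1 options listed in the help text.
    p1 = subparsers.add_parser("p1", help="practicing with cameras")
    p1.add_argument("command", choices=["render360", "dollyzoom"])
    p1.add_argument("model_file")
    p1.add_argument("output_file")
    p1.add_argument("--fps", type=float, default=None)
    p1.add_argument("--num_gif_samples", type=int, default=10)
    p1.add_argument("--gif_duration", type=float, default=3)
    return parser


def resolve_fps(args):
    """Use --fps when given, otherwise num_gif_samples / gif_duration."""
    if args.fps is not None:
        return args.fps
    return args.num_gif_samples / args.gif_duration


args = build_parser().parse_args(
    ["p1", "render360", "cow.obj", "out.gif", "--num_gif_samples", "60"]
)
fps = resolve_fps(args)  # 60 samples over 3 seconds
```

With 60 samples and the default 3-second duration, the derived rate is 20 frames per second; passing --fps explicitly takes precedence.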
The following sections show how to run the script for each assignment question. They also contain the answer to each corresponding question.
./main.py p0 ./data/cow_mesh/cow.obj ./output/p0.jpg -v
This should print the debug details and generate ./output/p0.jpg, as shown below.
$ ./main.py p0 -v --visualize ./data/cow_mesh/cow.obj ./output/p0.jpg
2022-02-10 21:24:42,955 - P0 - DEBUG: model file: ./data/cow_mesh/cow.obj
2022-02-10 21:24:42,955 - P0 - DEBUG: output file: ./output/p0.jpg
2022-02-10 21:24:42,955 - P0 - DEBUG: requested device: None
2022-02-10 21:24:42,990 - P0 - DEBUG: using device: cuda:0
./main.py p1 render360 ./data/cow_mesh/cow.obj ./output/p1_1.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p1 dollyzoom ./data/cow_mesh/cow_on_plane.obj ./output/p1_2.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
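For the dolly zoom, the camera distance has to change with the field of view so that the subject keeps the same apparent size. A minimal sketch of that relationship, assuming a pinhole camera and a subject of fixed width; the function names, the width of 5 units, and the 5 to 120 degree sweep are illustrative assumptions, not values taken from p1.py:

```python
import math


def dolly_distance(fov_deg, width=5.0):
    """Distance at which a subject of the given width spans the whole field of view."""
    half_fov = math.radians(fov_deg) / 2.0
    return width / (2.0 * math.tan(half_fov))


def dolly_zoom_schedule(num_frames, fov_start=5.0, fov_end=120.0, width=5.0):
    """Return one (fov_deg, distance) pair per frame, sweeping the FoV linearly."""
    schedule = []
    for i in range(num_frames):
        t = i / max(num_frames - 1, 1)
        fov = fov_start + t * (fov_end - fov_start)
        schedule.append((fov, dolly_distance(fov, width)))
    return schedule


schedule = dolly_zoom_schedule(num_frames=10)
```

Each distance would then be used as the camera's z translation for that frame, together with the matching field of view.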
./main.py p2 tetra ./output/p2_1.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p2 cube ./output/p2_2.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
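The tetrahedron and cube are small enough to write out by hand as vertex and triangle-index lists. The sketch below uses plain Python lists with hypothetical coordinates (the actual values in p2.py may differ) and includes a quick check that every edge is shared by exactly two triangles, i.e. that the mesh is closed.

```python
from collections import Counter

# Tetrahedron: 4 vertices, 4 triangular faces.
TETRA_VERTS = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0), (-1.0, -1.0, 1.0)]
TETRA_FACES = [(0, 1, 2), (0, 3, 1), (0, 2, 3), (1, 3, 2)]

# Unit cube: 8 vertices, 12 triangles (two per square side).
CUBE_VERTS = [
    (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0), (0.0, 1.0, 1.0),
    (1.0, 0.0, 0.0), (1.0, 0.0, 1.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0),
]
CUBE_FACES = [
    (0, 1, 3), (0, 3, 2),  # x = 0
    (4, 7, 5), (4, 6, 7),  # x = 1
    (0, 5, 1), (0, 4, 5),  # y = 0
    (2, 3, 7), (2, 7, 6),  # y = 1
    (0, 2, 6), (0, 6, 4),  # z = 0
    (1, 7, 3), (1, 5, 7),  # z = 1
]


def edge_counts(faces):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[tuple(sorted((u, v)))] += 1
    return counts


def is_closed(faces):
    """A closed triangle mesh has every edge shared by exactly two faces."""
    return all(n == 2 for n in edge_counts(faces).values())
```

These lists can be converted to tensors and wrapped in `pytorch3d.structures.Meshes(verts=[...], faces=[...])` for rendering.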
./main.py p3 ./data/cow_mesh/cow.obj ./output/p3.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
I've chosen color1 as blue (0,0,1) and color2 as red (1,0,0). Blue is closer
to the camera and red is farther away.
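The blue-to-red gradient can be produced by linearly interpolating between the two colors along z. The sketch below is a pure-Python illustration under the assumption that smaller z means closer to the camera; `lerp_colors` is a hypothetical helper, and p3.py may implement this differently (for example, directly on tensors).

```python
def lerp_colors(z_values, color1=(0.0, 0.0, 1.0), color2=(1.0, 0.0, 0.0)):
    """Blend from color1 at the smallest z to color2 at the largest z."""
    z_min, z_max = min(z_values), max(z_values)
    span = (z_max - z_min) or 1.0  # avoid division by zero for flat input
    colors = []
    for z in z_values:
        alpha = (z - z_min) / span
        colors.append(tuple(
            alpha * c2 + (1.0 - alpha) * c1
            for c1, c2 in zip(color1, color2)
        ))
    return colors


# Three vertices at the front, middle, and back of the object.
colors = lerp_colors([-1.0, 0.0, 1.0])
```

The front vertex comes out pure blue, the back vertex pure red, and the middle vertex an even mix of the two.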
./main.py p4 ./data/cow_mesh/cow.obj ./output/ --image_size=256 -v
The transformations here presume a right-handed coordinate system, where y
is up, x is left, and z points into the screen.
Transformation 1:
The object is rotated around the z axis by 90 degrees. There is no translation.
This effect is obtained by rotating the camera by -90 degrees around the z axis.
import numpy as np
import torch
from pytorch3d.transforms import euler_angles_to_matrix

relative_r = euler_angles_to_matrix(
    torch.tensor([0, 0, -np.pi/2]), "XYZ"
)
relative_t = torch.tensor([0., 0., 0.])
Transformation 2:
The object is moved away from its original position by d units. This effect is
obtained by translating the camera by d units along the z axis (d = 2 below).
There is no rotation.
relative_r = torch.eye(3)
relative_t = torch.tensor([0., 0., 2.])
Transformation 3:
The object is translated in both the x and y directions, without any rotation.
This effect is obtained by translating the camera accordingly along the same
x and y axes.
relative_r = torch.eye(3)
relative_t = torch.tensor([0.5, -0.5, 0.])
Transformation 4:
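The object is rotated around the y axis by 90 degrees, with no translation of
the object itself. This effect is obtained by rotating the camera by +90 degrees
around the y axis. Note that the code pairs this rotation with a nonzero
relative_t, presumably so that the object stays centered in the frame after the
camera is rotated.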
relative_r = euler_angles_to_matrix(
    torch.tensor([0, np.pi/2, 0]), "XYZ"
)
relative_t = torch.tensor([-3., 0., 3.])
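To see why Transformation 4 needs a nonzero relative_t, it helps to compose the relative transform with a base camera pose. The sketch below is pure Python and rests on assumptions that are not stated in this README: a base pose of R0 = identity and T0 = (0, 0, 3), and the composition rule R = R_rel · R0, T = R_rel · T0 + T_rel. `rot_y` is the standard right-handed rotation matrix about y; all helper names are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def rot_y(theta):
    """Standard right-handed rotation matrix about the y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def compose(relative_r, relative_t, base_r=IDENTITY, base_t=(0.0, 0.0, 3.0)):
    """Assumed rule: R = R_rel @ R0 and T = R_rel @ T0 + T_rel."""
    r = matmul(relative_r, base_r)
    t = [a + b for a, b in zip(matvec(relative_r, base_t), relative_t)]
    return r, t


# Transformation 2: pure translation pushes the object 2 units farther away.
_, t2 = compose(IDENTITY, [0.0, 0.0, 2.0])

# Transformation 4: the rotation moves T0 onto the x axis, and relative_t
# brings the final translation back to roughly (0, 0, 3).
_, t4 = compose(rot_y(math.pi / 2), [-3.0, 0.0, 3.0])
```

Under these assumptions the final translation for Transformation 4 is approximately (0, 0, 3), the same as the base pose, which is consistent with the object appearing rotated in place.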
./main.py p5 pointcloud ./output/ --data_file=./data/rgbd_data.pkl --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p5 parametric ./output/p5_2_2.gif --num_samples=1000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p5 implicit ./output/p5_3.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
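The parametric and implicit commands describe the same kind of surface in two different ways. As an illustration, the sketch below uses a torus; this choice of shape, the radii, and the helper names are assumptions made for the example, since this README does not say which surface p5.py renders. It samples points from the parametric form and checks them against the implicit form.

```python
import math


def torus_point(theta, phi, big_r=1.0, small_r=0.4):
    """Parametric torus: theta goes around the tube, phi around the ring."""
    ring = big_r + small_r * math.cos(theta)
    return (ring * math.cos(phi), ring * math.sin(phi), small_r * math.sin(theta))


def torus_implicit(x, y, z, big_r=1.0, small_r=0.4):
    """Zero on the surface, negative inside the tube, positive outside."""
    return (math.sqrt(x * x + y * y) - big_r) ** 2 + z * z - small_r ** 2


def sample_torus(n):
    """Sample an n x n grid of (theta, phi) values on the torus."""
    step = 2.0 * math.pi / n
    return [torus_point(i * step, j * step) for i in range(n) for j in range(n)]


points = sample_torus(20)
max_residual = max(abs(torus_implicit(*p)) for p in points)
```

The parametric samples can be rendered directly as a point cloud, while the implicit function would typically be evaluated on a voxel grid and converted to a mesh before rendering.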
Each type of representation has its own pros and cons. Point clouds are easy to represent but are not continuous, unlike meshes. For that reason, meshes are more visually pleasing.
Compared to meshes, point clouds need a significantly higher number of points to represent the same surface. However, point clouds are easier to process, as they are merely a set of points.
Rendering a parametric surface as a point cloud uses more points than a mesh of the same surface and, as a result, incurs a higher computational overhead.
./main.py p6 ./output/p6.gif --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_1.gif --num_samples=10 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_2.gif --num_samples=100 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_3.gif --num_samples=1000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
./main.py p7 ./data/cow_mesh/cow.obj ./output/p7_4.gif --num_samples=10000 --image_size=256 --num_gif_samples=60 --gif_duration=3 -v
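The --num_samples flag suggests that p7 draws points from the mesh surface. A common way to do this is to pick a triangle with probability proportional to its area and then choose a uniform barycentric coordinate inside it. The sketch below is a pure-Python illustration of that technique, not necessarily what p7.py does; the helper names and the unit-square test mesh are hypothetical.

```python
import math
import random


def triangle_area(a, b, c):
    """Area of a 3D triangle from half the cross-product magnitude."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def sample_points(verts, faces, num_samples, seed=0):
    """Area-weighted face selection followed by uniform barycentric sampling."""
    rng = random.Random(seed)
    areas = [triangle_area(*(verts[i] for i in face)) for face in faces]
    chosen = rng.choices(faces, weights=areas, k=num_samples)
    points = []
    for i0, i1, i2 in chosen:
        u, v = rng.random(), rng.random()
        s = math.sqrt(u)  # the square root keeps samples uniform over the area
        w0, w1, w2 = 1.0 - s, s * (1.0 - v), s * v
        points.append(tuple(
            w0 * verts[i0][k] + w1 * verts[i1][k] + w2 * verts[i2][k]
            for k in range(3)
        ))
    return points


# A unit square in the z = 0 plane, split into two triangles.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2), (0, 2, 3)]
points = sample_points(verts, faces, num_samples=100)
```

With few samples (10 or 100) the shape is barely recognizable; the outline of the mesh only becomes clear as the sample count grows, which is what the four commands above explore.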