Your first task is to create a 360-degree gif that shows many continuous views of the provided cow mesh. For many of your results this semester, you will be expected to show full turntable views of your outputs. You may find the following helpful:

- `pytorch3d.renderer.look_at_view_transform`: given a distance, elevation, and azimuth, this function returns the corresponding set of rotations and translations that align the world coordinate system to the view coordinate system.
- `imageio.mimsave`, which writes a list of image frames to a gif:

```python
import imageio

my_images = ...  # List of images [(H, W, 3)]
imageio.mimsave('my_gif.gif', my_images, fps=15)
```
On your webpage, include a gif that shows the cow mesh from many continuously changing viewpoints.
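As a minimal sketch (assuming a `renderer` and textured `mesh` already set up as in render_cow.py; the helper name `render_turntable` is hypothetical), the turntable can be produced by sweeping the azimuth over a full circle:

```python
import imageio
import numpy as np
import pytorch3d.renderer


def render_turntable(mesh, renderer, num_frames=36, distance=3.0, elevation=30.0):
    """Render `mesh` from evenly spaced azimuths and save the frames as a gif."""
    frames = []
    for azim in np.linspace(0, 360, num_frames, endpoint=False):
        # Camera extrinsics looking at the origin from the given viewpoint.
        R, T = pytorch3d.renderer.look_at_view_transform(
            dist=distance, elev=elevation, azim=azim)
        cameras = pytorch3d.renderer.FoVPerspectiveCameras(R=R, T=T)
        image = renderer(mesh, cameras=cameras)  # (1, H, W, 4)
        frames.append((image[0, ..., :3].cpu().numpy() * 255).astype(np.uint8))
    imageio.mimsave('cow_turntable.gif', frames, fps=15)
```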
The Dolly Zoom is a famous camera effect, first used in the Alfred Hitchcock film Vertigo. The core idea is to change the focal length of the camera while moving the camera so that the subject stays the same size in the frame, producing a rather unsettling effect.
In this task, you will recreate this effect in PyTorch3D, producing an output that should look something like this:
You will make modifications to starter/dolly_zoom.py. You can render your gif by calling python -m starter.dolly_zoom.
On your webpage, include a gif with your dolly zoom effect.
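The geometry behind the effect: for the subject to span a constant width w in the frame, the camera distance must satisfy distance = w / (2 tan(fov / 2)), so widening the field of view requires dollying the camera in. A minimal sketch of the per-frame update (variable names such as `w` are assumptions, not necessarily the starter code's):

```python
import numpy as np
import torch
import pytorch3d.renderer

w = 2.0  # approximate subject width in world units (assumption)
fovs = torch.linspace(5, 120, 30)  # field of view per frame, in degrees
for fov in fovs:
    # Distance that keeps a subject of width `w` the same size in the frame.
    distance = w / (2 * np.tan(np.radians(fov.item()) / 2))
    R, T = pytorch3d.renderer.look_at_view_transform(dist=distance, elev=0, azim=0)
    cameras = pytorch3d.renderer.FoVPerspectiveCameras(fov=fov.item(), R=R, T=T)
    # ...render a frame with `cameras` and collect it for the gif.
```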
In this part, you will practice working with the geometry of 3D meshes. Construct a tetrahedron mesh and then render it from multiple viewpoints. Your tetrahedron does not need to be a regular tetrahedron (i.e. not all faces need to be equilateral triangles) as long as it is obvious from the renderings that the shape is a tetrahedron.
You will need to manually define the vertices and faces of the mesh. Once you have the vertices and faces, you can define a single-color texture, similar to the cow in render_cow.py. Remember that each face is a triple of vertex indices into the vertex list of the triangle mesh.
It may help to draw a picture of your tetrahedron, label the vertices, and assign 3D coordinates.
On your webpage, show a 360-degree gif animation of your tetrahedron. Also, list how many vertices and (triangle) faces your mesh should have.
We use 4 vertices and 4 faces to mesh the tetrahedron:
Vertices: torch.Size([1, 4, 3]), Faces: torch.Size([1, 4, 3])
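A minimal sketch of one such (non-regular) tetrahedron; the coordinates are an arbitrary choice:

```python
import torch

# Four vertices of a (non-regular) tetrahedron.
vertices = torch.tensor([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.5, 1.0, 0.0],
    [0.5, 0.5, 1.0],
]).unsqueeze(0)  # (1, 4, 3)

# Each face is a triple of vertex indices; 4 triangles enclose the solid.
faces = torch.tensor([
    [0, 1, 2],
    [0, 1, 3],
    [0, 2, 3],
    [1, 2, 3],
]).unsqueeze(0)  # (1, 4, 3)
```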
Construct a cube mesh and then render it from multiple viewpoints. Remember that we are still working with triangle meshes, so you will need two triangle faces to represent each square face of the cube.
On your webpage, show a 360-degree gif animation of your cube. Also, list how many vertices and (triangle) faces your mesh should have.
We use 8 vertices and 12 faces to mesh the cube:
Vertices: torch.Size([1, 8, 3]), Faces: torch.Size([1, 12, 3])
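A minimal sketch of a unit cube, splitting each square face into two triangles (the vertex ordering is an arbitrary choice):

```python
import torch

# Eight corners of a unit cube.
vertices = torch.tensor([
    [0., 0., 0.], [1., 0., 0.], [1., 1., 0.], [0., 1., 0.],  # bottom (z = 0)
    [0., 0., 1.], [1., 0., 1.], [1., 1., 1.], [0., 1., 1.],  # top (z = 1)
]).unsqueeze(0)  # (1, 8, 3)

# Two triangles per square face, 12 triangles total.
faces = torch.tensor([
    [0, 1, 2], [0, 2, 3],  # bottom
    [4, 5, 6], [4, 6, 7],  # top
    [0, 1, 5], [0, 5, 4],  # front
    [2, 3, 7], [2, 7, 6],  # back
    [1, 2, 6], [1, 6, 5],  # right
    [0, 3, 7], [0, 7, 4],  # left
]).unsqueeze(0)  # (1, 12, 3)
```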
Now let's practice re-texturing a mesh. For this task, we will be retexturing the cow mesh such that the color smoothly changes from the front of the cow to the back of the cow.
More concretely, you will pick 2 RGB colors, color1 and color2. We will assign the front of the cow a color of color1, and the back of the cow a color of color2. The front of the cow corresponds to the vertex with the smallest z-coordinate z_min, and the back of the cow corresponds to the vertex with the largest z-coordinate z_max. Then, we will assign the color of each vertex using linear interpolation based on the z-value of the vertex:

```
alpha = (z - z_min) / (z_max - z_min)
color = alpha * color2 + (1 - alpha) * color1
```
Your final output should look something like this:
In this case, color1 = [0, 0, 1] and color2 = [1, 0, 0].
In your submission, describe your choice of color1 and color2, and include a gif of the rendered mesh.
In my code, I set color1 = [0, 0, 1] (blue) and color2 = [1, 0, 0] (red), so the color varies smoothly with the z-value of each vertex. The relevant code is below.

```python
color1 = torch.tensor([0.0, 0.0, 1.0])  # blue at the front (z_min)
color2 = torch.tensor([1.0, 0.0, 0.0])  # red at the back (z_max)

# Normalize each vertex's z-coordinate to [0, 1].
z = vertices[0, :, 2]
alpha = (z - z.min()) / (z.max() - z.min())

# Linearly interpolate between the two colors per vertex: (N, 3).
texture_rgb = alpha[:, None] * color2 + (1 - alpha[:, None]) * color1
```
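These per-vertex colors then become the mesh texture. A sketch of plugging them into PyTorch3D (assuming `vertices` and `faces` are the batched cow tensors from above):

```python
import pytorch3d.renderer
import pytorch3d.structures

# Wrap the interpolated per-vertex colors as a vertex texture.
textures = pytorch3d.renderer.TexturesVertex(verts_features=texture_rgb.unsqueeze(0))
mesh = pytorch3d.structures.Meshes(verts=vertices, faces=faces, textures=textures)
```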
When working with 3D, finding a reasonable camera pose is often the first step to producing a useful visualization, and an important aid in debugging.
Running python -m starter.camera_transforms produces the following image using the camera extrinsics rotation R_0 and translation T_0:
What are the relative camera transformations that would produce each of the following output images? You should find a set (R_relative, T_relative) such that the new camera extrinsics with R = R_relative @ R_0 and T = R_relative @ T_0 + T_relative produces each of the following images.
In your report, describe in words what R_relative and T_relative should be doing, and include the rendering produced by your choice of R_relative and T_relative.
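For reference, a minimal sketch of how the relative transform composes with the starting extrinsics before rendering (the names mirror the formulas above; the rendering step itself is assumed to follow render_cow.py):

```python
import pytorch3d.renderer

# Compose the relative transform with the starting extrinsics.
R = R_relative @ R_0               # (3, 3)
T = R_relative @ T_0 + T_relative  # (3,)
cameras = pytorch3d.renderer.FoVPerspectiveCameras(
    R=R.unsqueeze(0), T=T.unsqueeze(0))
```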
First image: we rotate the camera by -90 degrees about the z-axis.

```python
relative_rotation = pytorch3d.transforms.euler_angles_to_matrix(
    torch.tensor([0, 0, -np.pi / 2]), "XYZ")
# relative_rotation:
# [[-4.3711e-08,  1.0000e+00,  0.0000e+00],
#  [-1.0000e+00, -4.3711e-08,  0.0000e+00],
#  [ 0.0000e+00,  0.0000e+00,  1.0000e+00]]
relative_translation = torch.tensor([0., 0., 3.])
```
Second image: no rotation is needed; we just translate the camera along the z-axis.

```python
relative_rotation = pytorch3d.transforms.euler_angles_to_matrix(
    torch.tensor([0., 0., 0.]), "XYZ")
# relative_rotation (identity):
# [[1., 0., 0.],
#  [0., 1., 0.],
#  [0., 0., 1.]]
relative_translation = torch.tensor([0., 0., 5.])
```
Third image: we need a small rotation about the y-axis and a small translation along the x-axis.

```python
relative_rotation = pytorch3d.transforms.euler_angles_to_matrix(
    torch.tensor([0, np.pi / 40, 0]), "XYZ")
# relative_rotation:
# [[ 0.9969, 0.0000, 0.0785],
#  [ 0.0000, 1.0000, 0.0000],
#  [-0.0785, 0.0000, 0.9969]]
relative_translation = torch.tensor([0.5, 0., 3.])
```
Fourth image: we rotate the camera by 90 degrees about the y-axis.

```python
relative_rotation = pytorch3d.transforms.euler_angles_to_matrix(
    torch.tensor([0, np.pi / 2, 0]), "XYZ")
# relative_rotation:
# [[-4.3711e-08,  0.0000e+00,  1.0000e+00],
#  [ 0.0000e+00,  1.0000e+00,  0.0000e+00],
#  [-1.0000e+00,  0.0000e+00, -4.3711e-08]]
relative_translation = torch.tensor([0., 0., 3.])
```
In this part, we will practice rendering point clouds constructed from 2 RGB-D images from the Common Objects in 3D Dataset.
In render_generic.py, the load_rgbd_data function will load the data for 2 images of the same plant. The dictionary should contain the RGB image, a depth map, a mask, and a PyTorch3D camera corresponding to the pose that the image was taken from.
You should use the unproject_depth_image function in utils.py to convert a depth image into a point cloud (parameterized as a set of 3D coordinates and corresponding color values). The unproject_depth_image function uses the camera intrinsics and extrinsics to cast a ray from every pixel in the image into world coordinate space. The ray's final distance is the depth value at that pixel, and the color of each point can be determined from the corresponding image pixel.
Construct 3 different point clouds:
1. The point cloud corresponding to the first image.
2. The point cloud corresponding to the second image.
3. The point cloud formed by the union of the first 2 point clouds.
Try visualizing each of the point clouds from various camera viewpoints. We suggest starting with cameras initialized 6 units from the origin with equally spaced azimuth values.
In your submission, include a gif of each of these point clouds side-by-side.
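A minimal sketch of building these point clouds (the dictionary keys and the exact unproject_depth_image argument order are assumptions based on the description above):

```python
import torch
import pytorch3d.structures
from starter.render_generic import load_rgbd_data
from starter.utils import unproject_depth_image

data = load_rgbd_data()

# Unproject each depth image into world-space points with per-point colors.
# The keys ("rgb1", "mask1", ...) and argument order are assumptions.
points1, colors1 = unproject_depth_image(
    torch.tensor(data["rgb1"]), torch.tensor(data["mask1"]),
    torch.tensor(data["depth1"]), data["cameras1"])
points2, colors2 = unproject_depth_image(
    torch.tensor(data["rgb2"]), torch.tensor(data["mask2"]),
    torch.tensor(data["depth2"]), data["cameras2"])

# The third point cloud is the union of the first two.
points_union = torch.cat([points1, points2], dim=0)
colors_union = torch.cat([colors1, colors2], dim=0)
pc_union = pytorch3d.structures.Pointclouds(
    points=points_union.unsqueeze(0), features=colors_union.unsqueeze(0))
```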
A parametric function generates a 3D point for each point in the source domain. For example, given an elevation theta and azimuth phi, we can parameterize the surface of a unit sphere as (sin(theta) cos(phi), cos(theta), sin(theta) sin(phi)).
By sampling values of theta and phi, we can generate a sphere point cloud. You can render a sphere point cloud by calling python -m starter.render_generic --render parametric. Note that the number of samples affects the appearance quality. Below, we show the output with a 100x100 grid of (phi, theta) pairs (--num_samples 100) as well as a 1000x1000 grid (--num_samples 1000). The latter may take a long time to run on a CPU.
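A minimal sketch of that sampling, following the parameterization above:

```python
import torch
import pytorch3d.structures

num_samples = 100
theta = torch.linspace(0, torch.pi, num_samples)    # elevation
phi = torch.linspace(0, 2 * torch.pi, num_samples)  # azimuth
theta, phi = torch.meshgrid(theta, phi, indexing="ij")

# Surface of the unit sphere, as parameterized above.
x = torch.sin(theta) * torch.cos(phi)
y = torch.cos(theta)
z = torch.sin(theta) * torch.sin(phi)
points = torch.stack([x.flatten(), y.flatten(), z.flatten()], dim=1)

# Color each point by its normalized position, just to make the geometry visible.
colors = (points + 1) / 2
sphere_pc = pytorch3d.structures.Pointclouds(
    points=points.unsqueeze(0), features=colors.unsqueeze(0))
```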
In your writeup, include a 360-degree gif of your torus point cloud, and make sure the hole is visible. You may choose to texture your point cloud however you wish.
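The handout does not spell out the torus parameterization; the standard form works, with major radius R and minor radius r (the values below are arbitrary choices):

```python
import torch

R, r = 1.0, 0.4  # major and minor radii (arbitrary choices)
num_samples = 200
theta = torch.linspace(0, 2 * torch.pi, num_samples)  # around the tube
phi = torch.linspace(0, 2 * torch.pi, num_samples)    # around the hole
theta, phi = torch.meshgrid(theta, phi, indexing="ij")

# Standard torus parameterization; r < R keeps the hole visible.
x = (R + r * torch.cos(theta)) * torch.cos(phi)
y = (R + r * torch.cos(theta)) * torch.sin(phi)
z = r * torch.sin(theta)
points = torch.stack([x.flatten(), y.flatten(), z.flatten()], dim=1)
```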
In your writeup, include a 360-degree gif of your torus mesh, and make sure the hole is visible. In addition, discuss some of the tradeoffs between rendering as a mesh vs a point cloud. Things to consider might include rendering speed, rendering quality, ease of use, memory usage, etc.
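One way to obtain the torus mesh (an assumed approach, not prescribed above) is to extract the zero level set of the implicit torus equation with marching cubes, here via the PyMCubes package:

```python
import mcubes
import numpy as np
import torch
import pytorch3d.structures

R, r = 1.0, 0.4  # major and minor radii (same arbitrary choices as above)
min_value, max_value, resolution = -1.6, 1.6, 64
xs = np.linspace(min_value, max_value, resolution)
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")

# Implicit torus: F(x, y, z) = (sqrt(x^2 + y^2) - R)^2 + z^2 - r^2 = 0.
voxels = (np.sqrt(X**2 + Y**2) - R) ** 2 + Z**2 - r**2
verts, faces = mcubes.marching_cubes(voxels, 0)

# Rescale vertices from voxel indices back to world coordinates.
verts = verts / (resolution - 1) * (max_value - min_value) + min_value
torus_mesh = pytorch3d.structures.Meshes(
    verts=[torch.tensor(verts, dtype=torch.float32)],
    faces=[torch.tensor(faces.astype(np.int64))],
)
```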
Comparison between the mesh and point cloud:
Now that you have learned to work with various 3D representations and render them, it is time to try something fun. Create your own 3D structures, render something in an interesting way, texture something creatively, or do anything else that appeals to you - the (3D) world is your oyster! If you wish to download additional meshes, Free3D is a good place to start.
I changed the texture of the cow so that it looks like a jewelry cow.