16-889 Assignment 1: Rendering Basics with PyTorch3D

Caroline Ai

1. Practicing with Cameras

1.1 360-degree Renders (5 points)

Your first task is to create a 360-degree gif video that shows many continuous views of the provided cow mesh. For many of your results this semester, you will be expected to show full turntable views of your outputs.
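A minimal sketch of how the turntable viewpoints could be laid out, using only the standard library. It assumes a y-up convention with azimuth measured from the +z axis, which is my understanding of what PyTorch3D's look_at_view_transform uses; the function name and default values here are hypothetical. In an actual render loop, the azimuth values would more likely be passed straight to look_at_view_transform rather than converted to positions by hand.

```python
import math

def turntable_camera_positions(num_views=36, dist=3.0, elev_deg=0.0):
    """Return camera centers evenly spaced in azimuth around the origin."""
    elev = math.radians(elev_deg)
    positions = []
    for i in range(num_views):
        # Exclude the 360-degree endpoint so the looping GIF has no duplicate frame.
        azim = 2.0 * math.pi * i / num_views
        x = dist * math.cos(elev) * math.sin(azim)
        y = dist * math.sin(elev)
        z = dist * math.cos(elev) * math.cos(azim)
        positions.append((x, y, z))
    return positions

# Four views at distance 3: front, side, back, other side.
positions = turntable_camera_positions(num_views=4, dist=3.0)
```

Every position sits at the same distance from the origin, so the object keeps a constant apparent size across the frames.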

cow render

1.2 Re-creating the Dolly Zoom (10 points)

The Dolly Zoom is a famous camera effect, first used in the Alfred Hitchcock film Vertigo. The core idea is to change the focal length of the camera while moving the camera so that the subject stays the same size in the frame, producing a rather unsettling effect. In this task, you will recreate this effect in PyTorch3D.
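The relationship between field of view and camera distance can be sketched as follows. For a subject of width w to span the full field of view, the camera distance must be d = w / (2 tan(fov / 2)). The function name, the subject width of 2.0, and the 120-to-5-degree sweep are assumptions chosen for illustration, not values from the assignment.

```python
import math

def dolly_zoom_distance(fov_deg, subject_width=2.0):
    """Distance at which a subject of the given width exactly spans the field of view."""
    half_fov = math.radians(fov_deg) / 2.0
    return subject_width / (2.0 * math.tan(half_fov))

# Sweep the field of view from wide (120 degrees) to narrow (5 degrees).
fovs = [120.0 - i * (115.0 / 9) for i in range(10)]
distances = [dolly_zoom_distance(fov) for fov in fovs]
```

As the field of view narrows, the computed distance grows, which is what keeps the subject the same size while the background perspective changes.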

dolly zoom

2. Practicing with Meshes

2.1 Constructing a Tetrahedron (5 points)

In this part, you will practice working with the geometry of 3D meshes. Construct a tetrahedron mesh and then render it from multiple viewpoints. Your tetrahedron does not need to be a regular tetrahedron (i.e. not all faces need to be equilateral triangles) as long as it is obvious from the renderings that the shape is a tetrahedron.

vertices = [[-1, 2, 1], [0, -1, 2], [2, 0, 0], [-2, -1, -1]]
faces = [[0,1,2], [0,1,3], [0,2,3], [1,2,3]]
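One quick sanity check, sketched here with hypothetical helper names, is to confirm that the four vertices are not coplanar. The volume of a tetrahedron is |det(v1 - v0, v2 - v0, v3 - v0)| / 6, so a nonzero determinant means the shape is a genuine tetrahedron.

```python
vertices = [[-1, 2, 1], [0, -1, 2], [2, 0, 0], [-2, -1, -1]]

def sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return [p[i] - q[i] for i in range(3)]

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

v0, v1, v2, v3 = vertices
signed_det = det3(sub(v1, v0), sub(v2, v0), sub(v3, v0))
volume = abs(signed_det) / 6.0  # zero would mean the vertices are coplanar
```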

tetrahedron

2.2 Constructing a Cube (5 points)

Construct a cube mesh and then render it from multiple viewpoints. Remember that we are still working with triangle meshes, so you will need two triangles to represent each square face of the cube.

vertices = [[0,0,0], [1,0,0], [1,1,0], [0,1,0], [0,0,1], [1,0,1], [1,1,1], [0,1,1]]
faces = [[0,1,3], [1,2,3], [1,2,5], [2,5,6], [2,3,6], [3,6,7], [5,6,7], [4,5,7], [0,3,4], [3,4,7], [0,1,5], [0,4,5]]
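A hand-written face list is easy to get wrong, so a small check can help. The sketch below (helper name is hypothetical) counts how many faces share each edge: in a closed triangle mesh every edge should belong to exactly two faces, and for a cube-like shape V - E + F should equal 2.

```python
from collections import Counter

vertices = [[0,0,0], [1,0,0], [1,1,0], [0,1,0], [0,0,1], [1,0,1], [1,1,1], [0,1,1]]
faces = [[0,1,3], [1,2,3], [1,2,5], [2,5,6], [2,3,6], [3,6,7],
         [5,6,7], [4,5,7], [0,3,4], [3,4,7], [0,1,5], [0,4,5]]

def edge_counts(faces):
    """Count how many faces use each undirected edge."""
    counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts

counts = edge_counts(faces)
is_closed = all(n == 2 for n in counts.values())
euler_characteristic = len(vertices) - len(counts) + len(faces)
```

This check ignores winding order, so it confirms the surface is closed but not that all face normals point outward.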

cube

3. Re-texturing a mesh (10 points)

Now let's practice re-texturing a mesh. For this task, we will be retexturing the cow mesh such that the color smoothly changes from the front of the cow to the back of the cow.

In this case, color1 = [255/255,248/255,220/255] and color2 = [205/255,133/255,63/255], which are respectively beige and brown.
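A minimal sketch of the blend, assuming the front-to-back direction is the z axis and that the smallest z gets color1 (both assumptions; the function name is also hypothetical). Each vertex gets alpha = (z - z_min) / (z_max - z_min) and the color alpha * color2 + (1 - alpha) * color1.

```python
color1 = [255/255, 248/255, 220/255]  # beige
color2 = [205/255, 133/255, 63/255]   # brown

def gradient_colors(z_values, color1, color2):
    """Linearly blend from color1 at the smallest z to color2 at the largest z."""
    z_min, z_max = min(z_values), max(z_values)
    colors = []
    for z in z_values:
        alpha = (z - z_min) / (z_max - z_min)
        colors.append([alpha * c2 + (1 - alpha) * c1 for c1, c2 in zip(color1, color2)])
    return colors

# Toy z coordinates standing in for the cow's vertex positions.
colors = gradient_colors([-1.0, 0.0, 1.0], color1, color2)
```

With real mesh data the same arithmetic would normally be done on the whole vertex tensor at once rather than in a Python loop.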

Cow render

4. Camera Transformations (20 points)

When working with 3D, finding a reasonable camera pose is often the first step to producing a useful visualization, and an important first step toward debugging.

Running python -m starter.camera_transforms produces the following image using the camera extrinsics rotation R_0 and translation T_0:

Cow render

Each of the following choices of R_relative and T_relative produces the transformed view shown below it:

R_relative = [[ 0., 1., 0.], [-1., 0., 0.], [ 0., 0., 1.]]

Cow render

T_relative = [0, 0, 2]

Cow render

T_relative = [ 0.5000, -0.5000, 0.0000]

Cow render

R_relative = [[ 4.3711e-08, 0.0000e+00, 1.0000e+00], [ 0.0000e+00, 1.0000e+00, 0.0000e+00], [-1.0000e+00, 0.0000e+00, 4.3711e-08]]
T_relative = [-3.0000, -0.0000, 3.0000]

Cow render
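The last transform can be checked by hand with a short sketch. This assumes the relative transform is composed with the base extrinsics as R = R_relative @ R_0 and T = R_relative @ T_0 + T_relative, and that R_0 is the identity with T_0 = [0, 0, 3]; neither the composition rule nor those base values are stated above, so treat both as assumptions. The near-zero entries 4.3711e-08 are rounded to 0 here.

```python
def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(M, v):
    """Product of a 3x3 matrix and a length-3 vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def compose(R_relative, T_relative, R_0, T_0):
    """Assumed composition: R = R_relative @ R_0, T = R_relative @ T_0 + T_relative."""
    R = matmul(R_relative, R_0)
    T = [a + b for a, b in zip(matvec(R_relative, T_0), T_relative)]
    return R, T

# Assumed base extrinsics (not given in the text above).
R_0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T_0 = [0.0, 0.0, 3.0]

# 90-degree rotation about the y axis, with the translation from the last example.
R_relative = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]
T_relative = [-3.0, 0.0, 3.0]

R, T = compose(R_relative, T_relative, R_0, T_0)
```

Under these assumptions the resulting translation is again [0, 0, 3]: the rotation moves the base translation to [3, 0, 0] and T_relative brings it back, so the cow stays centered in front of the rotated camera.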

5. Rendering Generic 3D Representations

The simplest possible 3D representation is simply a collection of 3D points, each possibly associated with a color feature. PyTorch3D provides functionality for rendering point clouds.

As with mesh rendering, we will need a Pointclouds object consisting of 3D points and colors, a camera from which to view the point cloud, and a PyTorch3D point renderer, which we have wrapped similarly to the mesh renderer.

5.1 Rendering Point Clouds from RGB-D Images (10 points)

In this part, we will practice rendering point clouds constructed from 2 RGB-D images from the Common Objects in 3D Dataset.

plant

In render_generic.py, the load_rgbd_data function will load the data for 2 images of the same plant. For each image, the returned dictionary should contain the RGB image, a depth map, a mask, and a PyTorch3D camera corresponding to the pose that the image was taken from.

You should use the unproject_depth_image function in utils.py to convert a depth image into a point cloud (parameterized as a set of 3D coordinates and corresponding color values). The unproject_depth_image function uses the camera intrinsics and extrinsics to cast a ray from every pixel in the image into world coordinate space. The ray's final distance is the depth value at that pixel, and the color of each point can be determined from the corresponding image pixel.
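To illustrate the idea of unprojection, here is a simplified pinhole-camera sketch. It is not the unproject_depth_image implementation from utils.py: it works in the camera frame only (no extrinsics), ignores PyTorch3D's coordinate conventions, and the function name and the intrinsics fx, fy, cx, cy are hypothetical parameters.

```python
def unproject_depth(depth, rgb, mask, fx, fy, cx, cy):
    """Back-project masked pixels to camera-frame 3D points with their colors."""
    points, colors = [], []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if not mask[v][u]:
                continue  # skip background pixels
            x = (u - cx) * d / fx
            y = (v - cy) * d / fy
            points.append((x, y, d))
            colors.append(rgb[v][u])
    return points, colors

# Tiny 2x2 toy image: one pixel is masked out.
depth = [[1.0, 2.0], [0.0, 4.0]]
mask = [[1, 1], [0, 1]]
rgb = [[(1, 0, 0), (0, 1, 0)], [(0, 0, 1), (1, 1, 1)]]
points, colors = unproject_depth(depth, rgb, mask, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Once two point clouds are expressed in the same world frame, their union is simply the concatenation of their point lists and color lists.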

Construct 3 different point clouds:

  1. The point cloud corresponding to the first image

plant

  2. The point cloud corresponding to the second image

plant

  3. The point cloud formed by the union of the first 2 point clouds.

plant

5.2 Parametric Functions (10 points)

A parametric function generates a 3D point for each point in the source domain.

Your task is to render a torus point cloud by sampling its parametric function.
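A sketch of the sampling step, using the standard torus parameterization x = (R + r cos θ) cos φ, y = (R + r cos θ) sin φ, z = r sin θ with θ and φ in [0, 2π). The function name, the radii R = 1.0 and r = 0.4, and the grid resolution are assumptions for illustration.

```python
import math

def sample_torus(num_samples=50, R=1.0, r=0.4):
    """Sample a num_samples x num_samples grid of points on a torus."""
    points = []
    for i in range(num_samples):
        theta = 2.0 * math.pi * i / num_samples  # angle around the tube
        for j in range(num_samples):
            phi = 2.0 * math.pi * j / num_samples  # angle around the central axis
            x = (R + r * math.cos(theta)) * math.cos(phi)
            y = (R + r * math.cos(theta)) * math.sin(phi)
            z = r * math.sin(theta)
            points.append((x, y, z))
    return points

points = sample_torus(num_samples=50)
```

A regular grid in (θ, φ) is not uniform in surface area, since points bunch up on the inner side of the ring, but it is usually dense enough for a point cloud render.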

torus

5.3 Implicit Surfaces (15 points)

In this part, we will explore representing geometry as an implicit function. In general, given a function F(x, y, z), we can define the surface to be the zero level set of F, i.e. the set of points (x, y, z) such that F(x, y, z) = 0. The function F can be a mathematical equation or even a neural network. To visualize such a representation, we can discretize the 3D space and evaluate the implicit function, storing the values in a voxel grid. Finally, to recover the mesh, we can run the marching cubes algorithm to extract the zero level set.

In practice, we can generate our voxel coordinates using torch.meshgrid and use them to query our function (in this case, a mathematical one). Once we have our voxel grid, we can use the mcubes library to convert it into a mesh.

Your task is to render a torus again, this time as a mesh defined by an implicit function.
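A sketch of the voxel-grid evaluation, assuming the torus implicit function F(x, y, z) = (sqrt(x² + y²) - R)² + z² - r², which is negative inside the tube and positive outside. The radii, grid bounds, resolution, and function names are assumptions; plain nested lists stand in for the tensor that torch.meshgrid would produce, and the marching cubes step itself is not shown.

```python
import math

def torus_implicit(x, y, z, R=1.0, r=0.4):
    """Implicit torus: zero on the surface, negative inside, positive outside."""
    return (math.sqrt(x * x + y * y) - R) ** 2 + z * z - r * r

def evaluate_grid(n=16, lo=-1.6, hi=1.6):
    """Evaluate the implicit function on an n x n x n grid spanning [lo, hi]^3."""
    coords = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return [[[torus_implicit(x, y, z) for z in coords] for y in coords] for x in coords]

grid = evaluate_grid()
flat = [value for plane in grid for row in plane for value in row]
```

The grid must contain both negative and positive values for the zero level set to pass through it; the grid bounds should therefore enclose the whole shape with a small margin.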

torus

6. Do Something Fun (10 points)

Now that you have learned to work with various 3D representations and render them, it is time to try something fun. Create your own 3D structures, or render something in an interesting way, or creatively texture, or anything else that appeals to you - the (3D) world is your oyster! If you wish to download additional meshes, Free3D is a good place to start.

corona

(Extra Credit) 7. Sampling Points on Meshes (10 points)

We will explore how to obtain point clouds from triangle meshes. One obvious way to do this is to simply discard the face information and treat the vertices as a point cloud. However, this might be unreasonable if the faces are not of equal size.

Instead, as we saw in the lectures, a solution to this problem is to use a uniform sampling of the surface using stratified sampling. The procedure is as follows:

  1. Sample a face with probability proportional to the area of the face
  2. Sample a random barycentric coordinate uniformly
  3. Compute the corresponding point using barycentric coordinates on the selected face.
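The three steps above can be sketched in plain Python as follows. The function names are hypothetical, and the square-root construction of the barycentric weights is one common way to sample uniformly inside a triangle, not necessarily the one used in the lectures. The unit cube from section 2.2 serves as a small test mesh.

```python
import math
import random

def triangle_area(a, b, c):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    e1 = [b[k] - a[k] for k in range(3)]
    e2 = [c[k] - a[k] for k in range(3)]
    cx = e1[1] * e2[2] - e1[2] * e2[1]
    cy = e1[2] * e2[0] - e1[0] * e2[2]
    cz = e1[0] * e2[1] - e1[1] * e2[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def sample_points(vertices, faces, num_samples, seed=0):
    """Sample points uniformly over the surface of a triangle mesh."""
    rng = random.Random(seed)
    areas = [triangle_area(*(vertices[i] for i in f)) for f in faces]
    # Step 1: pick faces with probability proportional to area.
    chosen = rng.choices(range(len(faces)), weights=areas, k=num_samples)
    points = []
    for f in chosen:
        a, b, c = (vertices[i] for i in faces[f])
        # Step 2: uniform barycentric weights via the square-root trick.
        s = math.sqrt(rng.random())
        v = rng.random()
        w0, w1, w2 = 1.0 - s, s * (1.0 - v), s * v
        # Step 3: combine the face's vertices with those weights.
        points.append(tuple(w0 * a[k] + w1 * b[k] + w2 * c[k] for k in range(3)))
    return points

cube_vertices = [[0,0,0], [1,0,0], [1,1,0], [0,1,0], [0,0,1], [1,0,1], [1,1,1], [0,1,1]]
cube_faces = [[0,1,3], [1,2,3], [1,2,5], [2,5,6], [2,3,6], [3,6,7],
              [5,6,7], [4,5,7], [0,3,4], [3,4,7], [0,1,5], [0,4,5]]
points = sample_points(cube_vertices, cube_faces, 1000)
```

Taking the square root of the first random number is what makes the samples uniform in area; using two raw uniform numbers directly would cluster points toward one vertex.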

For this part, write a function that takes a triangle mesh and the number of samples and outputs a point cloud. Then, using the cow mesh, randomly sample 10, 100, 1000, and 10000 points.

cow mesh

cow pointcloud

cow pointcloud

cow pointcloud

cow pointcloud