1.1. Fitting a voxel grid (5 points)

I fit the voxel grid with BCEWithLogitsLoss, then used marching cubes to convert the optimized grid to a mesh before rendering it with a PointLight at (0, 0, 0). The resulting mesh and the target mesh look the same.
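The fitting loop described above can be sketched as follows. This is a minimal stand-in, not the exact assignment code: the grid size, iteration count, and learning rate are my assumptions; only the use of BCEWithLogitsLoss on a grid of occupancy logits follows the text.

```python
import torch

def fit_voxel(target_occupancy, n_iters=200, lr=0.1):
    # Optimize a grid of logits so sigmoid(logits) matches target occupancy.
    logits = torch.zeros_like(target_occupancy, requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for _ in range(n_iters):
        opt.zero_grad()
        loss = loss_fn(logits, target_occupancy)
        loss.backward()
        opt.step()
    return torch.sigmoid(logits)  # occupancy probabilities in [0, 1]

# Toy target: a small occupied cube inside an 8^3 grid.
target = torch.zeros(8, 8, 8)
target[2:6, 2:6, 2:6] = 1.0
pred = fit_voxel(target)
```

Thresholding the predicted probabilities at 0.5 then recovers the target occupancy, which is the grid that marching cubes would be run on.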

1.2. Fitting a point cloud (10 points)

I fit the point cloud using a chamfer loss implemented with knn_points. The visualized point clouds look the same:
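A minimal version of this symmetric chamfer loss is sketched below. The report uses pytorch3d's knn_points for the nearest-neighbour query; here torch.cdist stands in for it so the sketch runs with plain PyTorch, and the point counts and optimizer settings are my assumptions.

```python
import torch

def chamfer_loss(pred, target):
    # pred: (N, 3), target: (M, 3)
    d = torch.cdist(pred, target)  # (N, M) pairwise distances
    # Nearest-neighbour distance in both directions, averaged.
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

torch.manual_seed(0)
src = torch.rand(500, 3, requires_grad=True)  # optimized point cloud
tgt = torch.rand(800, 3)                      # fixed target point cloud
opt = torch.optim.Adam([src], lr=0.01)

init_loss = chamfer_loss(src, tgt).item()
for _ in range(100):
    opt.zero_grad()
    loss = chamfer_loss(src, tgt)
    loss.backward()
    opt.step()
final_loss = chamfer_loss(src, tgt).item()
```

Optimizing the source points against this loss pulls them onto the target cloud, which is what the fitting in this section does.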

1.3. Fitting a mesh (5 points)

I added the mesh_laplacian_smoothing regularization loss. The resulting mesh and the target mesh look the same:
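The smoothing term penalizes vertices that deviate from the centroid of their neighbours. Below is a plain-PyTorch stand-in for pytorch3d.loss.mesh_laplacian_smoothing (uniform weighting); the function name and the toy tetrahedron are my own, for illustration only.

```python
import torch

def laplacian_smoothing(verts, edges):
    # verts: (V, 3); edges: (E, 2) undirected vertex-index pairs.
    V = verts.shape[0]
    idx = torch.cat([edges, edges.flip(1)], dim=0)  # both edge directions
    # Sum of neighbour positions and vertex degrees via scatter-add.
    neighbour_sum = torch.zeros_like(verts).index_add_(0, idx[:, 0], verts[idx[:, 1]])
    degree = torch.zeros(V).index_add_(0, idx[:, 0], torch.ones(idx.shape[0]))
    # Uniform Laplacian: neighbour centroid minus the vertex itself.
    lap = neighbour_sum / degree.clamp(min=1).unsqueeze(1) - verts
    return lap.norm(dim=1).mean()

# Toy example: a regular tetrahedron.
verts = torch.tensor([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
edges = torch.tensor([[0, 1], [0, 2], [0, 3], [1, 2], [1, 3], [2, 3]])
loss = laplacian_smoothing(verts, edges)
```

Pulling one vertex far from the surface raises this loss, so adding it to the chamfer term discourages spiky, tangled geometry.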

2.1. Image to voxel grid (15 points)

I used a decoder consisting of transposed 3D convolution layers connected by ReLU(), as demonstrated in the course slides, to predict the voxels.
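A sketch of such a decoder is below. The channel counts, feature dimension, and seed volume size are my assumptions; only the overall shape (transposed 3D convolutions joined by ReLU, upsampling to a 32^3 grid of occupancy logits) follows the text.

```python
import torch
import torch.nn as nn

class VoxelDecoder(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        # Project the image feature into a small 4^3 feature volume.
        self.fc = nn.Linear(feat_dim, 128 * 4 * 4 * 4)
        # Each ConvTranspose3d(k=4, s=2, p=1) doubles the spatial size.
        self.net = nn.Sequential(
            nn.ConvTranspose3d(128, 64, kernel_size=4, stride=2, padding=1),  # 4 -> 8
            nn.ReLU(),
            nn.ConvTranspose3d(64, 32, kernel_size=4, stride=2, padding=1),   # 8 -> 16
            nn.ReLU(),
            nn.ConvTranspose3d(32, 1, kernel_size=4, stride=2, padding=1),    # 16 -> 32
        )

    def forward(self, feat):
        x = self.fc(feat).view(-1, 128, 4, 4, 4)
        return self.net(x)  # (B, 1, 32, 32, 32) occupancy logits

logits = VoxelDecoder()(torch.randn(2, 512))
```

The output logits are trained with the same BCEWithLogitsLoss as in section 1.1.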

Here I show visuals of three examples from the test set. For each example I show the input RGB image, a render of the predicted 3D voxel grid, and a render of the ground-truth mesh.

Voxel 0

GT Mesh 0

GT Rendering 0

Voxel 0 GT Mesh 0 GT Rendering 0

Voxel 100

GT Mesh 100

GT Rendering 100

Voxel 100 GT Mesh 100 GT Rendering 100

Voxel 200

GT Mesh 200

GT Rendering 200

Voxel 200 GT Mesh 200 GT Rendering 200

2.2. Image to point cloud (15 points)

The decoder I defined is simply a linear layer.
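That single-layer decoder can be sketched as follows; the feature dimension and n_points are my assumptions.

```python
import torch
import torch.nn as nn

class PointDecoder(nn.Module):
    def __init__(self, feat_dim=512, n_points=1000):
        super().__init__()
        self.n_points = n_points
        # One linear layer maps the image feature to n_points * 3 coordinates.
        self.fc = nn.Linear(feat_dim, n_points * 3)

    def forward(self, feat):
        return self.fc(feat).view(-1, self.n_points, 3)  # (B, n_points, 3)

points = PointDecoder()(torch.randn(2, 512))
```

The predicted points are then supervised with the chamfer loss from section 1.2.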

Here I include visuals of three examples from the test set. For each example I show the input RGB image, a render of the predicted 3D point cloud, and a render of the ground-truth mesh.

PC 0

GT Mesh 0

GT Rendering 0

PC 0 GT Mesh 0 GT Rendering 0

PC 100

GT Mesh 100

GT Rendering 100

PC 100 GT Mesh 100 GT Rendering 100

PC 200

GT Mesh 200

GT Rendering 200

PC 200 GT Mesh 200 GT Rendering 200

2.3. Image to mesh (15 points)

I used a 4-layer MLP with ReLU activations and a final Tanh layer.
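A sketch of this decoder is below, assuming it predicts per-vertex offsets of the initial ico_sphere(4) (which has 2562 vertices); the hidden sizes and the offset formulation are my assumptions, while the 4 linear layers, ReLU activations, and final Tanh follow the text.

```python
import torch
import torch.nn as nn

class MeshDecoder(nn.Module):
    def __init__(self, feat_dim=512, n_verts=2562):  # ico_sphere(4) has 2562 vertices
        super().__init__()
        self.n_verts = n_verts
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, 2048), nn.ReLU(),
            nn.Linear(2048, n_verts * 3), nn.Tanh(),  # offsets bounded in (-1, 1)
        )

    def forward(self, feat, init_verts):
        offsets = self.net(feat).view(-1, self.n_verts, 3)
        return init_verts + offsets  # deformed vertex positions

init = torch.rand(1, 2562, 3)
verts = MeshDecoder()(torch.randn(1, 512), init)
```

The Tanh keeps each offset within a unit range, so the predicted mesh stays a bounded deformation of the initial sphere.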

I include visuals of three examples from the test set. For each example I show the input RGB image, a render of the predicted mesh, and a render of the ground-truth mesh.

Mesh 0

GT Mesh 0

GT Rendering 0

Mesh 0 GT Mesh 0 GT Rendering 0

Mesh 100

GT Mesh 100

GT Rendering 100

Mesh 100 GT Mesh 100 GT Rendering 100

Mesh 200

GT Mesh 200

GT Rendering 200

Mesh 200 GT Mesh 200 GT Rendering 200

2.4. Quantitative comparisons (10 points)

Quantitatively compare the F1 score of 3D reconstruction for meshes vs. point clouds vs. voxel grids. Provide an intuitive explanation justifying the comparison.

Here I include the average test F1 score at the 0.05 threshold for the voxel-grid, point-cloud, and mesh networks.

Fitting the point cloud gives the most accurate results in terms of F1, and fitting voxels gives the worst. This makes sense: for point-cloud fitting, we directly use the chamfer loss to optimize the reconstructed point locations, and F1 measures exactly whether predicted points lie close to ground-truth points. Mesh fitting also performs reasonably well because we likewise apply the chamfer loss to points sampled from the mesh. For voxels, however, we do not optimize sampled point locations at all. Another reason might be that the voxel resolution (32x32x32) is too coarse to capture precise surface locations.
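The F1@0.05 metric referenced here can be sketched as follows: a predicted point counts as correct if it lies within the distance threshold of some ground-truth point (precision), and vice versa (recall). This is a plain-PyTorch stand-in for the assignment's evaluation code, with torch.cdist replacing the nearest-neighbour query.

```python
import torch

def f1_score(pred, gt, thresh=0.05):
    # pred: (N, 3) predicted points; gt: (M, 3) ground-truth points.
    d = torch.cdist(pred, gt)                                  # (N, M)
    precision = (d.min(dim=1).values < thresh).float().mean()  # pred -> gt
    recall = (d.min(dim=0).values < thresh).float().mean()     # gt -> pred
    return 100 * 2 * precision * recall / (precision + recall + 1e-8)

pts = torch.rand(1000, 3)
perfect = f1_score(pts, pts)  # identical clouds score (near) 100
```

Because the metric only checks point proximity, representations optimized directly with a chamfer loss on point locations are favoured, which is consistent with the ranking below.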

Method                 Average Test F1@0.05
Voxel Fitting          51.014
Point Cloud Fitting    90.299
Mesh Fitting           83.621

2.5. Analyse effects of hyperparameter variations (10 points)

Analyse the results by varying a hyperparameter of your choice, for example n_points, vox_size, w_chamfer, or the initial mesh (ico_sphere). Try to be unique and conclusive in your analysis.

I varied the initial mesh shape while fitting the mesh. Besides ico_sphere(4), I also tried initializing from (1) a torus, (2) the cow from assignment 1, and (3) a dolphin mesh downloaded from the PyTorch3D library.

GT Mesh 0

Mesh from sphere 0

Mesh from torus 0

Mesh from cow 0

Mesh from dolphin 0

GT Mesh 0 Mesh 0 Mesh torus 0 Mesh cow 0 Mesh dolphin 0
GT Mesh 100 Mesh 100 Mesh torus 100 Mesh cow 100 Mesh dolphin 100
GT Mesh 200 Mesh 200 Mesh torus 200 Mesh cow 200 Mesh dolphin 200

I also calculated the average F1 score. The torus initialization achieves the best result, presumably because a torus is closer in topology to chairs, which sometimes have holes:

Initialization         Average Test F1@0.05
Mesh from sphere       83.621
Mesh from torus        86.503
Mesh from dolphin      85.418
Mesh from cow          83.221

2.6. Interpret your model (15 points)

Simply seeing final predictions and numerical evaluations is not always insightful. Can you create some visualizations that help highlight what your learned model does? Be creative and think of what visualizations would help you gain insights. There is no 'right' answer - although reading some papers to get inspiration might give you ideas.

I plotted the voxels with probability greater than 0.5, coloured by their probability values in the JET colormap. The voxels appear well fitted; in particular, the backs and legs of the chairs have very high probability values (red means high, green means low). However, the periphery of the chairs/sofas generally has lower probability values. This suggests that most high-probability voxels are concentrated towards the object centroid, which may result from a low-capacity decoder that underfits the variety of chair shapes.
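The colouring step can be sketched as below: keep voxels above the 0.5 threshold and map each kept probability through matplotlib's jet colormap. The function name and toy grid are my own; the actual rendering of the coloured voxels is omitted.

```python
import numpy as np
import matplotlib.cm as cm

def colour_voxels(probs, thresh=0.5):
    # probs: (D, D, D) array of occupancy probabilities in [0, 1].
    mask = probs > thresh
    ijk = np.argwhere(mask)        # integer coordinates of voxels to draw
    p = probs[mask]                # their probability values
    colours = cm.jet(p)[:, :3]     # map probabilities to RGB (drop alpha)
    return ijk, colours

# Toy grid: a 4^3 occupied block with probability 0.9.
probs = np.zeros((8, 8, 8))
probs[2:6, 2:6, 2:6] = 0.9
coords, cols = colour_voxels(probs)
```

Red (high p) vs. green/blue (low p) then makes the decoder's confidence pattern visible across the object, as described above.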

Probability Voxel 0

Probability Voxel 100

Probability Voxel 500
