I used three late days for this assignment.
To run: `python fit_data.py`
Ground Truth | Optimized Voxel Grid |
---|---|
![]() | ![]() |
To run: `python fit_data.py --type 'point'`
Ground Truth | Optimized Point Cloud |
---|---|
![]() | ![]() |
To run: `python fit_data.py --type 'mesh'`
Ground Truth | Optimized Mesh |
---|---|
![]() | ![]() |
Train: `python train_model.py --type 'vox'`

Evaluate: `python eval_model.py --type 'vox' --load_checkpoint`
RGB Image | Ground Truth | Predicted Voxel Grid |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
Train: `python train_model.py --type 'point'`

Evaluate: `python eval_model.py --type 'point' --load_checkpoint`
RGB Image | Ground Truth | Predicted Point Cloud |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
Train: `python train_model.py --type 'mesh'`

Evaluate: `python eval_model.py --type 'mesh' --load_checkpoint`
RGB Image | Ground Truth | Predicted Mesh |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
To evaluate, run: `python eval_model.py --type vox|point|mesh --load_checkpoint`
Method | Avg Test F1 Score |
---|---|
Voxels | 77.453 |
Point Cloud | 93.154 |
Mesh | 93.553 |
Voxels performed the worst in terms of F1, while point clouds and meshes performed similarly. The point cloud and mesh losses sample points from their respective representations, which lets them express thin structures better than a fixed-resolution voxel grid. With voxels, a thin structure is either missed entirely or predicted much thicker than it is, producing more false positives.
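For context, these F1 scores compare point sets sampled from the prediction and the ground truth at a fixed distance threshold. A minimal sketch of this style of metric (the function name and threshold are illustrative assumptions, not the graded evaluation code):

```python
import torch

def f1_score(pred_pts, gt_pts, threshold=0.05):
    """F1 between two point sets at a distance threshold (scaled to 0-100)."""
    d = torch.cdist(pred_pts, gt_pts)  # (P, G) pairwise distances
    # Precision: fraction of predicted points near some ground-truth point.
    precision = (d.min(dim=1).values < threshold).float().mean()
    # Recall: fraction of ground-truth points near some predicted point.
    recall = (d.min(dim=0).values < threshold).float().mean()
    return (100.0 * 2 * precision * recall / (precision + recall + 1e-8)).item()
```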
I tried different values of `w_smooth` to investigate its effect on the predicted outputs. Increasing the smoothing weight effectively reduces the relative weight of the chamfer loss, so the model cares less about matching the target point cloud and more about keeping the mesh smooth. This is clearly visible below for the different smoothing weights.
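Concretely, the objective has the form `chamfer + w_smooth * laplacian`. A minimal sketch assuming PyTorch3D's stock losses (`mesh_loss` and its defaults are illustrative; the actual training code may differ):

```python
from pytorch3d.loss import chamfer_distance, mesh_laplacian_smoothing
from pytorch3d.ops import sample_points_from_meshes

def mesh_loss(pred_mesh, gt_points, w_smooth=100.0, n_samples=5000):
    # Sample the predicted mesh so it can be compared to the ground-truth
    # point cloud with a chamfer distance.
    pred_points = sample_points_from_meshes(pred_mesh, n_samples)
    loss_chamfer, _ = chamfer_distance(pred_points, gt_points)
    # Laplacian smoothing penalizes spiky, irregular surfaces.
    loss_smooth = mesh_laplacian_smoothing(pred_mesh, method="uniform")
    # A larger w_smooth effectively downweights the chamfer term.
    return loss_chamfer + w_smooth * loss_smooth
```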
Smoothness | Example 1 | Example 2 | Example 3 | Example 4 | Example 5 |
---|---|---|---|---|---|
0.1 | ![]() | ![]() | ![]() | ![]() | ![]() |
100.0 | ![]() | ![]() | ![]() | ![]() | ![]() |
1000.0 | ![]() | ![]() | ![]() | ![]() | ![]() |
Quantitatively, higher smoothness weights performed progressively worse:
Smoothness Weight | Avg Test F1 Score |
---|---|
0.1 | 93.553 |
100.0 | 91.254 |
1000.0 | 88.124 |
Visually, a smoothness weight of 100.0 looks best overall: it has fewer pointy artifacts than 0.1 while still capturing more detail than 1000.0. In effect, this hyperparameter acts as a regularizer.
We can record the outputs of each layer in the decoder and cluster the resulting feature vectors across the entire test set using L2 distance; a sketch of this procedure is shown below. Running k-means on the second-to-last decoder layer with 10 clusters yielded the clusters in the following table.
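A minimal sketch of the feature-collection and clustering step, assuming the decoder is an `nn.Sequential` and that `model` and `test_loader` are in scope (the layer index and names are illustrative):

```python
import torch
from sklearn.cluster import KMeans

features = []

def hook(module, inputs, output):
    # Record the activations of the hooked layer for every test batch.
    features.append(output.detach().flatten(start_dim=1).cpu())

# Hook the second-to-last layer of the decoder.
handle = model.decoder[-2].register_forward_hook(hook)
with torch.no_grad():
    for images, _ in test_loader:
        model(images)
handle.remove()

# k-means with Euclidean (L2) distance over the collected feature vectors.
labels = KMeans(n_clusters=10).fit_predict(torch.cat(features).numpy())
```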
Cluster | Example 1 | Example 2 | Example 3 | Example 4 | Example 5 |
---|---|---|---|---|---|
1 | ![]() | ![]() | ![]() | ![]() | ![]() |
2 | ![]() | ![]() | ![]() | ![]() | ![]() |
3 | ![]() | ![]() | ![]() | ![]() | ![]() |
4 | ![]() | ![]() | ![]() | ![]() | ![]() |
The chairs in each row belong to the same cluster and are judged similar to one another. Some clusters are more interpretable than others: cluster 1 tends to group chairs of similar size and orientation, while cluster 4 groups chairs whose legs splay farther apart. While these are not perfectly "visible" features, the model considers these chairs similar in feature space.
Implement an implicit decoder that takes 3D locations as input and outputs occupancy values. Some papers for inspiration: [1,2].
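As a rough illustration of the idea (not the graded implementation; class name and layer sizes are assumptions), such a decoder can be an MLP that conditions 3D query points on the image feature:

```python
import torch
import torch.nn as nn

class ImplicitDecoder(nn.Module):
    """MLP mapping (image feature, 3D query point) -> occupancy logit."""

    def __init__(self, feat_dim=512, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # one occupancy logit per query point
        )

    def forward(self, feat, points):
        # feat: (B, feat_dim) image encoding; points: (B, N, 3) 3D queries.
        feat = feat.unsqueeze(1).expand(-1, points.shape[1], -1)
        return self.mlp(torch.cat([feat, points], dim=-1)).squeeze(-1)
```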
Train: `python train_model.py --type 'parametric'`

Evaluate: `python eval_model.py --type 'parametric' --load_checkpoint`
I sample points on a plane and sum the outputs of 3 different MLPs, which together predict the 3D points. The average test F1 score was 0.906 (i.e., 90.6 on the scale used above). Note that this model was only trained for 1,000 steps, unlike the previous models, which were trained for 10,000 steps. Performance could potentially improve with larger MLPs; however, the size and complexity of the model would also grow substantially.
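A minimal sketch of this parametric decoder, under the same caveats (class name, layer sizes, and point count are illustrative):

```python
import torch
import torch.nn as nn

class ParametricDecoder(nn.Module):
    """Sum of three MLPs mapping (image feature, uv on a plane) -> 3D point."""

    def __init__(self, feat_dim=512, hidden=256, n_mlps=3):
        super().__init__()
        self.mlps = nn.ModuleList(
            nn.Sequential(
                nn.Linear(feat_dim + 2, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3),
            )
            for _ in range(n_mlps)
        )

    def forward(self, feat, n_points=2000):
        # Sample uv points on the unit plane, one set per batch element.
        uv = torch.rand(feat.shape[0], n_points, 2, device=feat.device)
        x = torch.cat([feat.unsqueeze(1).expand(-1, n_points, -1), uv], dim=-1)
        # Sum the per-MLP outputs to produce the predicted point cloud.
        return torch.stack([mlp(x) for mlp in self.mlps]).sum(dim=0)
```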
RGB Image | Ground Truth | Predicted Point Cloud |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |