Assignment 4
NAME: Hiresh Gupta
ANDREW ID: hireshg
Late days used: 1
Q1. Classification Model (40 points)
Usage:
```bash
python train.py --task cls --checkpoint_dir ./checkpoints
python eval_cls.py --load_checkpoint best_model --checkpoint_dir ./checkpoints --save_visuals --output_dir ./output/q1
```
Test Accuracy of best model: 97.27%
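For reference, below is a minimal sketch of how the reported test accuracy could be computed from the model's outputs. The `model` and `test_loader` objects and the tensor shapes are assumptions for illustration, not the actual internals of `eval_cls.py`.

```python
import torch

@torch.no_grad()
def evaluate_cls(model, test_loader, device="cuda"):
    """Fraction of test objects whose predicted class matches the label."""
    model.eval()
    correct, total = 0, 0
    for points, labels in test_loader:       # points: (B, N, 3), labels: (B,)
        points, labels = points.to(device), labels.to(device)
        logits = model(points)               # (B, num_classes) class scores
        preds = logits.argmax(dim=-1)        # most likely class per object
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total
```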
Visualizations (Correct Predictions)
| Point Cloud | Ground Truth | Prediction |
|---|---|---|
| ![]() | Chair | Chair |
| ![]() | Chair | Chair |
| ![]() | Chair | Chair |
| ![]() | Vase | Vase |
| ![]() | Vase | Vase |
| ![]() | Vase | Vase |
| ![]() | Lamp | Lamp |
| ![]() | Lamp | Lamp |
| ![]() | Lamp | Lamp |
Visualizations (Failure cases)
| Point Cloud | Ground Truth | Prediction |
|---|---|---|
| ![]() | Vase | Lamp |
| ![]() | Vase | Lamp |
| ![]() | Vase | Lamp |
| ![]() | Lamp | Vase |
| ![]() | Lamp | Vase |
| ![]() | Lamp | Vase |
Interpretation:
Based on the above results, the model does a good job, classifying 97.27% of the test objects correctly. On checking the failure cases, I found that none of the chair examples were misclassified by the classification network. This might be because the chair class has the most training examples and is somewhat simpler to distinguish than vases and lamps.
The remaining failure cases show that the model's few mistakes all involve confusing vases and lamps. The misclassified lamp examples have a neck-like structure or a vase-like base that looks very similar to some examples in the vase class, and the misclassified vase examples show the same kind of ambiguity in the opposite direction.
Q2. Segmentation Model (40 points)
Test accuracy of best model: 89.95%
Usage:
```bash
python train.py --task seg --checkpoint_dir ./checkpoints
python eval_seg.py --load_checkpoint best_model --checkpoint_dir ./checkpoints --output_dir ./output/q2
```
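The per-example accuracies in the tables below are per-point accuracies for a single shape. Here is a minimal sketch of that metric, assuming the model returns per-point logits of shape `(1, N, num_seg_classes)`; this is an illustrative assumption, not the actual `eval_seg.py` code.

```python
import torch

@torch.no_grad()
def per_shape_accuracy(model, points, labels, device="cuda"):
    """Per-point accuracy for one shape: fraction of points labeled correctly."""
    model.eval()
    points = points.to(device).unsqueeze(0)      # (1, N, 3)
    labels = labels.to(device)                   # (N,) ground-truth part labels
    logits = model(points)                       # (1, N, num_seg_classes)
    preds = logits.argmax(dim=-1).squeeze(0)     # (N,) predicted part per point
    return (preds == labels).float().mean().item()
```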
Visualizations (Accurate Predictions)
| Point Cloud (Ground Truth) | Prediction | Accuracy |
|---|---|---|
| ![]() | ![]() | 0.9958 |
| ![]() | ![]() | 0.9956 |
| ![]() | ![]() | 0.9955 |
Visualizations (Less Accurate Predictions)
| Point Cloud (Ground Truth) | Prediction | Accuracy |
|---|---|---|
| ![]() | ![]() | 0.423 |
| ![]() | ![]() | 0.457 |
| ![]() | ![]() | 0.463 |
Observation:
I have visualized some of the accurate and less accurate predictions above. As we can see, the model does quite well on chairs that have a well-separated backrest, armrests, seat, and legs. Visualizing some of the mispredictions shows that the model performs poorly on chair shapes that deviate from the standard form. The first two examples look like sofa chairs and would be somewhat ambiguous to segment even for human annotators. This ambiguity, together with high intra-class variation and fewer training examples for such shapes, is likely among the reasons for the poor performance.
Q3. Robustness Analysis (20 points)
Experiment 1: Varying the number of points
To test the robustness of the model, I experimented with reducing the number of sampled points to 10, 50, 100, 500, 1000, and 5000 (with 10000 as the full-resolution baseline). These experiments can be reproduced by running the following commands:
Usage:
```bash
# Sample evaluation command to vary the number of points in Point Classification
python eval_cls.py --load_checkpoint best_model --num_points <num_points>

# Sample evaluation command to vary the number of points in Point Segmentation
python eval_seg.py --load_checkpoint best_model --output_dir ./output/q3/num_points --num_points <num_points>
```
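Under the hood, reducing `--num_points` amounts to keeping a subset of each test cloud. Below is a minimal sketch of one way to do this with random index sampling; the function name and the exact sampling strategy are assumptions about how the flag could be implemented, not verified code.

```python
import torch

def subsample_points(points, num_points):
    """Randomly keep `num_points` of the N points in each cloud.

    points: (B, N, 3). Returns the subsampled clouds and the chosen indices;
    for segmentation, the per-point labels must be indexed with the same
    `idx` so that points and labels stay aligned.
    """
    idx = torch.randperm(points.shape[1], device=points.device)[:num_points]
    return points[:, idx, :], idx
```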
1. Robustness Analysis for Classification
The following table summarizes classification performance as the number of points is reduced.
| Number of Points | Test Accuracy |
|---|---|
| 10 | 0.4302 |
| 50 | 0.8363 |
| 100 | 0.9139 |
| 500 | 0.9632 |
| 1000 | 0.9706 |
| 5000 | 0.9727 |
| **10000** | **0.9727** |
Interpretation:
As we decrease the number of points, the classification accuracy drops only by a small margin. The model achieves 91.39% accuracy with just 100 points, which illustrates its robustness. On reducing the number of points further to 50 and 10, we see a sharp drop in accuracy. This is expected, because it is very hard to capture an object's geometry with so few points.
2. Robustness Analysis for Segmentation
The following table summarizes segmentation performance as the number of points is reduced.
| Number of Points | Test Accuracy | Visualization (GT left vs. prediction right) |
|---|---|---|
| 10 | 0.679 | ![]() ![]() |
| 50 | 0.7556 | ![]() ![]() |
| 100 | 0.7875 | ![]() ![]() |
| 500 | 0.8626 | ![]() ![]() |
| 1000 | 0.8854 | ![]() ![]() |
| 5000 | 0.8987 | ![]() ![]() |
| 10000 | 0.8995 | ![]() ![]() |
Interpretation:
Given the above results, the segmentation model is also fairly robust to the number of points: accuracy drops only slightly even when using just 5% of the points (500 of the 10000). With fewer points, the model still performs well on well-defined chairs, but it struggles to segment challenging chairs with ambiguous part boundaries.
Experiment 2: Rotating the test data
In this experiment, I rotated the test point clouds about the z-axis while keeping the number of points constant at 10000. The results below can be reproduced by running the following commands:
Usage:
```bash
# Sample evaluation command to rotate the test point clouds in Point Classification
python eval_cls.py --load_checkpoint best_model --rot_angle <rotation_angle_in_degrees>

# Sample evaluation command to rotate the test point clouds in Point Segmentation
python eval_seg.py --load_checkpoint best_model --output_dir ./output/q3/rot --rot_angle <rotation_angle_in_degrees>
```
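For completeness, here is a minimal sketch of applying a z-axis rotation to the test clouds. The function name and calling convention are assumptions for illustration; the rotation matrix itself is the standard one for rotation about the z-axis.

```python
import math
import torch

def rotate_z(points, angle_deg):
    """Rotate point clouds about the z-axis by `angle_deg` degrees.

    points: (..., 3) tensor of xyz coordinates; returns a rotated copy.
    """
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    R = torch.tensor([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]],
                     dtype=points.dtype, device=points.device)  # z-axis rotation matrix
    return points @ R.T        # (..., 3) @ (3, 3) -> rotated points
```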
1. Robustness Analysis for Classification
The table below reports the test accuracy achieved at each rotation angle.
| Rotation Angle (degrees) | Test Accuracy |
|---|---|
| 0 | 0.9727 |
| -5 | 0.9653 |
| 5 | 0.9601 |
| -10 | 0.9601 |
| 10 | 0.9391 |
| -20 | 0.9202 |
| 20 | 0.8352 |
| -30 | 0.8037 |
| 30 | 0.7397 |
| -90 | 0.3252 |
| 90 | 0.3578 |
Interpretation:
As evident from the above results, the model is not robust to rotations of the point clouds: as the rotation angle increases in either direction (clockwise or anticlockwise), the model's accuracy degrades steadily.
2. Robustness Analysis for Segmentation
The table below reports the test accuracy achieved at each rotation angle.
| Rotation Angle (degrees) | Test Accuracy | Visualization (GT left vs. prediction right) |
|---|---|---|
| 0 | 0.8995 | ![]() ![]() |
| -10 | 0.8613 | ![]() ![]() |
| -20 | 0.7797 | ![]() ![]() |
| -30 | 0.6872 | ![]() ![]() |
| -60 | 0.5181 | ![]() ![]() |
| -90 | 0.3902 | ![]() ![]() |
| 10 | 0.8504 | ![]() ![]() |
| 20 | 0.7586 | ![]() ![]() |
| 30 | 0.6707 | ![]() ![]() |
| 60 | 0.5175 | ![]() ![]() |
| 90 | 0.4049 | ![]() ![]() |
Interpretation:
Similar to the classification model, accuracy falls as the rotation angle increases. As we can see from the above visualizations, the model assigns segmentation labels largely based on each point's absolute location and is not robust to rotation. This is expected, because the model was never exposed to such transformations during training.