Assignment 4

NAME: Hiresh Gupta

ANDREW ID: hireshg

Late days used: 1

Q1. Classification Model (40 points)

Usage:

python train.py --task cls --checkpoint_dir ./checkpoints
python eval_cls.py --load_checkpoint best_model --checkpoint_dir ./checkpoints --save_visuals --output_dir ./output/q1

Test Accuracy of best model: 97.27%

Visualizations (Correct Predictions)

| Point Cloud | Ground Truth | Prediction |
| --- | --- | --- |
| (image) | Chair | Chair |
| (image) | Chair | Chair |
| (image) | Chair | Chair |
| (image) | Vase | Vase |
| (image) | Vase | Vase |
| (image) | Vase | Vase |
| (image) | Lamp | Lamp |
| (image) | Lamp | Lamp |
| (image) | Lamp | Lamp |

Visualizations (Failure cases)

| Point Cloud | Ground Truth | Prediction |
| --- | --- | --- |
| (image) | Vase | Lamp |
| (image) | Vase | Lamp |
| (image) | Vase | Lamp |
| (image) | Lamp | Vase |
| (image) | Lamp | Vase |
| (image) | Lamp | Vase |

Interpretation:

Based on the above results, the model performs well, classifying 97.27% of the test objects correctly. On inspecting the failure cases, I found that no chair examples were misclassified. This is likely because the chair class has the largest number of training examples and is geometrically easier to distinguish than vases and lamps.

The remaining failure cases are confusions between vases and lamps. The misclassified examples in both classes share ambiguous geometry: the mispredicted examples from one class have a neck-like structure that closely resembles examples in the Vase class, while the mispredicted lamps have a vase-like base.

Q2. Segmentation Model (40 points)

Test accuracy of best model: 89.95%

Usage:

python train.py --task seg --checkpoint_dir ./checkpoints
python eval_seg.py --load_checkpoint best_model --checkpoint_dir ./checkpoints --output_dir ./output/q2
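For reference, the segmentation accuracy reported here is per-point: every point of every test shape counts once. A minimal sketch of this metric (illustrative only, assuming integer part-label arrays; not the assignment's exact eval code):

```python
import numpy as np

def seg_accuracy(pred: np.ndarray, gt: np.ndarray) -> float:
    """Fraction of points whose predicted part label matches the ground truth."""
    return float((pred == gt).mean())

# Toy example: 3 of 4 points labeled correctly
pred = np.array([0, 1, 1, 2])
gt = np.array([0, 1, 2, 2])
print(seg_accuracy(pred, gt))  # 0.75
```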

Visualizations (Accurate Predictions)

| Point Cloud (Ground Truth) | Prediction | Accuracy |
| --- | --- | --- |
| (image) | (image) | 0.9958 |
| (image) | (image) | 0.9956 |
| (image) | (image) | 0.9955 |

Visualizations (Less Accurate Predictions)

| Point Cloud (Ground Truth) | Prediction | Accuracy |
| --- | --- | --- |
| (image) | (image) | 0.423 |
| (image) | (image) | 0.457 |
| (image) | (image) | 0.463 |

Observation:

I have visualized some accurate and less accurate predictions above. As we can see, the model does quite well on chairs with a well-separated backrest, armrests, seat, and legs. Looking at the mispredictions, the model performs poorly on chair shapes that deviate from the standard shape. The first two examples resemble a sofa chair and would be somewhat ambiguous to segment even for human annotators. This ambiguity, together with high intra-class variation (and fewer training examples for such shapes), likely explains the poor performance.

Q3. Robustness Analysis (20 points)

Experiment 1: Varying the number of points

To test the model's robustness, I reduced the number of sampled points to 10, 50, 100, 500, 1000, and 5000 (10000 points is the full evaluation setting). These experiments can be reproduced by running the following commands:

Usage:

# Sample evaluation command to vary the number of points in Point Classification
python eval_cls.py --load_checkpoint best_model --num_points <num_points>

# Sample evaluation command to vary the number of points in Point Segmentation
python eval_seg.py --load_checkpoint best_model --output_dir ./output/q3/num_points --num_points <num_points>
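The `--num_points` flag boils down to subsampling each test cloud before evaluation. A minimal sketch of random subsampling (assuming an (N, 3) NumPy array; the actual data loader may sample differently):

```python
import numpy as np

def subsample(points: np.ndarray, num_points: int, seed: int = 0) -> np.ndarray:
    """Randomly keep num_points rows of an (N, 3) point cloud, without replacement."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(points.shape[0], size=num_points, replace=False)
    return points[idx]

cloud = np.random.default_rng(1).random((10000, 3))
small = subsample(cloud, 100)
print(small.shape)  # (100, 3)
```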

1. Robustness Analysis for Classification

The following table summarizes the model's performance as the number of points is reduced.

| Number of Points | Test Accuracy |
| --- | --- |
| 10 | 0.4302 |
| 50 | 0.8363 |
| 100 | 0.9139 |
| 500 | 0.9632 |
| 1000 | 0.9706 |
| 5000 | 0.9727 |
| 10000 | 0.9727 |

Interpretation:

As we decrease the number of points, the classification accuracy drops only slightly at first. The model still achieves 91.39% accuracy with just 100 points, which illustrates its robustness. Reducing the points further to 50 and 10 causes a sharp drop in accuracy. This is expected, because so few points cannot capture the object's geometry.

2. Robustness Analysis for Segmentation

The following table summarizes the model's performance as the number of points is reduced.

| Number of Points | Test Accuracy | Visualization (GT left vs. prediction right) |
| --- | --- | --- |
| 10 | 0.679 | (image) |
| 50 | 0.7556 | (image) |
| 100 | 0.7875 | (image) |
| 500 | 0.8626 | (image) |
| 1000 | 0.8854 | (image) |
| 5000 | 0.8987 | (image) |
| 10000 | 0.8995 | (image) |

Interpretation:

Given the above results, the segmentation model is also robust to the number of points: accuracy drops only slightly even when using just 5% of the points. With fewer points, the model still performs well on well-defined chairs, but it struggles to segment challenging chairs with ambiguous part boundaries.

Experiment 2: Rotating the test data

In this experiment, I rotated the test point clouds about the z-axis while keeping the number of points fixed at 10000. The results below can be reproduced by running the following commands:

Usage:

# Sample evaluation command to rotate the test data in Point Classification
python eval_cls.py --load_checkpoint best_model --rot_angle <rotation_angle_in_degrees>

# Sample evaluation command to rotate the test data in Point Segmentation
python eval_seg.py --load_checkpoint best_model --output_dir ./output/q3/rot --rot_angle <rotation_angle_in_degrees>
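The z-axis rotation applied by the `--rot_angle` flag amounts to multiplying each point by a standard rotation matrix. A sketch in NumPy (illustrative; the actual evaluation code may operate on PyTorch tensors instead):

```python
import numpy as np

def rotate_z(points: np.ndarray, angle_deg: float) -> np.ndarray:
    """Rotate an (N, 3) point cloud about the z-axis by angle_deg degrees."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    # Standard z-axis rotation matrix, applied to row-vector points as points @ R.T
    R = np.array([[c, -s, 0.0],
                  [s, c, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T

# A 90-degree rotation maps (1, 0, 0) to (0, 1, 0); z-coordinates are unchanged
p = rotate_z(np.array([[1.0, 0.0, 0.0]]), 90.0)
```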

1. Robustness analysis for Classification

The table below reports the test accuracy for different rotation angles (in degrees).

| Rotation Angle (degrees) | Test Accuracy |
| --- | --- |
| 0 | 0.9727 |
| -5 | 0.9653 |
| 5 | 0.9601 |
| -10 | 0.9601 |
| 10 | 0.9391 |
| -20 | 0.9202 |
| 20 | 0.8352 |
| -30 | 0.8037 |
| 30 | 0.7397 |
| -90 | 0.3252 |
| 90 | 0.3578 |

Interpretation:

As evident from the above results, the model is not robust to rotation of the point clouds. As the rotation angle increases in either direction (clockwise or anticlockwise), the model's accuracy degrades sharply.

2. Robustness analysis for Segmentation

The table below reports the test accuracy for different rotation angles (in degrees).

| Rotation Angle (degrees) | Test Accuracy | Visualization (GT left vs. prediction right) |
| --- | --- | --- |
| 0 | 0.8995 | (image) |
| -10 | 0.8613 | (image) |
| -20 | 0.7797 | (image) |
| -30 | 0.6872 | (image) |
| -60 | 0.5181 | (image) |
| -90 | 0.3902 | (image) |
| 10 | 0.8504 | (image) |
| 20 | 0.7586 | (image) |
| 30 | 0.6707 | (image) |
| 60 | 0.5175 | (image) |
| 90 | 0.4049 | (image) |

Interpretation:

Similar to the classification model, accuracy falls as the rotation angle increases. As we can see from the above visualizations, the model assigns segmentation labels largely based on each point's absolute location and is not robust to rotation. This is expected, because the model was never exposed to such transformations during training.