Assignment 5
Submitted by: Naveen Venkat (nvenkat)
Late Days: 2

1. Classification Model (40 points)
1.1 Implementation Details
The classification model has been implemented in models.py
and largely follows the description of the model in Qi et al., CVPR 2017 (PointNet). The model contains two blocks of shared MLPs, followed by a max-pool operation across the points of each input cloud, and finally an MLP classifier. Each MLP layer uses a ReLU activation followed by BatchNorm, and the pre-classifier layer contains a dropout layer.
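For concreteness, the described block structure could look roughly like the following PyTorch sketch. Layer widths are borrowed from the PointNet paper and the 3-class setting of this assignment is assumed; the actual models.py may differ in details.

```python
import torch
import torch.nn as nn

class ClsModelSketch(nn.Module):
    """Hypothetical sketch of the classification network described above."""
    def __init__(self, num_classes=3):
        super().__init__()
        # Shared (per-point) MLP blocks: Conv1d with kernel size 1 acts as a
        # point-wise linear layer; ReLU is followed by BatchNorm as described.
        self.mlp1 = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(), nn.BatchNorm1d(64),
            nn.Conv1d(64, 64, 1), nn.ReLU(), nn.BatchNorm1d(64),
        )
        self.mlp2 = nn.Sequential(
            nn.Conv1d(64, 128, 1), nn.ReLU(), nn.BatchNorm1d(128),
            nn.Conv1d(128, 1024, 1), nn.ReLU(), nn.BatchNorm1d(1024),
        )
        # MLP classifier with dropout before the final classification layer.
        self.classifier = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(), nn.BatchNorm1d(512),
            nn.Linear(512, 256), nn.ReLU(), nn.BatchNorm1d(256),
            nn.Dropout(p=0.3),
            nn.Linear(256, num_classes),
        )

    def forward(self, points):
        x = points.transpose(1, 2)          # (B, N, 3) -> (B, 3, N) for Conv1d
        x = self.mlp2(self.mlp1(x))         # (B, 1024, N) per-point features
        x = torch.max(x, dim=2).values      # (B, 1024) global feature via max-pool over points
        return self.classifier(x)           # (B, num_classes) class logits
```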
To train the model, run:
python3 train.py --task cls --num_workers 4
To evaluate the model, run:
python3 eval_cls.py --load_checkpoint=best_model --output_dir=output
1.2 Accuracy
Accuracy of the best model: 97.90 %
1.3 Visualizations: Correct Predictions
Index | GT Class | Predicted Class | Visualization |
---|---|---|---|
201 | 0 | 0 | ![]() |
616 | 0 | 0 | ![]() |
219 | 0 | 0 | ![]() |
20 | 1 | 1 | ![]() |
39 | 1 | 1 | ![]() |
49 | 1 | 1 | ![]() |
61 | 2 | 2 | ![]() |
78 | 2 | 2 | ![]() |
154 | 2 | 2 | ![]() |
1.4 Visualizations: Mis-predictions
Index | GT Class | Predicted Class | Visualization |
---|---|---|---|
406 | 0 | 2 | ![]() |
56 | 1 | 2 | ![]() |
108 | 2 | 1 | ![]() |
1.5 Interpretation
As can be seen from Sections 1.3 and 1.4 above (class indices: 0 = chair, 1 = vase, 2 = lamp), the predictions are correct for regularly shaped objects (e.g. chairs in their usual configuration). Mispredictions occur for irregular shapes: the first chair example in Section 1.4 is folded and is mispredicted as a lamp. Further, the model sometimes confuses vases and lamps - example 56 looks like a chandelier (lamp) whereas its GT class is vase, and example 108 resembles a wide pot (vase) whereas its GT class is lamp. Nevertheless, the high accuracy (~98%) suggests that the model is close to a human annotator on this test set.
2. Segmentation Model (40 points)
2.1 Implementation Details
The segmentation model has been implemented in models.py
and largely follows the description of the model in Qi et al., CVPR 2017 (PointNet). The model contains two blocks of shared MLPs, followed by a max-pool operation across the points of each input cloud, and then further shared MLPs that produce per-point segmentation logits. Each MLP layer uses a ReLU activation followed by BatchNorm. The per-point (local) features are concatenated with the global feature before being passed to the segmentation head.
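A rough PyTorch sketch of this local/global feature fusion is shown below. Layer widths again follow the PointNet paper and num_seg_classes=6 is an assumed default; the actual models.py may differ.

```python
import torch
import torch.nn as nn

class SegModelSketch(nn.Module):
    """Hypothetical sketch of the segmentation network described above."""
    def __init__(self, num_seg_classes=6):
        super().__init__()
        self.mlp1 = nn.Sequential(          # per-point (local) features
            nn.Conv1d(3, 64, 1), nn.ReLU(), nn.BatchNorm1d(64),
            nn.Conv1d(64, 64, 1), nn.ReLU(), nn.BatchNorm1d(64),
        )
        self.mlp2 = nn.Sequential(          # lifts local features to 1024-d before pooling
            nn.Conv1d(64, 128, 1), nn.ReLU(), nn.BatchNorm1d(128),
            nn.Conv1d(128, 1024, 1), nn.ReLU(), nn.BatchNorm1d(1024),
        )
        # Segmentation head operates on [local (64) || global (1024)] = 1088-d per-point features.
        self.seg_head = nn.Sequential(
            nn.Conv1d(1088, 512, 1), nn.ReLU(), nn.BatchNorm1d(512),
            nn.Conv1d(512, 256, 1), nn.ReLU(), nn.BatchNorm1d(256),
            nn.Conv1d(256, num_seg_classes, 1),
        )

    def forward(self, points):
        x = points.transpose(1, 2)                                              # (B, 3, N)
        local_feat = self.mlp1(x)                                               # (B, 64, N)
        pooled = torch.max(self.mlp2(local_feat), dim=2, keepdim=True).values   # (B, 1024, 1)
        global_feat = pooled.expand(-1, -1, x.shape[2])                         # broadcast to (B, 1024, N)
        fused = torch.cat([local_feat, global_feat], dim=1)                     # (B, 1088, N)
        return self.seg_head(fused).transpose(1, 2)                             # (B, N, num_seg_classes) logits
```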
To train the model, run:
python3 train.py --task seg --num_workers 4
To evaluate the model, run:
python3 eval_seg.py --load_checkpoint=best_model --exp_name=q2_eval
2.2 Accuracy
Accuracy of the best model: 89.85 %
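Accuracy here is presumably the mean per-point labeling accuracy over the test set; a minimal sketch of such a metric, assuming (B, N, C) logits and (B, N) ground-truth labels (the actual eval script may compute it differently):

```python
import torch

def seg_accuracy(pred_logits, gt_labels):
    """Mean per-point accuracy (sketch). pred_logits: (B, N, C), gt_labels: (B, N)."""
    pred_labels = pred_logits.argmax(dim=-1)           # (B, N) predicted part label per point
    return (pred_labels == gt_labels).float().mean()   # fraction of correctly labeled points
```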
2.3 Visualizations: Good Predictions
Index | Accuracy (%) | GT Segmentation | Predicted Segmentation |
---|---|---|---|
562 | 99.28 | ![]() | ![]() |
591 | 99.29 | ![]() | ![]() |
297 | 99.35 | ![]() | ![]() |
397 | 99.42 | ![]() | ![]() |
471 | 99.67 | ![]() | ![]() |
2.4 Visualizations: Bad Predictions
Index | Accuracy (%) | GT Segmentation | Predicted Segmentation |
---|---|---|---|
26 | 47.99 | ![]() | ![]() |
351 | 48.95 | ![]() | ![]() |
235 | 51.12 | ![]() | ![]() |
96 | 51.83 | ![]() | ![]() |
255 | 51.99 | ![]() | ![]() |
2.5 Interpretation
Here, we find that the good predictions usually come from regular chairs - those with four thin legs and a back-rest, and no additional objects on them. These are the easiest to segment as their parts are well distinguishable. The poorly segmented chairs are more complex and contain other objects, e.g. a pillow, which get misclassified. For example, object 26 contains a head-rest and arm-rests which are difficult for the model to segment properly. Likewise, for object 351, the model predicts the bottom half as the base, whereas it is labeled as arm-rests in the ground truth.
3. Robustness Analysis (20 points)
3.1 Classification Model
I conducted two experiments:
- Varying the number of sampled points
- Rotating the world orientation
3.1.1 Robustness to the number of points
The model is evaluated with a varying number of sampled points, summarized as follows:
#points | 10 | 20 | 50 | 100 | 200 | 500 | 1k | 2k | 5k | 10k |
---|---|---|---|---|---|---|---|---|---|---|
Accuracy (%) | 44.39 | 43.65 | 80.79 | 92.86 | 94.75 | 96.85 | 96.74 | 97.48 | 98.01 | 97.90 |
This suggests that the model is quite robust to the number of sampled points: even with two orders of magnitude fewer points it is able to extract the global shape (92.86% at 100 points versus 97.90% at 10k points).
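A minimal sketch of how a reduced point budget might be drawn per object for this evaluation, assuming random sampling without replacement (the actual eval_cls_robust.py may sample differently):

```python
import torch

def subsample_points(points, num_points):
    """Randomly keep `num_points` points from each cloud (sketch).
    points: (B, N, 3); returns (B, num_points, 3)."""
    idx = torch.randperm(points.shape[1])[:num_points]  # indices of the retained points
    return points[:, idx]
```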
Implementation. To reproduce the results, run
python3 eval_cls_robust.py --load_checkpoint=best_model --num_points=10 \
--output_dir=output_robust/cls/num_points_10
modifying the path to the dumped weights (best_model.pt is chosen by default). The following script is used to obtain results for the table above:
./q3_cls_eval.sh
The results along with sample outputs will be saved in output_robust/cls/num_points_10.
3.1.2 Robustness to the world orientation
The model is evaluated with rotated point clouds. Here, I generated a 3x3 rotation matrix for rotation about the z-axis, with angles (in degrees) in [-180, -90, -40, -20, -10, 0, 10, 20, 40, 90, 180], as shown below (degrees=0 denotes no rotation).
degrees | -180 | -90 | -40 | -20 | -10 | 0 | 10 | 20 | 40 | 90 | 180 |
---|---|---|---|---|---|---|---|---|---|---|---|
Accuracy (%) | 55.19 | 24.87 | 64.22 | 92.34 | 95.48 | 97.90 | 95.38 | 88.66 | 41.44 | 22.03 | 55.19 |
The following can be observed from the table above:
- With increasing angular deviation, the accuracy falls rapidly. For small deviations (up to 10 degrees) there is only a small drop (~2%), but for larger deviations (especially right-angle rotations) the accuracy takes a big hit.
- Clockwise and anticlockwise rotations have different effects on the accuracy (compare 40 and -40 degrees, for instance). This suggests there is asymmetry in the objects which affects recognition performance.
- A simple sanity check from the table: the accuracies at 180 and -180 degrees are identical, as both rotations yield the same configuration.
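For reference, a minimal sketch of constructing and applying such a z-axis rotation, assuming points are stored as an (N, 3) or (B, N, 3) tensor (the actual eval_cls_robust_rotate.py may build the rotation differently):

```python
import math
import torch

def rotate_about_z(points, degrees):
    """Rotate a point cloud about the z-axis by `degrees` (sketch)."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    R = torch.tensor([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]], dtype=points.dtype)  # 3x3 rotation matrix about z
    return points @ R.T  # rotate each row-vector point: p' = R p
```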
Implementation. To reproduce the results, run
python3 eval_cls_robust_rotate.py --load_checkpoint=best_model \
--degrees=10 \
--output_dir=output_robust_rotate/cls/degrees_10
modifying the path to the dumped weights (best_model.pt is chosen by default). The following script is used to obtain results for the table above:
./q3_cls_eval_rotate.sh
The results along with sample outputs will be saved in output_robust_rotate/cls/degrees_10.
3.2 Segmentation Model
Similar to the classification model, I conducted two experiments:
- Varying the number of sampled points
- Rotating the world orientation
3.2.1 Robustness to the number of points
The model is evaluated with a varying number of sampled points, summarized as follows:
#points | 10 | 20 | 50 | 100 | 200 | 500 | 1k | 2k | 5k | 10k |
---|---|---|---|---|---|---|---|---|---|---|
Accuracy (%) | 65.17 | 74.83 | 79.34 | 81.56 | 84.57 | 88.19 | 89.53 | 89.80 | 89.88 | 89.85 |
The model is mostly robust once the number of sampled points exceeds 500, staying within about 2% of the full-resolution accuracy.
Implementation. To reproduce the results, run
python3 eval_seg_robust.py --load_checkpoint=best_model \
--num_points=10 \
--output_dir=output_robust/seg/num_points_10
modifying the path to the dumped weights (best_model.pt is chosen by default). The following script is used to obtain results for the table above:
./q3_seg_eval.sh
The results along with sample outputs will be saved in output_robust/seg/num_points_10.
3.2.2 Robustness to the world orientation
The model is evaluated with rotated point clouds. As in Section 3.1.2, I generated a 3x3 rotation matrix for rotation about the z-axis, with angles (in degrees) in [-180, -90, -40, -20, -10, 0, 10, 20, 40, 90, 180], as shown below (degrees=0 denotes no rotation).
degrees | -180 | -90 | -40 | -20 | -10 | 0 | 10 | 20 | 40 | 90 | 180 |
---|---|---|---|---|---|---|---|---|---|---|---|
Accuracy (%) | 36.33 | 34.39 | 61.25 | 78.62 | 86.86 | 89.85 | 86.14 | 76.13 | 60.49 | 45.54 | 36.33 |
The following can be observed from the table above:
- With increasing angular deviation, the accuracy falls rapidly. For small deviations (up to 10 degrees) there is only a small drop (~3%), but for larger deviations (especially right-angle rotations) the accuracy takes a big hit.
- Clockwise and anticlockwise rotations have different effects on the accuracy (compare 90 and -90 degrees, for instance). This again suggests asymmetry in the objects.
- A simple sanity check from the table: the accuracies at 180 and -180 degrees are identical, as both rotations yield the same configuration.
Implementation. To reproduce the results, run
python3 eval_seg_robust_rotate.py --load_checkpoint=best_model \
--degrees=10 \
--output_dir=output_robust_rotate/seg/degrees_10
modifying the path to the dumped weights (best_model.pt is chosen by default). The following script is used to obtain results for the table above:
./q3_seg_eval_rotate.sh
The results along with sample outputs will be saved in output_robust_rotate/seg/degrees_10.