Assignment 4

Name: Mayank Agarwal

Andrew ID: mayankag

Late Days Used: 0


Q1. Classification Model (40 points)

```bash
# Training Command
python train.py --task cls
```

Accuracy

```bash
# Evaluation Command
python eval_cls.py
```

Test Accuracy of best model: 97.69%

Visualizations (Correct Predictions)

| Point Cloud | Ground Truth | Prediction |
| --- | --- | --- |
| point_cloud | Chair | Chair |
| point_cloud | Chair | Chair |
| point_cloud | Chair | Chair |
| point_cloud | Vase | Vase |
| point_cloud | Vase | Vase |
| point_cloud | Vase | Vase |
| point_cloud | Lamp | Lamp |
| point_cloud | Lamp | Lamp |
| point_cloud | Lamp | Lamp |

Visualizations (Failure Cases)

| Point Cloud | Ground Truth | Prediction |
| --- | --- | --- |
| point_cloud | Chair | Lamp |
| point_cloud | Vase | Lamp |
| point_cloud | Vase | Lamp |
| point_cloud | Vase | Lamp |
| point_cloud | Lamp | Vase |
| point_cloud | Lamp | Vase |
| point_cloud | Lamp | Vase |

Observations

Only one chair was misclassified (as a lamp), as shown in the visualizations above. That sample has no seating platform and could easily be mistaken for something other than a chair. The model's remaining confusion is between lamps and vases: lamps are frequently predicted as vases and vice versa. Overall, the chair category is geometrically distinct from lamps and vases and is therefore easier to classify. Another reason for the strong accuracy on chairs may be data imbalance: I observed that the training set contains more chairs than the other classes. Lamps and vases, on the other hand, share a lot of structural similarity, and both categories are more diverse in shape than chairs, which likely compounds the network's confusion.
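One way to quantify this lamp/vase confusion is a confusion matrix over the test predictions. A minimal sketch, assuming `labels` and `preds` are integer class-index arrays (0 = Chair, 1 = Vase, 2 = Lamp; these names and arrays are hypothetical, not part of the provided starter code):

```python
import numpy as np

def confusion_matrix(labels, preds, num_classes=3):
    """Rows are ground-truth classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(labels, preds):
        cm[t, p] += 1
    return cm

# Toy example mirroring the failure cases above: one chair predicted
# as lamp, and lamps/vases confused in both directions.
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
preds  = np.array([0, 0, 2, 1, 2, 2, 2, 1, 1])
print(confusion_matrix(labels, preds))
```

Off-diagonal mass concentrated in the vase/lamp cells would confirm that the two categories account for most of the errors.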

Q2. Segmentation Model (40 points)

```bash
# Training Command
python train.py --task seg
```

Accuracy

```bash
# Evaluation Command
python eval_seg.py
```

Test Accuracy of best model: 90.35%

Visualizations (Most Accurate Predictions)

| Prediction (left) vs. Ground Truth (right) | Accuracy |
| --- | --- |
| pred_gt_segmentation | 99.56% |
| pred_gt_segmentation | 99.55% |
| pred_gt_segmentation | 99.54% |

Visualizations (Least Accurate Predictions)

| Prediction (left) vs. Ground Truth (right) | Accuracy |
| --- | --- |
| pred_gt_segmentation | 42.36% |
| pred_gt_segmentation | 46.75% |
| pred_gt_segmentation | 48.74% |

Observations

Above, I have visualized the most and least accurate predictions separately. Well-defined chairs (close to the mean chair) have clear part segmentation into back, armrest, seat, legs, etc.; these are easy to segment and achieve the highest per-object accuracy, as shown in the first table. More complicated chairs with loosely defined armrests, backrests, and seats have poor segmentation accuracy. Given the inherent ambiguity, it is difficult even for humans to segment such designer chairs (or sofas) consistently, and the same can be expected of the network.
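The best and worst examples above were selected by ranking per-object accuracy. A sketch of that ranking, assuming `preds` and `labels` are `(N, num_points)` arrays of per-point part labels (the shapes and names are my assumption, not the starter code's API):

```python
import numpy as np

def rank_by_accuracy(preds, labels):
    """Return per-object accuracy and object indices sorted worst-to-best."""
    per_object_acc = (preds == labels).mean(axis=1)  # fraction of correct points
    order = np.argsort(per_object_acc)               # ascending: worst first
    return per_object_acc, order

# Toy example: 3 objects, 4 points each.
labels = np.array([[0, 0, 1, 1], [0, 1, 1, 2], [2, 2, 2, 2]])
preds  = np.array([[0, 0, 1, 1], [0, 1, 2, 2], [0, 0, 2, 2]])
acc, order = rank_by_accuracy(preds, labels)
print(acc)    # per-object accuracies: 1.0, 0.75, 0.5
print(order)  # worst object first
```

Taking the head and tail of `order` then gives the objects to visualize.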

Q3. Robustness Analysis (20 points)

I performed the following two robustness experiments:

  1. Evaluating on fewer input points per object (points are randomly subsampled).
  2. Rotating the point clouds about the z-axis by varying angles and observing the effect on classification and segmentation.
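The first perturbation can be implemented by randomly subsampling point indices at evaluation time. A minimal sketch (the `(N, 3)` point-cloud layout is an assumption about the data format):

```python
import numpy as np

def subsample_points(points, num_points, rng=None):
    """Randomly keep `num_points` points from an (N, 3) point cloud."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(points.shape[0], size=num_points, replace=False)
    return points[idx]

cloud = np.random.rand(10000, 3)
small = subsample_points(cloud, 156)
print(small.shape)  # (156, 3)
```

Sampling without replacement ensures no point is counted twice, so the subsampled cloud is an unbiased thinning of the original.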

Robustness Analysis for Classification

```bash
python eval_cls.py --num_points NUM_POINTS --rot ROTATION_DEGREE
```

The accuracy from Q1 appears as the first row of each table below.

Changing number of points for evaluation (keeping rotation fixed)

| Number of points per object | Rotation (degrees) | Test Accuracy (%, best model) |
| --- | --- | --- |
| 10000 | 0 | 97.69 |
| 5000 | 0 | 97.80 |
| 2500 | 0 | 97.69 |
| 1250 | 0 | 97.69 |
| 625 | 0 | 96.75 |
| 312 | 0 | 96.54 |
| 156 | 0 | 94.65 |
| 78 | 0 | 87.20 |
| 40 | 0 | 67.47 |
| 16 | 0 | 32.53 |

Interpretation

As we decrease the number of points, accuracy drops only by a small margin down to num_points = 312, then falls substantially as we decrease further. This is expected: 156 points are not sufficient to capture the geometry of these objects. As the failure cases below show, it is difficult even for humans to classify them correctly; for example, too few points are sampled on the seats of chairs (a key feature), making it hard for the network to classify them correctly.

Visualizing failure cases

With very few points (num_points = 156), the model makes noticeably more mistakes. Some chairs are now incorrectly predicted as vases or lamps; these same chairs were classified correctly when more points were used.

| Point Cloud | Ground Truth | Prediction |
| --- | --- | --- |
| point_cloud | Chair | Vase |
| point_cloud | Chair | Lamp |
| point_cloud | Chair | Lamp |

On further decreasing the number of input points (num_points=78), some vases are incorrectly predicted as chairs.

| Point Cloud | Ground Truth | Prediction |
| --- | --- | --- |
| point_cloud | Vase | Chair |

Changing rotation (keeping number of points fixed)

| Number of points per object | Rotation (degrees) | Test Accuracy (%, best model) |
| --- | --- | --- |
| 10000 | 0 | 97.69 |
| 10000 | 5 | 97.17 |
| 10000 | -5 | 97.17 |
| 10000 | 10 | 96.54 |
| 10000 | -10 | 95.59 |
| 10000 | 20 | 84.58 |
| 10000 | -20 | 91.08 |
| 10000 | 30 | 51.10 |
| 10000 | -30 | 76.18 |
| 10000 | 90 | 23.82 |
| 10000 | -90 | 24.24 |

Interpretation

From the above results, the network is clearly sensitive to rotation: there is a considerable drop in classification accuracy once the point clouds are rotated by 20 degrees (about the z-axis). This is intuitive, since a rotated chair may be closer to the mean lamp orientation (which has some hanging parts). I have also visualized a few failure cases below.
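The rotation perturbation used in these experiments is a standard rotation matrix about the z-axis applied to every point. A sketch, assuming the point cloud is an `(N, 3)` array (my assumption about the data layout):

```python
import numpy as np

def rotate_z(points, degrees):
    """Rotate an (N, 3) point cloud about the z-axis by `degrees`."""
    theta = np.deg2rad(degrees)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c,  -s,  0.0],
                  [s,   c,  0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T

p = np.array([[1.0, 0.0, 0.5]])
print(rotate_z(p, 90))  # the x-axis point maps onto the y-axis; z unchanged
```

Because the rotation is rigid, only the orientation of the shape changes, isolating the network's sensitivity to pose from any change in geometry.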

Visualizing failure cases

When we rotate the point clouds by 20 degrees, some chairs are misclassified as lamps. These same chairs were correctly classified when the orientation was axis-aligned.

| Point Cloud | Ground Truth | Prediction |
| --- | --- | --- |
| point_cloud | Chair | Lamp |
| point_cloud | Chair | Lamp |
| point_cloud | Chair | Lamp |

Robustness Analysis for Segmentation

```bash
python eval_seg.py --num_points NUM_POINTS --rot ROTATION_DEGREE
```

The accuracy from Q2 appears as the first row of each table below.

Changing number of points for evaluation (keeping rotation fixed)

| Number of points per object | Rotation (degrees) | Test Accuracy (%, best model) | Best Prediction (pred left vs. GT right) | Worst Prediction (pred left vs. GT right) |
| --- | --- | --- | --- | --- |
| 10000 | 0 | 90.35 | point_cloud | point_cloud |
| 5000 | 0 | 90.31 | point_cloud | point_cloud |
| 2500 | 0 | 90.12 | point_cloud | point_cloud |
| 1250 | 0 | 89.67 | point_cloud | point_cloud |
| 625 | 0 | 88.40 | point_cloud | point_cloud |
| 312 | 0 | 85.87 | point_cloud | point_cloud |
| 156 | 0 | 82.14 | point_cloud | point_cloud |
| 78 | 0 | 78.48 | point_cloud | point_cloud |
| 40 | 0 | 75.31 | point_cloud | point_cloud |
| 16 | 0 | 72.18 | point_cloud | point_cloud |

Interpretation

Given the above results, segmentation is fairly robust to the number of input points. Even with very few points, the model correctly segments well-defined chairs with clearly visible parts (see the second-to-last column). As the number of points decreases, the model continues to perform well on well-defined chairs but struggles on difficult chairs with ambiguous parts.

Changing rotation (keeping number of points fixed)

| Number of points per object | Rotation (degrees) | Test Accuracy (%, best model) | Best Prediction (pred left vs. GT right) | Worst Prediction (pred left vs. GT right) |
| --- | --- | --- | --- | --- |
| 10000 | 0 | 90.35 | point_cloud | point_cloud |
| 10000 | 5 | 89.35 | point_cloud | point_cloud |
| 10000 | -5 | 89.39 | point_cloud | point_cloud |
| 10000 | 10 | 86.54 | point_cloud | point_cloud |
| 10000 | -10 | 86.72 | point_cloud | point_cloud |
| 10000 | 20 | 78.36 | point_cloud | point_cloud |
| 10000 | -20 | 77.99 | point_cloud | point_cloud |
| 10000 | 30 | 70.28 | point_cloud | point_cloud |
| 10000 | -30 | 69.58 | point_cloud | point_cloud |
| 10000 | 90 | 43.85 | point_cloud | point_cloud |
| 10000 | -90 | 41.21 | point_cloud | point_cloud |

Interpretation

Up to a certain angle, the segmentation model is robust to rotation as well. However, its accuracy drops considerably when the point clouds are rotated by 90 degrees. The model appears to assign part labels largely based on position rather than structure: it labels the top as backrest (light blue), the middle as seat (red), and the bottom as legs (dark blue) even for rotated point clouds. This is expected, since the model never saw these transformations during training. A good fix would be to apply rotation augmentations at training time.
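A minimal sketch of such an augmentation, applying one random z-rotation to each batch during training (hooking this into train.py's data loader is left as an assumption; the `(B, N, 3)` batch layout is also assumed):

```python
import numpy as np

def augment_random_z_rotation(batch, rng=None):
    """Apply one random z-axis rotation to a (B, N, 3) batch of point clouds."""
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c,  -s,  0.0],
                  [s,   c,  0.0],
                  [0.0, 0.0, 1.0]])
    return batch @ R.T  # broadcasting applies R to every point in the batch

batch = np.random.rand(4, 1024, 3)
aug = augment_random_z_rotation(batch)
print(aug.shape)  # (4, 1024, 3)
```

Since part labels are per-point and a rigid rotation does not reorder points, the segmentation targets need no adjustment when this augmentation is applied.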