Assignment 5

Late Days: 3


Q1. Classification Model

Training:

python train.py --task cls

Evaluation (with visualization):

python eval_cls.py --load_checkpoint best_model --visualize

The model checkpoints will be stored in ./checkpoints/cls and visualizations in ./output/cls.

Test accuracy of best model: 96.85 %

Correct Predictions

Input Point Cloud    Correct Prediction
(image)              Lamp
(image)              Lamp
(image)              Vase
(image)              Vase
(image)              Chair
(image)              Chair

Incorrect Predictions

Interestingly, all the chairs in the test set were correctly classified by the model. The only misclassified examples were from the lamp and vase categories.

Input Point Cloud    Ground Truth    Prediction
(image)              Lamp            Vase
(image)              Lamp            Vase
(image)              Vase            Lamp
(image)              Vase            Lamp

Interpretation

The fact that no chairs were misclassified suggests that the distribution of chairs is the most distinct of the three classes, well separated from lamps and vases. Indeed, browsing through the dataset, this does appear to be the case visually.

The inter-class confusion between the lamp and vase categories can be attributed to overlap between the distributions of these two classes. For instance, looking at the first two point clouds, even I as a human am inclined to label them as vases. I don't see a bulb or any other source of illumination in either point cloud and am left wondering whether the label itself is incorrect.

Even for the fourth point cloud, I am almost evenly split on whether it is a vase or a lamp. The bulbous shape of the vase certainly looks like something that could emit light in a lamp. Similarly, the flower/blossom in the third point cloud looks very similar to a bulb; if it weren't for the leaf behind it, I might also have labelled it a lamp.

Q2. Segmentation Model

Training:

python train.py --task seg

Evaluation (with visualization):

python eval_seg.py --load_checkpoint best_model --visualize

The model checkpoints will be stored in ./checkpoints/seg and visualizations in ./output/seg. Each output gif's prefix is the number of correctly classified points in the point cloud.
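As a rough sketch of how such a prefix could be produced (the helper name gif_name and the example labels below are hypothetical, not part of the starter code):

```python
import torch

def gif_name(pred_labels: torch.Tensor, gt_labels: torch.Tensor, out_dir: str = "./output/seg") -> str:
    """Build the output gif path, prefixed with the number of correctly
    segmented points. pred_labels / gt_labels: (N,) per-point class indices."""
    num_correct = int((pred_labels == gt_labels).sum().item())
    return f"{out_dir}/{num_correct}_pred.gif"

# Hypothetical example: 10000 points, 6 part labels, 500 points mislabelled
gt = torch.randint(0, 6, (10000,))
pred = gt.clone()
pred[:500] = (pred[:500] + 1) % 6  # corrupt 500 predictions
print(gif_name(pred, gt))          # ./output/seg/9500_pred.gif
```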

Test accuracy of best model: 89.92 %

Good Predictions

Ground Truth    Prediction    Accuracy
(image)         (image)       99.72 %
(image)         (image)       99.56 %
(image)         (image)       99.53 %
(image)         (image)       99.50 %
(image)         (image)       99.24 %

Bad Predictions

Ground Truth    Prediction    Accuracy
(image)         (image)       41.52 %
(image)         (image)       54.21 %
(image)         (image)       55.25 %
(image)         (image)       62.49 %

Interpretation

Firstly, all of the "bad" predictions are of chairs that differ considerably from what one would imagine as a canonical/typical chair, and thus appear to be outliers with respect to the distribution of chairs in our dataset. Secondly, the definition of the different segments of a chair seems somewhat ambiguous from instance to instance, which even I as a human have difficulty resolving.

For instance, in the first image, the model predicted the lower half of the chair as "base", whereas the ground truth shows it as part of the armrest. This seems like an unnatural choice to me, because as humans we would be more likely to segment the bottom half of this chair as the base rather than an extension of the armrest.

This phenomenon shows up again in the second image, where the model predicts the armrests to extend much further down the object, but is penalized because the ground truth labels that lower region as the base.

The third point cloud appears to be a clear outlier because its structure does not naturally decompose into a base, headrest, armrest, etc., the way the other chairs in the dataset do.

Finally, the fourth point cloud is also an out-of-distribution example: it is a folded chair, whereas most of the other objects show a chair fully unfolded.

Q3. Robustness Analysis

3.1 Vary number of points per object

Evaluate the models with a different number of points per object (--num_points) than the model was trained on.
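A minimal sketch of what this subsampling could look like, assuming the evaluation batch is a (B, N, 3) tensor; the actual sampling in eval_cls.py / eval_seg.py may differ (e.g. it may take the first NUM points rather than a random subset):

```python
import torch

def subsample_points(points: torch.Tensor, num_points: int) -> torch.Tensor:
    """Randomly keep num_points points per object.
    points: (B, N, 3) batch of point clouds with N >= num_points."""
    idx = torch.randperm(points.shape[1])[:num_points]  # one random subset shared by the batch
    return points[:, idx, :]

# Example: evaluate with 2500 of the original 10000 points per object
batch = torch.randn(4, 10000, 3)
print(subsample_points(batch, 2500).shape)  # torch.Size([4, 2500, 3])
```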

Experiment 1

Classification model

python eval_cls.py --load_checkpoint best_model --num_points NUM

where NUM is one of the values listed in the table below.

Num Points Accuracy
10000 96.85 %
8000 96.64 %
5000 96.85 %
2500 96.95 %
1000 96.01 %

The accuracy from Q1 is in the first row of the table above.

Experiment 2

Segmentation model

python eval_seg.py --load_checkpoint best_model --num_points NUM

where NUM is one of the values listed in the table below.

Num Points Accuracy
10000 89.92 %
8000 89.92 %
5000 89.93 %
2500 89.66 %
1000 88.74 %

The accuracy from Q2 is in the first row of the table above.

Thus, we conclude that both the classification and segmentation models are fairly robust to the number of sampled points.

3.2 Rotate the input point clouds

During evaluation, I rotated the points about the X-axis because the points are fairly spread out along this axis. Rotating about the Y-axis did not visibly change the point clouds much, and rotating about the Z-axis produced orientations similar to rotating about the X-axis. The rotation is performed using a PyTorch3D transform object.
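A minimal sketch of this rotation step, assuming the test batch is a (B, N, 3) tensor; the transform used here (RotateAxisAngle) is one of PyTorch3D's Transform3d subclasses, though the actual script may construct the transform differently:

```python
import torch
from pytorch3d.transforms import RotateAxisAngle

def rotate_points(points: torch.Tensor, degrees: float, axis: str = "X") -> torch.Tensor:
    """Rotate a (B, N, 3) batch of point clouds about a coordinate axis
    using a PyTorch3D transform object."""
    rot = RotateAxisAngle(angle=degrees, axis=axis, degrees=True, device=str(points.device))
    return rot.transform_points(points)

# Example: rotate the test point clouds by 30 degrees about the X-axis
batch = torch.randn(4, 10000, 3)
print(rotate_points(batch, 30.0).shape)  # torch.Size([4, 10000, 3])
```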

Experiment 3

Classification model

python eval_cls.py --load_checkpoint best_model --visualize --rotate DEGREES --exp_name NAME

where DEGREES is how many degrees we want to rotate the input by. The visualizations will be stored in ./output/NAME.

Rotation (degrees)    Example Input    Accuracy
0                     (image)          96.85 %
15                    (image)          92.24 %
30                    (image)          77.54 %
45                    (image)          54.25 %
60                    (image)          32.32 %
75                    (image)          27.91 %
90                    (image)          27.38 %

The accuracy from Q1 is in the first row of the table above.

Experiment 4

Segmentation model

python eval_seg.py --load_checkpoint best_model --visualize --rotate DEGREES --exp_name NAME

where DEGREES is how many degrees we want to rotate the input by. The visualizations will be stored in ./output/NAME.

Rotation (degrees)    Example Ground Truth    Accuracy
0                     (image)                 89.92 %
15                    (image)                 83.40 %
30                    (image)                 70.92 %
45                    (image)                 49.41 %
60                    (image)                 34.50 %
75                    (image)                 30.20 %
90                    (image)                 26.39 %

The accuracy from Q2 is in the first row of the table above.

Thus, we can see that both the classification and segmentation models are robust to rotation of the input point clouds up to about 15 degrees, beyond which their accuracies decay rapidly.