Assignment 5

Paritosh Mittal (paritosm)

Late Days used : 5 days

1 Classification Model

Test accuracy of the best model :: 96.22%

Correct prediction visualizations

[Visualization: three rows of five correctly classified point clouds each, with ground truth and prediction per sample — Row 1: GT Chair / Pred Chair; Row 2: GT Vase / Pred Vase; Row 3: GT Lamp / Pred Lamp]

Incorrect prediction visualizations

[Visualization: two rows of four misclassified point clouds — Row 1: GT Chair / Pred Vase, GT Vase / Pred Chair, GT Vase / Pred Chair, GT Vase / Pred Lamp; Row 2: GT Lamp / Pred Vase for all four samples]

In the visualizations above, I notice that PointNet does a reasonable job of predicting object classes from point clouds. On visualizing the failure cases, it is evident that these cases are genuinely tricky. For example, the chair in the first row, first column of the incorrect predictions, which is classified as a vase, has geometry very similar to the vases in the dataset. Similarly, there is considerable variance within the Lamp class, and these unusual shapes confuse the model (as expected).

2 Segmentation Model

Test accuracy of the best model :: 90.10%

Good quality prediction visualizations

[Visualization: five good-quality chair segmentations, ground truth (top) vs. prediction (bottom)]
Per-sample accuracy: 99.96%, 99.28%, 99.36%, 99.55%, 99.68%

Bad quality prediction visualizations

[Visualization: five bad-quality chair segmentations, ground truth (top) vs. prediction (bottom)]
Per-sample accuracy: 51.55%, 50.87%, 50.97%, 53.84%, 54.06%

Here I visualize five good and five bad predictions for point cloud segmentation. Quantitatively, the model does a good job at segmentation. For the failure cases, it is evident from the visualizations that these chairs have structures quite different from the general notion of a chair. There are also no clear boundaries between the legs and armrests (columns II, III, IV, and V), which understandably confuses the model. I would also argue that the ground truth for the third example is itself wrong and that the prediction is actually closer to what I consider the correct segmentation. This ambiguity results in poor qualitative results.

3 Robustness Analysis

3.1 Analysis with respect to points

Here I consider the change in model performance with respect to the number of points in a point cloud. I add an argument --do3.1 to the eval scripts. When invoked, it computes the test accuracy for objects represented by {100, 500, 1000, 5000, 10000} points.
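
A minimal sketch of the point-count subsampling this evaluation relies on (the tensor shapes are illustrative, not the actual eval script):

```python
import torch

def subsample(points, num_points):
    """Randomly keep num_points points from each cloud in a (B, N, 3) batch."""
    idx = torch.randperm(points.shape[1])[:num_points]
    return points[:, idx, :]

# example: shrink a batch of 10000-point clouds to each evaluation size
clouds = torch.randn(4, 10000, 3)   # stand-in for a test batch
for n in [100, 500, 1000, 5000, 10000]:
    assert subsample(clouds, n).shape == (4, n, 3)
```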

For classification:

Num points | 100 | 500 | 1000 | 5000 | 10000 (Q1)
Accuracy | 93.28% | 95.69% | 96.33% | 96.33% | 96.22%

I notice that accuracy does not fall considerably. This could be because we use a global max pool, so we do not need every point to detect an object category. Correct predictions can be made as long as enough points from critical locations survive to disambiguate between classes. Hence, for classification the numbers align with expectations.
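
For reference, a minimal sketch of this global-max-pool design (simplified channel widths, not my exact training code):

```python
import torch
import torch.nn as nn

class PointNetCls(nn.Module):
    """Simplified PointNet classifier: a shared per-point MLP followed by a global max pool."""
    def __init__(self, num_classes=3):
        super().__init__()
        # shared MLP applied independently to every point (1x1 convolutions over the point axis)
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.head = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(),
                                  nn.Linear(256, num_classes))

    def forward(self, points):                          # points: (B, N, 3)
        feats = self.point_mlp(points.transpose(1, 2))  # per-point features: (B, 1024, N)
        global_feat = feats.max(dim=2).values           # max over points: only "critical" points matter
        return self.head(global_feat)                   # class logits: (B, num_classes)
```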

For Segmentation:

Num points | 100 | 500 | 1000 | 5000 | 10000 (Q2)
Accuracy | 78.44% | 86.81% | 88.87% | 90.03% | 90.10%

I notice that accuracy does fall considerably. This is mainly because effective segmentation is difficult with sparse points, as there is more ambiguity: isolated points have less evidence from their neighbors (and from global context) to support accurate predictions. I also notice that per-point segmentation accuracy improves as the number of sampled points increases. Hence, even for segmentation the numbers align with expectations.
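
A minimal sketch of how global context enters the per-point prediction (again with simplified widths; the actual segmentation model may differ, and the six part labels are assumed):

```python
import torch
import torch.nn as nn

class PointNetSeg(nn.Module):
    """Simplified PointNet segmentation head: each point is classified from its own
    local feature concatenated with the global max-pooled feature."""
    def __init__(self, num_seg_classes=6):   # assumed number of part labels
        super().__init__()
        self.local_mlp = nn.Sequential(       # per-point (local) features
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
        )
        self.global_mlp = nn.Sequential(      # lifts local features to the global descriptor
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.seg_head = nn.Sequential(        # per-point classifier over [local || global] features
            nn.Conv1d(128 + 1024, 256, 1), nn.ReLU(),
            nn.Conv1d(256, num_seg_classes, 1),
        )

    def forward(self, points):                                                # (B, N, 3)
        local = self.local_mlp(points.transpose(1, 2))                        # (B, 128, N)
        global_feat = self.global_mlp(local).max(dim=2, keepdim=True).values  # (B, 1024, 1)
        n = points.shape[1]
        fused = torch.cat([local, global_feat.expand(-1, -1, n)], dim=1)      # (B, 1152, N)
        return self.seg_head(fused)           # per-point logits: (B, num_seg_classes, N)
```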

3.2 Analysis with respect to rotations

Here I consider the change in model performance with respect to rotation of the point cloud about the Z axis. I add an argument --do3.2 to the eval scripts. When invoked, it computes the test accuracy for objects rotated by {0, 15, 30, 45, 60, 90} degrees.
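
The rotation itself is a standard z-axis rotation matrix applied to every point; a sketch of the idea (the axis and sign conventions in my actual code may differ):

```python
import math
import torch

def rotate_z(points, angle_deg):
    """Rotate a batch of point clouds (B, N, 3) about the z-axis by angle_deg degrees."""
    t = math.radians(angle_deg)
    rot = torch.tensor([[math.cos(t), -math.sin(t), 0.0],
                        [math.sin(t),  math.cos(t), 0.0],
                        [0.0,          0.0,         1.0]],
                       dtype=points.dtype, device=points.device)
    return points @ rot.T   # row-vector convention: p' = p R^T
```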

For classification:

Rotation angle (deg) | 0 (Q1) | 15 | 30 | 45 | 60 | 90
Accuracy | 96.22% | 90.76% | 73.24% | 50.26% | 31.58% | 20.56%

It is evident from the numbers that PointNet is not rotation invariant: there is a significant reduction in performance as the rotation angle increases, and the drop is smaller for small angles (as expected). I suspect data augmentation with random rotations would somewhat improve performance, as would estimating the rotation and mapping the points back into a canonical axis-aligned frame.
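
A sketch of the augmentation I have in mind (random z-rotations applied per cloud during training; not something I actually ran for the numbers above):

```python
import math
import torch

def random_z_rotation(points):
    """Apply an independent random z-rotation to each cloud in a (B, N, 3) batch."""
    b = points.shape[0]
    theta = torch.rand(b, device=points.device, dtype=points.dtype) * 2 * math.pi
    cos, sin = torch.cos(theta), torch.sin(theta)
    zero, one = torch.zeros_like(theta), torch.ones_like(theta)
    rot = torch.stack([
        torch.stack([cos, -sin, zero], dim=-1),
        torch.stack([sin,  cos, zero], dim=-1),
        torch.stack([zero, zero, one], dim=-1),
    ], dim=1)                                     # (B, 3, 3), one rotation matrix per cloud
    return torch.bmm(points, rot.transpose(1, 2)) # rotate every point in every cloud
```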

For Segmentation:

Rotation angle (deg) | 0 (Q2) | 15 | 30 | 45 | 60 | 90
Accuracy | 90.10% | 81.95% | 69.11% | 59.52% | 50.69% | 42.62%

It is evident from the numbers that PointNet is not rotation invariant for the segmentation task either; there is a significant reduction in performance as the rotation angle increases.

[Visualization: one chair shown at rotations of 0, 15, 30, 45, 60, and 90 degrees, with the ground truth and the model's prediction at each rotation]

To confirm that the rotations are applied correctly, I visualize one shape and its prediction as the rotation angle increases.

4 Bonus Question

Here, I implement the DGCNN model for point cloud classification and segmentation. Specifically, I use the PyTorch Geometric library (a library similar to PyTorch3D, maintained for graph neural networks) for this implementation. The model can be found in the dgcnn folder. Key changes include rewriting the dataloader with the PyTorch Geometric DataLoader, which stores the 3D point positions in objects of its Data class. This implementation takes a lot of GPU memory, so I reduce the model's complexity for faster training. The model is built from DynamicEdgeConv blocks (each containing MLP layers).
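
As a rough illustration, the classification variant looks something like the sketch below, assuming k = 20 nearest neighbours and reduced channel widths; the code in the dgcnn folder differs in detail:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import DynamicEdgeConv, global_max_pool

def edge_mlp(in_dim, out_dim):
    # DynamicEdgeConv feeds [x_i, x_j - x_i] (2 * in_dim features) per edge into this MLP
    return nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU(),
                         nn.Linear(out_dim, out_dim), nn.ReLU())

class DGCNNCls(nn.Module):
    """Reduced-capacity DGCNN classifier in the spirit of the bonus implementation."""
    def __init__(self, k=20, num_classes=3):
        super().__init__()
        self.conv1 = DynamicEdgeConv(edge_mlp(3, 64), k=k, aggr='max')
        self.conv2 = DynamicEdgeConv(edge_mlp(64, 128), k=k, aggr='max')
        self.head = nn.Sequential(nn.Linear(128, 128), nn.ReLU(),
                                  nn.Linear(128, num_classes))

    def forward(self, data):
        # data.pos: (total_points, 3); data.batch maps each point to its cloud
        x = self.conv1(data.pos, data.batch)   # kNN graph built on raw coordinates
        x = self.conv2(x, data.batch)          # kNN graph rebuilt on learned features
        x = global_max_pool(x, data.batch)     # one 128-d descriptor per cloud
        return self.head(x)                    # class logits
```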

For classification:

Model | PointNet | DGCNN
Accuracy | 96.22% | 97.59%

[Visualization: three easy examples]
GT | Chair | Vase | Lamp
PointNet | Chair | Vase | Lamp
DGCNN | Chair | Vase | Lamp

[Visualization: three difficult examples]
GT | Chair | Vase | Lamp
PointNet | Vase | Lamp | Vase
DGCNN | Chair | Lamp | Lamp

I observe that DGCNN gives a slight improvement in test accuracy. Qualitatively, on a few difficult cases (visualized above) where PointNet fails, DGCNN is able to predict the correct class.

For segmentation:

Model | PointNet | DGCNN
Accuracy | 90.10% | 89.49%

I observe that DGCNN slightly decreases segmentation accuracy. I believe this could be because I reduced the DGCNN model's complexity for faster processing.