16-889 Assignment 5: Point Cloud Classification and Segmentation

In this assignment, I implemented a PointNet-based architecture to classify and segment point clouds. I then analyzed the robustness of the models to rotations and to the number of input points. Lastly, I evaluated the effect of training a model that is more robust to rotations by including random rotations while training.


Two late days were used.

Classification Model

The first step was implementing the classification model. This model takes in a point cloud and classifies it into one of three classes: chair, vase, or lamp. The architecture is almost identical to the PointNet classification architecture, except that no input transform or feature transform was used.
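A minimal sketch of this architecture is below. The per-point MLP widths (64, 128, 1024) and head widths (512, 256) follow the original PointNet paper and are illustrative; my exact layer sizes and dropout rate may differ.

```python
import torch
import torch.nn as nn

class ClsModel(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        # Shared per-point MLP, implemented as 1x1 convolutions: (B, 3, N) -> (B, 1024, N)
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        # Classification head on the max-pooled global feature
        self.head = nn.Sequential(
            nn.Linear(1024, 512), nn.BatchNorm1d(512), nn.ReLU(),
            nn.Linear(512, 256), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(256, num_classes),
        )

    def forward(self, points):
        # points: (B, N, 3) -> (B, 3, N) for Conv1d
        x = self.point_mlp(points.transpose(1, 2))
        # Symmetric max pool over points gives permutation invariance
        x = x.max(dim=2).values          # (B, 1024)
        return self.head(x)              # (B, num_classes) logits
```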

I trained the model for 50 epochs with a batch size of 16 and a learning rate of 0.001 using an Adam optimizer. Below are the training losses and validation accuracies while training. Had I been given more time, I would have trained for more epochs, as neither the losses nor the accuracies had begun to plateau.
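The training loop itself is standard cross-entropy classification; a minimal sketch with the hyperparameters above is below, where `model` and `train_loader` are placeholder names for the network and data loader.

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(50):
    for points, labels in train_loader:
        logits = model(points)                   # (B, num_classes)
        loss = F.cross_entropy(logits, labels)   # labels: (B,)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```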


On the test set, the accuracy of my best model was 96.3%. Below are a few correctly predicted examples of each class.


Chair:


Vase:


Lamp:


Below are also a few failure cases for each class.


Chair (only one):


Vase:


Lamp:


It appears that the model infers the class of the object from the coarse structure rather than the finer details. This is evident given that tall chairs and vases are classified as lamps, and wider vases are classified as chairs. The model also appears to have the most difficulty distinguishing circular lamps from vases. This is because of the similar structure of the two, as they both have round, cylindrical components.

Segmentation Model

Next, I implemented the segmentation architecture, again without any input or feature transforms. This network segments each point into one of six classes.
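A minimal sketch of this architecture is below, assuming the standard PointNet segmentation design in which each point's local feature is concatenated with the tiled global feature before a per-point classification MLP; layer widths again follow the original paper and are illustrative.

```python
import torch
import torch.nn as nn

class SegModel(nn.Module):
    def __init__(self, num_seg_classes=6):
        super().__init__()
        # Per-point local features: (B, 3, N) -> (B, 64, N)
        self.local_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
        )
        # Deeper features that will be max-pooled into a global descriptor
        self.global_mlp = nn.Sequential(
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        # Per-point head over concatenated local (64) + global (1024) features
        self.head = nn.Sequential(
            nn.Conv1d(1088, 512, 1), nn.BatchNorm1d(512), nn.ReLU(),
            nn.Conv1d(512, 256, 1), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, num_seg_classes, 1),
        )

    def forward(self, points):
        x = points.transpose(1, 2)                    # (B, 3, N)
        local = self.local_mlp(x)                     # (B, 64, N)
        glob = self.global_mlp(local)                 # (B, 1024, N)
        glob = glob.max(dim=2, keepdim=True).values   # (B, 1024, 1)
        glob = glob.expand(-1, -1, local.shape[2])    # tile to (B, 1024, N)
        feat = torch.cat([local, glob], dim=1)        # (B, 1088, N)
        return self.head(feat).transpose(1, 2)        # (B, N, num_seg_classes)
```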


I trained the model for 250 epochs with a batch size of 16 and a learning rate of 0.001 using an Adam optimizer. Below are the training losses and validation accuracies while training. Again, I would have liked to train a little longer if I had more time, as the training loss was still decreasing.


On the test set, the accuracy of my best model was 90.4%. Below are a few of the best performing predictions.


Pred:


GT:


Below are also a few of the worst performing predictions.


Pred:


GT:


The most noticeable aspect is that the 3D position of each point, rather than how the points relate to one another, appears to determine the classification. In the failure cases, a large majority of the misclassified points are either at the top or bottom of the object, and they are all misclassified as the same label.

Robustness Analysis

First, I evaluated the classification and segmentation models using different numbers of points: 10000, 8000, 6000, 4000, 2000, 1000, 500, 250, 100, 50, and 10. For the classification model, no noticeable difference in performance was seen until the number of points dropped to around 250. For the segmentation model, there was no noticeable effect until around 1000 points.
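A minimal sketch of the subsampling used for this evaluation is below; `evaluate` and `test_points` are placeholder names for the evaluation routine and the full-resolution test clouds.

```python
import torch

def subsample(points, num_points):
    # points: (B, N, 3); randomly keep num_points per cloud, without replacement
    idx = torch.randperm(points.shape[1])[:num_points]
    return points[:, idx, :]

# Re-run evaluation at each point budget
for n in [10000, 8000, 6000, 4000, 2000, 1000, 500, 250, 100, 50, 10]:
    acc = evaluate(model, subsample(test_points, n))
```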


Next, I evaluated both models' robustness to rotation, using rotation angles of 0, 10, 45, 90, and 180 degrees. My expectation was that neither model would be invariant to rotation, because the absolute positions of points, which rotation directly changes, play a significant role in both classification and segmentation. My expectation proved to be correct. What was interesting was that a 180 degree rotation performed better than a 90 degree rotation for both networks. This can be explained by the coarse shape of a completely inverted object being more similar to the original than that of an object rotated 90 degrees.
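For reference, a minimal sketch of how such a rotation can be applied is below, assuming rotation about the vertical (y) axis; the actual axis used in the experiments is an assumption here.

```python
import math
import torch

def rotate_y(points, degrees):
    # points: (..., 3); rotate every point by `degrees` about the y axis
    t = math.radians(degrees)
    R = torch.tensor([[ math.cos(t), 0.0, math.sin(t)],
                      [ 0.0,         1.0, 0.0        ],
                      [-math.sin(t), 0.0, math.cos(t)]])
    # Points are row vectors, so apply the transpose of R on the right
    return points @ R.T
```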


Rotation Invariance

Lastly, I wanted to attempt to make the models rotation invariant. To do this, I trained new classification and segmentation models using the same procedures described above, except that I added a random rotation between 0 and 360 degrees to each batch. Below are the results for the new, more rotation-invariant models.
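The only change to the training loop is the per-batch augmentation; a minimal sketch is below, reusing the hypothetical `rotate_y` helper and the placeholder `model`, `train_loader`, and `optimizer` names from the earlier sketches.

```python
import random
import torch.nn.functional as F

for points, labels in train_loader:
    # Rotate the whole batch by a uniformly random angle before the forward pass
    points = rotate_y(points, random.uniform(0.0, 360.0))
    logits = model(points)
    loss = F.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```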


Red is the model trained with random rotations; blue is the model trained without.


As can be seen above, training with random rotations worked well for the classification network. While the accuracy at a 0 degree rotation was lower (91.7% compared to 96.3%), it stayed fairly constant across the full 180 degrees and outperformed the non-augmented model at all rotations except 0 degrees. However, this approach did not work as well for the segmentation model. The highest accuracy, at 0 degrees, was 81.6%, lower than the 90.4% from the model trained without rotations, and the accuracy still dropped a significant amount under 90 and 180 degree rotations. Still, there was some improvement, as the rotation-augmented model did outperform the original model at the larger rotations.