(1) The test accuracy of my best model: 0.9780
(2) Visualize a few random test point clouds and mention the predicted class for each (here all visualized samples are predicted correctly; a small plotting sketch follows the list):
Predicted class: chair
Predicted class: vase
Predicted class: lamp
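For reference, the visualizations above can be produced with a simple matplotlib scatter plot. This is only a minimal sketch, assuming each test sample is an (N, 3) NumPy array; the helper name `show_cloud` is mine, not from the starter code:

```python
import matplotlib.pyplot as plt

def show_cloud(points, title=""):
    """Scatter-plot an (N, 3) point cloud, e.g. with the predicted class as title."""
    ax = plt.figure().add_subplot(projection="3d")
    ax.scatter(points[:, 0], points[:, 1], points[:, 2], s=1)
    ax.set_title(title)
    plt.show()
```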
(2) Visualize at least 1 failure prediction for each class (chair, vase, and lamp), and provide an interpretation in a few sentences.
GT label: chair
Prediction: lamp
Interpretation: We can see this chair has only two legs, and it is difficult to recognize as a chair even for a human. It may resemble some lamp samples in the training set.
GT label: vase
Prediction: chair
Interpretation: We can see this vase has two large legs, so it looks like a chair. There are also no plants in the vase that could help the model recognize it, so the model fails on this sample.
GT label: lamp
Prediction: vase
Interpretation: We can see this lamp looks like a vase: its top part resembles plants and the bottom resembles a stand. These cues may lead the model to see it as a vase.
(1) The test accuracy of my best model: 0.9029
(2) Visualize segmentation results of at least 5 objects (left: ground truth; right: prediction). Here "prediction accuracy" is per-object point-wise accuracy (see the sketch after the examples).
Good examples:
a. prediction accuracy: 0.9850
We can see this chair is easy to predict since its shape is simple.
b. prediction accuracy: 0.9857
Similar to the one above, this chair's shape is clear and has no misleading parts.
c. prediction accuracy: 0.9905
Similar to the ones above, this chair's shape is clear with no misleading parts, so it is very easy to distinguish the different parts.
Bad examples:
a. prediction accuracy: 0.4817
This sample's parts have no obvious borders, and it is a sofa whose structure differs from common chairs: it has no 'legs'. These may be the reasons the model does not work well.
b. prediction accuracy: 0.4590
This sample's parts also have no obvious borders (e.g. the blue and yellow parts on the back), and it also has no 'legs', unlike common chairs. This is a fairly unusual chair, and the pillow further increases the difficulty. These may be the reasons the model does not work well.
c. prediction accuracy: 0.5462
This sample's parts also have no obvious borders, and it also lacks the 'legs' of common chairs. The prediction looks similar to that of the second sample and still seems reasonable; in fact this sample is very difficult to segment even for humans, since its structure is quite misleading. These may be the reasons the model does not work well.
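For clarity, the per-object "prediction accuracy" reported above is the fraction of points whose predicted part label matches the ground truth. A minimal sketch (the function name `seg_accuracy` is mine, not from the starter code):

```python
import torch

def seg_accuracy(pred_logits, gt_labels):
    """pred_logits: (N, num_parts) per-point scores; gt_labels: (N,) part indices."""
    return (pred_logits.argmax(dim=-1) == gt_labels).float().mean().item()
```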
I evaluate robustness to rotation by rotating each test point cloud by a fixed angle before feeding it to the model. The test accuracy is as below:
(1) task cls:
Original (no rotation): 0.9780
Rotate 30 degrees: 0.5037
Rotate 45 degrees: 0.5855
Rotate 60 degrees: 0.3578
(2) task seg:
Original (no rotation): 0.9029
Rotate 30 degrees: 0.4711
Rotate 60 degrees: 0.3426
Rotate 90 degrees: 0.2484
One failure example is shown below (left: ground truth; right: prediction):
We can see that rotation decreases accuracy greatly, which means the model is sensitive to rotation.
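For reproducibility, here is a minimal sketch of how the rotated inputs can be generated. I assume a rotation about a single axis (z) applied to every test cloud before evaluation; the helper name `rotate_z` is my own:

```python
import numpy as np

def rotate_z(points, degrees):
    """Rotate an (N, 3) point cloud about the z-axis by `degrees`."""
    theta = np.deg2rad(degrees)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T  # apply R to every point
```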
I also reduce the number of points per object (by changing `--num_points` when evaluating models in `eval_cls.py` and `eval_seg.py`). The test accuracy is as below:
(1) task cls:
Original (10000 points): 0.9780
1000 points: 0.9717
500 points: 0.9675
100 points: 0.9255
50 points: 0.8195
10 points: 0.2918
(2) task seg:
Original (10000 points): 0.9029
1000 points: 0.8911
500 points: 0.8761
100 points: 0.8099
50 points: 0.7753
10 points: 0.6789
We can see that reducing the number of points also decreases accuracy. A small reduction does not affect performance much, but removing a large number of points leads to poor performance.
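The reduced-point inputs can be obtained by random subsampling. A minimal sketch, assuming uniform sampling without replacement (`subsample` is my own helper name):

```python
import numpy as np

def subsample(points, num_points):
    """Randomly keep `num_points` of an (N, 3) cloud, matching --num_points."""
    idx = np.random.choice(points.shape[0], num_points, replace=False)
    return points[idx]
```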
I use the PointNet++ network (I select the multi-scale grouping (MSG) version, as in the paper) and borrow some functions from its PyTorch implementation repo.
Task cls
Test accuracy of Q1: 0.9780
Test accuracy of PointNet++: 0.9790
GT label: vase
Prediction: vase (in Q1, this sample was wrongly predicted as a chair)
Task seg
Test accuracy of Q2: 0.9029
Test accuracy of PointNet++: 0.9189
We can see that PointNet++ performs better since it exploits local neighborhood structure through hierarchical grouping, which vanilla PointNet lacks.
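To illustrate why MSG captures locality, here is a minimal sketch of multi-scale grouping, not the actual repo code: for each sampled centroid, neighborhoods are gathered inside balls of several radii and expressed in local coordinates (a shared MLP + max-pool per scale, omitted here, then turns each group into a feature, and the scales are concatenated). The function names and the radii/k values below are assumptions:

```python
import torch

def ball_query(xyz, centers, radius, k):
    """xyz: (B, N, 3) points; centers: (B, S, 3). Returns (B, S, k) neighbor indices."""
    dist = torch.cdist(centers, xyz)             # (B, S, N) pairwise distances
    idx = dist.argsort(dim=-1)[..., :k]          # k nearest candidates per centroid
    nearest = idx[..., :1].expand(-1, -1, k)     # fallback index: the closest point
    outside = torch.gather(dist, -1, idx) > radius
    return torch.where(outside, nearest, idx)    # pad out-of-ball slots with nearest

def msg_grouping(xyz, centers, radii=(0.1, 0.2, 0.4), ks=(16, 32, 64)):
    """Group each centroid's neighborhood at several scales, in local coordinates."""
    b = torch.arange(xyz.size(0), device=xyz.device)[:, None, None]
    groups = []
    for radius, k in zip(radii, ks):
        idx = ball_query(xyz, centers, radius, k)        # (B, S, k)
        neighbors = xyz[b, idx]                          # (B, S, k, 3)
        groups.append(neighbors - centers.unsqueeze(2))  # translate to local frame
    return groups
```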