Test accuracy of the best model :: 96.22%
Correct prediction visualizations
![]() | ![]() | ![]() | ![]() | ![]() |
GT: Chair Pred: Chair | GT: Chair Pred: Chair | GT: Chair Pred: Chair | GT: Chair Pred: Chair | GT: Chair Pred: Chair |
![]() | ![]() | ![]() | ![]() | ![]() |
GT: Vase Pred: Vase | GT: Vase Pred: Vase | GT: Vase Pred: Vase | GT: Vase Pred: Vase | GT: Vase Pred: Vase |
![]() | ![]() | ![]() | ![]() | ![]() |
GT: Lamp Pred: Lamp | GT: Lamp Pred: Lamp | GT: Lamp Pred: Lamp | GT: Lamp Pred: Lamp | GT: Lamp Pred: Lamp |
Incorrect prediction visualizations
![]() | ![]() | ![]() | ![]() |
GT: Chair Pred: Vase | GT: Vase Pred: Chair | GT: Vase Pred: Chair | GT: Vase Pred: Lamp |
![]() | ![]() | ![]() | ![]() |
GT: Lamp Pred: Vase | GT: Lamp Pred: Vase | GT: Lamp Pred: Vase | GT: Lamp Pred: Vase |
In the visualizations above, PointNet does a reasonable job of predicting object classes from point clouds. Visualizing the failure cases makes it evident that they are genuinely tricky. For example, the chair in the first row, first column that is predicted as a vase has geometry very similar to the vases in the dataset. Similarly, there is considerable variance within the Lamp class, and these unusual shapes confuse the model (as expected).
Test accuracy of the best model :: 90.10%
Good quality prediction visualizations
Ground Truth | ![]() | ![]() | ![]() | ![]() | ![]() |
Predictions | ![]() | ![]() | ![]() | ![]() | ![]() |
Accuracy | 99.96% | 99.28% | 99.36% | 99.55% | 99.68% |
Bad quality prediction visualizations
Ground Truth | ![]() | ![]() | ![]() | ![]() | ![]() |
Predictions | ![]() | ![]() | ![]() | ![]() | ![]() |
Accuracy | 51.55% | 50.87% | 50.97% | 53.84% | 54.06% |
Here I visualize five good and five bad predictions for point cloud segmentation. Quantitatively, the model performs segmentation well. For the failure cases, it is evident from the visualizations that these chairs have quite different structure from the typical notion of a chair. There are also no clear boundaries between legs and armrests (Columns II, III, IV, and V), which understandably confuses the model. I would argue that the ground truth in Column III is itself questionable, and that the prediction is actually closer to my own notion of a correct segmentation. This ambiguity results in poor qualitative results.
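The per-shape numbers reported above are per-point accuracies. A minimal sketch of that metric (assuming predicted and ground-truth part labels are integer arrays of the same length):

```python
import numpy as np

def per_point_accuracy(pred_labels, gt_labels):
    """Fraction of points whose predicted part label matches ground truth."""
    pred_labels = np.asarray(pred_labels)
    gt_labels = np.asarray(gt_labels)
    return float(np.mean(pred_labels == gt_labels))

# Toy example: an 8-point cloud with 6 correctly labelled points.
gt = np.array([0, 0, 1, 1, 2, 2, 2, 2])
pred = np.array([0, 0, 1, 2, 2, 2, 1, 2])
print(per_point_accuracy(pred, gt))  # 0.75
```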
Here I consider how model performance changes with the number of points in a point cloud. I specify an argument --do3.1 in the eval scripts. Once invoked, I compute the test accuracy for objects represented by {100, 500, 1000, 5000, 10000} points.
For classification:
num points | 100 | 500 | 1000 | 5000 | 10000 (Q1) |
Accuracy | 93.28% | 95.69% | 96.33% | 96.33% | 96.22% |
I notice that accuracy does not fall considerably. This could be because we use a global max pool, so we do not need all points to detect an object category. Correct predictions can be made as long as we have enough evidence from critical locations that helps disambiguate between classes. Hence, for classification the numbers align with expectations.
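As a toy illustration of this robustness (random features standing in for PointNet's per-point features, not the trained model), the max-pooled global descriptor changes very little when pooled over a random subset of the points:

```python
import numpy as np

def pool_gap(features, n, rng):
    """Max abs difference between the full global-max-pool descriptor and
    the descriptor pooled from a random n-point subset of the cloud."""
    full_pool = features.max(axis=0)                       # pool over all points
    idx = rng.choice(features.shape[0], size=n, replace=False)
    sub_pool = features[idx].max(axis=0)                   # pool over the subset
    return float(np.abs(full_pool - sub_pool).max())

rng = np.random.default_rng(0)
features = rng.random((10000, 64))  # stand-in (N points x D feature dims)
for n in [100, 500, 1000, 5000]:
    print(n, pool_gap(features, n, rng))  # gap shrinks as n grows
```

Even when the subset misses the exact argmax point of a feature dimension, some nearby point usually attains almost the same maximum, so the pooled descriptor barely moves.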
For Segmentation:
num points | 100 | 500 | 1000 | 5000 | 10000 (Q2) |
Accuracy | 78.44% | 86.81% | 88.87% | 90.03% | 90.10% |
I notice that accuracy does fall considerably. This is mainly because with sparse points, effective segmentation is difficult as there is more ambiguity: isolated points have less evidence from neighbors (global context) to enforce accurate predictions. I also notice that per-point segmentation accuracy improves as we increase the number of samples. Hence, even for segmentation the numbers align with expectations.
Here I consider how model performance changes with rotation of the point cloud about the Z axis. I specify an argument --do3.2 in the eval scripts. Once invoked, I compute the test accuracy for objects rotated by {0, 15, 30, 45, 60, 90} degrees.
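The rotation applied here can be sketched as follows (a standalone numpy version for illustration; the actual eval code operates on torch tensors):

```python
import numpy as np

def rotate_z(points, angle_deg):
    """Rotate an (N, 3) point cloud about the Z axis by angle_deg degrees."""
    t = np.deg2rad(angle_deg)
    c, s = np.cos(t), np.sin(t)
    R = np.array([[c,  -s,  0.0],
                  [s,   c,  0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T

pts = np.array([[1.0, 0.0, 0.5]])
print(rotate_z(pts, 90))  # ~[[0, 1, 0.5]]: x maps to y, z is unchanged
```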
For classification:
Rotation Angle | 0 (Q1) | 15 | 30 | 45 | 60 | 90 |
Accuracy | 96.22% | 90.76% | 73.24% | 50.26% | 31.58% | 20.56% |
It is evident from the numbers that PointNet is not rotation invariant: performance drops significantly as rotation increases. For small angles the drop is smaller (as expected). I suspect data augmentation could somewhat improve performance, as could estimating the rotation to map points into a canonical, axis-aligned frame.
For Segmentation:
Rotation Angle | 0 (Q2) | 15 | 30 | 45 | 60 | 90 |
Accuracy | 90.10% | 81.95% | 69.11% | 59.52% | 50.69% | 42.62% |
It is evident from the numbers that PointNet is not rotation invariant for the segmentation task either; there is a significant reduction in performance as rotation increases.
Ground Truth | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
Predictions | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
Rotations | 0 | 15 | 30 | 45 | 60 | 90 |
To verify that the rotations are applied correctly, I visualize, for one shape, the rotated object and its prediction as the rotation increases.
Here, I implement the DGCNN model for point cloud classification and segmentation. Specifically, I use the PyTorch Geometric library (a library similar to PyTorch3D, maintained for graph neural networks) for this implementation. The model can be found in the dgcnn folder. Key changes include rewriting the dataloader with the PyTorch Geometric DataLoader, which stores the 3D point positions in a Data object. This particular implementation takes a lot of GPU memory, so I reduce the model's complexity for faster training. The model is built from DynamicEdgeConv blocks (each containing MLP layers).
For classification:
Model | PointNet | DGCNN |
Accuracy | 96.22% | 97.59% |
Visualization - Easy | ![]() | ![]() | ![]() |
GT | Chair | Vase | Lamp |
PointNet | Chair | Vase | Lamp |
DGCNN | Chair | Vase | Lamp |
Visualization - Difficult | ![]() | ![]() | ![]() |
GT | Chair | Vase | Lamp |
PointNet | Vase | Lamp | Vase |
DGCNN | Chair | Lamp | Lamp |
I observe that DGCNN gives a slight improvement in test accuracy. Qualitatively, in a few difficult cases (visualized above) where PointNet fails, DGCNN is able to correctly predict the class.
For segmentation:
Model | PointNet | DGCNN |
Accuracy | 90.10% | 89.49% |
I observe that DGCNN slightly decreases segmentation accuracy. I believe this may be because I reduced the DGCNN's model complexity for faster processing.