# Assignment 5
Anirudh Chakravarthy (achakrav)
## Question 1
Usage:
```
python train.py --task cls
python eval_cls.py --load_checkpoint best_model
```
Test accuracy: 97.16%
Class | Correct | Wrong |
---|---|---|
Chair | ![]() | ![]() |
Vase | ![]() | ![]() |
Lamp | ![]() | ![]() |
- The chair gets misclassified since it doesn't really have a sittable surface; even a human would find it hard to conclusively call it a chair. It is likely classified as a vase due to the cavity between the two planes.
- The vase has a general structure similar to a chair, with a flat sittable plane and an upright plane for back support, which explains why it gets misclassified.
- The lamp is misclassified because it doesn't really look like a lamp. Its bulging structure resembles that of a vase, and hence it is classified as one.
## Question 2
Usage:
```
python train.py --task seg
python eval_seg.py --load_checkpoint best_model
```
Test accuracy: 89.86%
Description | Pred | GT | Acc |
---|---|---|---|
Good | ![]() | ![]() | 0.9949 |
Good | ![]() | ![]() | 0.9845 |
Good | ![]() | ![]() | 0.9823 |
Bad | ![]() | ![]() | 0.4435 |
Bad | ![]() | ![]() | 0.4714 |
Bad | ![]() | ![]() | 0.4991 |
The segmentation network performs well on examples with a few prominent classes. In the first 3 examples, there are only a few substructures (3-4) within each point cloud, and the network segments these large regions well. However, when multiple substructures exist, i.e., the object becomes more cluttered, the network does not perform well, perhaps due to confusion at the boundaries between those regions.
## Question 3
### Experiment 1
Usage:
```
python eval_cls.py --load_checkpoint best_model --noise
python eval_seg.py --load_checkpoint best_model --noise
```
I added uniformly sampled noise in the range [0, alpha] to each point in the point cloud. For high values of alpha this corrupts the point cloud completely, while low values give us an idea of the network's robustness to small random shifts in the points.
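A minimal sketch of this perturbation (assuming point clouds are stored as (N, 3) PyTorch tensors; the function name is mine, not from the evaluation scripts):

```python
import torch

def add_uniform_noise(points: torch.Tensor, alpha: float) -> torch.Tensor:
    """Shift each coordinate by noise drawn uniformly from [0, alpha].

    points: (N, 3) point cloud (also works batched, e.g. (B, N, 3)).
    Returns a perturbed copy; the input tensor is left unchanged.
    """
    noise = alpha * torch.rand_like(points)  # uniform in [0, alpha)
    return points + noise
```

At alpha = 0 the point cloud is returned unchanged, which recovers the baseline accuracy in the first row of each table below.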
Cls:
Alpha | Accuracy |
---|---|
0 | 97.16% |
0.1 | 94.54% |
0.2 | 91.71% |
0.5 | 58.38% |
1 | 22.24% |
Seg:
Alpha | Accuracy |
---|---|
0 | 89.86% |
0.1 | 86.61% |
0.2 | 79.61% |
0.5 | 48.71% |
1 | 21.31% |
### Experiment 2
Usage:
```
python eval_cls.py --load_checkpoint best_model --dropout
python eval_seg.py --load_checkpoint best_model --dropout
```
This experiment was inspired by pixel attribution techniques and the following blog: https://christophm.github.io/interpretable-ml-book/pixel-attribution.html. Specifically, each point has a saliency associated with it, given by the norm of the output gradient with respect to that point. If a point has a high gradient norm, perturbing it would lead to a large change in the output, which in turn signifies that the point is crucial to the prediction.
For classification, I computed the gradient of the classification prediction with respect to each of the points. Among the points with non-zero gradient norm, I discarded the top k% by saliency. Intuitively, discarding more and more of the important points should degrade performance.
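The procedure above can be sketched as follows (a simplified version under my own assumptions: `model` is a placeholder classifier that takes a batched (1, N, 3) point cloud and returns per-class logits, and saliency is taken as the gradient of the predicted-class score):

```python
import torch

def drop_salient_points(model, points: torch.Tensor, k: float) -> torch.Tensor:
    """Remove the top-k fraction of points ranked by gradient-norm saliency.

    points: (N, 3); saliency of a point = L2 norm of d(max logit)/d(point).
    """
    pts = points.clone().detach().requires_grad_(True)
    logits = model(pts.unsqueeze(0))          # (1, num_classes)
    logits.max().backward()                   # grad of predicted-class score
    saliency = pts.grad.norm(dim=-1)          # (N,) per-point gradient norm
    num_drop = int(k * (saliency > 0).sum().item())  # top-k% of salient points
    if num_drop == 0:
        return points
    drop_idx = saliency.argsort(descending=True)[:num_drop]
    keep = torch.ones(points.shape[0], dtype=torch.bool)
    keep[drop_idx] = False
    return points[keep]
```

With k = 0 no points are removed, matching the baseline row of the table below.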
k | Accuracy |
---|---|
0% | 97.16% |
10% | 97.06% |
20% | 96.95% |
30% | 97.06% |
50% | 97.06% |
70% | 97.16% |
80% | 97.16% |
Even on discarding a large portion of the most crucial points, the network retains a respectable accuracy. The network therefore seems robust to this perturbation.
For segmentation, I followed a similar process, computing the gradients of the per-point predictions with respect to each point. Intuitively, the points with high gradient norm are those where small changes to the input would alter the segmentation output, i.e., the gradient norm tells us how brittle the segmentation output is with respect to a given point.
k | Accuracy |
---|---|
0% | 89.86% |
10% | 89.83% |
20% | 89.82% |
30% | 89.92% |
50% | 90.31% |
70% | 90.18% |
80% | 90.05% |
Removing these points leads to a slight increase in accuracy. Since the increase is very slight, the network's predictions on these points must also usually be correct, and the network is therefore fairly robust.