Assignment 5 of Learning for 3D Vision (16-889)

Q1. Classification Model (40 points)

python train.py --task cls

Test accuracy: 0.9507

Correct predictions: Chair

Correct predictions: Vase

Correct predictions: Lamp

Incorrect predictions: prediction : Chair ; GT label : Vase

Incorrect predictions: prediction : Vase ; GT label : Lamp

Incorrect predictions: prediction : Lamp ; GT label : Vase

Interpretation of predictions: Visualizing and analysing the incorrect predictions gives some clues about why the model misclassifies a sample. In the examples above, the vases and lamps that are misclassified share certain geometric similarities with the predicted class. If a sample contains local shapes that are common to other classes, the model may predict the wrong class. Note: this model makes no incorrect predictions for the GT class Chair, meaning every chair sample is classified as a chair.
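One way to summarise which classes get confused with which is a confusion matrix. The following is a minimal sketch in plain Python; the helper name `confusion_matrix` and the toy labels are hypothetical and are not taken from the assignment code or its results.

```python
from collections import Counter

CLASSES = ["Chair", "Vase", "Lamp"]

def confusion_matrix(gt_labels, pred_labels, classes=CLASSES):
    """Return counts[gt][pred] for every pair of class names."""
    pairs = Counter(zip(gt_labels, pred_labels))
    return {gt: {pred: pairs[(gt, pred)] for pred in classes} for gt in classes}

# Toy labels for illustration only (not the assignment's test set).
gt = ["Chair", "Chair", "Vase", "Vase", "Lamp", "Lamp"]
pred = ["Chair", "Chair", "Chair", "Lamp", "Vase", "Lamp"]

cm = confusion_matrix(gt, pred)
for name in CLASSES:
    print(name, cm[name])
```

A GT row whose off-diagonal entries are all zero corresponds to a class that is never misclassified.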

Q2. Segmentation Model (40 points)

python eval_seg.py

Test accuracy: 0.8890

Segmentation accuracy   |    GT segmentation    |      Predicted segmentation

segmentation accuracy: 96.6

segmentation accuracy: 97.9

segmentation accuracy: 98.4

Bad segmentation accuracy: 44.6

Bad segmentation accuracy: 49.8

Interpretation of predictions: Visualizing and analysing the low-accuracy predictions gives some clues about which samples the model handles poorly. In the examples above, the model performs worse on chairs that have more than 3 GT parts and whose shape departs from the mean chair shape.
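Per-sample accuracies like the ones listed above can be computed by comparing predicted and GT part labels point by point, then sorting the samples to find the worst ones. This is a minimal sketch under that assumption; `seg_accuracy`, `rank_samples`, and the toy labels are hypothetical names and values, not the assignment's evaluation code.

```python
def seg_accuracy(pred_parts, gt_parts):
    """Fraction of points whose predicted part label matches the ground truth."""
    if len(pred_parts) != len(gt_parts):
        raise ValueError("prediction and ground truth must have the same length")
    correct = sum(p == g for p, g in zip(pred_parts, gt_parts))
    return correct / len(gt_parts)

def rank_samples(all_preds, all_gts):
    """Return (sample_index, accuracy) pairs sorted from worst to best."""
    scores = [(i, seg_accuracy(p, g))
              for i, (p, g) in enumerate(zip(all_preds, all_gts))]
    return sorted(scores, key=lambda item: item[1])

# Toy data: two samples with four points each (illustrative only).
all_gts = [[0, 0, 1, 1], [0, 1, 2, 3]]
all_preds = [[0, 0, 1, 1], [0, 1, 1, 1]]

ranking = rank_samples(all_preds, all_gts)
print(ranking)
```

The first entries of `ranking` are the candidates worth visualizing as failure cases.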

Q3. Robustness Analysis (20 points)

Varying the number of input points:

Num Points | Cls Accuracy (%) | Seg Accuracy (%)
10000      | 95.07            | 88.90
5000       | 94.54            | 88.86
1000       | 94.53            | 88.21
100        | 92.86            | 81.54
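A simple way to run this kind of experiment is to draw a random subset of point indices and apply the same indices to the points and to their part labels. The sketch below assumes uniform sampling without replacement, which the report does not confirm; `subsample_indices` and the toy cloud are hypothetical.

```python
import random

def subsample_indices(total_points, num_points, seed=0):
    """Pick num_points distinct indices out of total_points, without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(total_points), num_points)

# Toy cloud: 10000 points with one part label each (illustrative values only).
cloud = [(float(i), 0.0, 0.0) for i in range(10000)]
labels = [i % 4 for i in range(10000)]

idx = subsample_indices(len(cloud), 100)
sub_cloud = [cloud[i] for i in idx]
sub_labels = [labels[i] for i in idx]  # same indices keep points and labels aligned
print(len(sub_cloud), len(sub_labels))
```

Reusing one index list for both arrays matters for segmentation, where each retained point must keep its own GT label.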

Varying the angle of rotation of the input points about the Y axis:

Rotation about Y axis (degrees) | Cls Accuracy (%) | Seg Accuracy (%)
0                               | 95.07            | 88.90
45                              | 75.97            | 71.11
90                              | 67.78            | 51.96
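The rotation applied to each input point can be written as a standard rotation matrix about the Y axis. The sketch below uses the usual right-handed convention; the sign convention and the helper name `rotate_y` are assumptions, since the report does not state how the rotation was implemented.

```python
import math

def rotate_y(points, angle_deg):
    """Rotate (x, y, z) points about the Y axis by angle_deg degrees."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    # R_y = [[ c, 0, s],
    #        [ 0, 1, 0],
    #        [-s, 0, c]]
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

# Toy points for illustration only.
cloud = [(1.0, 2.0, 0.0), (0.0, -1.0, 1.0)]
rotated = rotate_y(cloud, 90)
print(rotated)
```

The rotation leaves the y coordinate and each point's distance from the origin unchanged, so only the orientation of the shape seen by the model changes.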

Q4. Bonus Question - Locality (20 points)

With the PointNet++ model, we get the following performance:

python train.py --task cls --use_plus

The PointNet++ classification model uses 2 layers of subsampling and grouping. The 1st layer samples 512 points and groups the 32 nearest neighbour points within a grouping radius of 0.1. The 2nd layer samples 128 points and groups the 32 nearest neighbour points within a grouping radius of 0.4.
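One subsampling-and-grouping layer can be sketched as farthest point sampling followed by a radius-limited neighbour query. This is a plain-Python illustration under the assumption that the layer follows the usual PointNet++ recipe, which the report does not spell out. The helper names are hypothetical, and the demo uses smaller sizes than the 512 and 128 centroids above so it runs quickly.

```python
import math
import random

def farthest_point_sampling(points, num_samples, start=0):
    """Greedily pick indices so each new point is farthest from those already chosen."""
    chosen = [start]
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < num_samples:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen

def ball_query(points, center, radius, max_neighbors):
    """Indices of up to max_neighbors points within radius of center, nearest first."""
    near = [(math.dist(p, center), i) for i, p in enumerate(points)]
    near = sorted(d_i for d_i in near if d_i[0] <= radius)
    return [i for _, i in near[:max_neighbors]]

# Toy cloud in the unit cube (illustrative only).
rng = random.Random(0)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(2000)]

centroids = farthest_point_sampling(cloud, 64)
groups = [ball_query(cloud, cloud[c], radius=0.1, max_neighbors=32) for c in centroids]
print(len(centroids), max(len(g) for g in groups))
```

Each group would then be passed through a small shared network and pooled into one feature per centroid, which is what gives the model access to local structure.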

Classification accuracy : 95.38

python train.py --task seg --use_plus

The PointNet++ segmentation model has a similar architecture to the one above, with skip connections and interpolation in the final segmentation branch.
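The interpolation step carries features from the sampled points back to the full-resolution points. A common choice is inverse-distance weighting over the k nearest sampled points; the sketch below assumes k = 3 and that weighting scheme, neither of which is confirmed by the report. The helper name `interpolate_features` is hypothetical.

```python
import math

def interpolate_features(dense_xyz, sparse_xyz, sparse_feats, k=3, eps=1e-8):
    """Inverse-distance-weighted average of the k nearest sparse features per dense point."""
    dim = len(sparse_feats[0])
    out = []
    for q in dense_xyz:
        nearest = sorted((math.dist(q, p), j) for j, p in enumerate(sparse_xyz))[:k]
        weights = [1.0 / (d + eps) for d, _ in nearest]
        total = sum(weights)
        out.append([
            sum(w * sparse_feats[j][c] for w, (_, j) in zip(weights, nearest)) / total
            for c in range(dim)
        ])
    return out

# Toy data: three sampled points with one feature channel each (illustrative only).
sparse_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
sparse_feats = [[1.0], [2.0], [3.0]]
dense_xyz = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0)]

dense_feats = interpolate_features(dense_xyz, sparse_xyz, sparse_feats)
print(dense_feats)
```

In a segmentation branch, the interpolated features would typically be concatenated with the skip-connection features of the same points before the per-point classifier.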

Segmentation accuracy : 80.79

Number of late days: 3