16889 Assignment 5: Point Cloud Classification and Segmentation

Presented: Dijing Zhang

Q1. Classification Model

Accuracy:

Train: 0.9562 (after 5 epochs)

Test: 0.9430

Correct Predictions

p1_0__gt_0.gif

p1_617__gt_1.gif

p1_719__gt_2.gif

False Predictions

GT: Chair

Pred: Lamp

p1_543_pd_2_gt_0.gif

GT: Vase

Pred: Chair

p1_622_pd_0_gt_1.gif

GT: Lamp

Pred: Vase

p1_726_pd_1_gt_2.gif

As we can see, the prediction accuracy is quite high. Failures occur mainly when an object has a shape similar to another class, or is atypical for its own class. For example, the first chair has a very tall back, so it is understandable that it was classified as a lamp.

Q2. Segmentation Model

Test Accuracy:

0.8493 (after 4 epochs)

Visualization

Good Prediction #1

Accuracy: 0.901

Ground Truth

gt_exp.gif

Prediction

pred_exp.gif

Good Prediction #2

Accuracy: 0.9629

Ground Truth

gt_exp_111_0.9629.gif

Prediction

pred_exp_111_0.9629.gif

Good Prediction #3

Accuracy: 0.958

Ground Truth

gt_exp_126_0.958.gif

Prediction

pred_exp_126_0.958.gif

Bad Prediction #1

Accuracy: 0.2911

Ground Truth

BAD_gt_exp_163_0.2911.gif

Prediction

BAD_pred_exp_163_0.2911.gif

Bad Prediction #2

Accuracy: 0.5256

Ground Truth

BAD_gt_exp_96_0.5256.gif

Prediction

BAD_pred_exp_96_0.5256.gif

As we can see, the prediction accuracy is high. However, in a few cases, such as bad predictions #1 and #2, objects that are atypical for their class have parts that get misclassified. For example, the sofa has an uncommon cushion that leans against the armrest, so parts of the cushion are labeled as an armrest.

Q3. Robustness Analysis

Rotation analysis

The rotation is applied through a 4x4 homogeneous transformation matrix:

|R t|
|0 1|
where R = |cos(θ)  -sin(θ)  0|   t = |0|
          |sin(θ)  cos(θ)   0|       |0| 
          |  0       0      1|       |0|

Then convert each point to homogeneous coordinates by appending a 1 (making |x y z 1|), and apply matrix multiplication between the homogeneous points and the transformation matrix.
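The steps above can be sketched as follows (a minimal NumPy version; the function name `rotate_z` is illustrative, not from the assignment code):

```python
import numpy as np

def rotate_z(points, degree):
    """Rotate an (N, 3) point cloud about the z-axis by `degree` degrees
    using a 4x4 homogeneous transformation matrix [R t; 0 1]."""
    theta = np.deg2rad(degree)
    c, s = np.cos(theta), np.sin(theta)
    T = np.array([
        [c, -s, 0.0, 0.0],
        [s,  c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])
    # Append 1 to each point: (x, y, z) -> (x, y, z, 1)
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    # Apply the transform, then drop the homogeneous coordinate
    return (homo @ T.T)[:, :3]

pts = np.array([[1.0, 0.0, 0.0]])
print(rotate_z(pts, 90))  # ~[[0, 1, 0]]
```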

Classification Accuracy:

  • degree=0: 0.9569
  • degree=30: 0.8206
  • degree=60: 0.3599
  • degree=90: 0.2644
  • degree=120: 0.2592
  • degree=150: 0.2550
  • degree=180: 0.5393

Segmentation Accuracy:

  • degree=0: 0.8493
  • degree=30: 0.7618
  • degree=60: 0.5963
  • degree=90: 0.4424
  • degree=120: 0.3239
  • degree=150: 0.3309
  • degree=180: 0.3383

Number of points analysis

Classification Accuracy:

  • num_points=500: 0.9549
  • num_points=1000: 0.9612
  • num_points=5000: 0.9570
  • num_points=7000: 0.9580
  • num_points=10000: 0.9570

Segmentation Accuracy:

  • num_points=500: 0.8585
  • num_points=1000: 0.8553
  • num_points=5000: 0.8512
  • num_points=7000: 0.8499
  • num_points=10000: 0.8493
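The varying point counts above can be produced by randomly subsampling each cloud before evaluation. A minimal sketch (the helper `subsample` is illustrative, not the assignment's actual data-loading code):

```python
import numpy as np

def subsample(points, num_points, seed=0):
    """Randomly select `num_points` from an (N, 3) point cloud,
    without replacement when enough points are available."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(points.shape[0], size=num_points,
                     replace=points.shape[0] < num_points)
    return points[idx]

cloud = np.random.rand(10000, 3)
print(subsample(cloud, 500).shape)  # (500, 3)
```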

Locality

PointNet++

image.png

Classification

Model Architecture

PointNet2ClsMsg(
  (sa1): SetAbstractionMsg(
    (conv_blocks): ModuleList(
      (0): ModuleList(
        (0): Conv2d(3, 32, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1))
      )
      (1): ModuleList(
        (0): Conv2d(3, 64, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1))
      )
      (2): ModuleList(
        (0): Conv2d(3, 64, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(64, 96, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1))
      )
    )
    (bn_blocks): ModuleList(
      (0): ModuleList(
        (0): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ModuleList(
        (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): ModuleList(
        (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (sa2): SetAbstractionMsg(
    (conv_blocks): ModuleList(
      (0): ModuleList(
        (0): Conv2d(323, 64, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1))
      )
      (1): ModuleList(
        (0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
      )
      (2): ModuleList(
        (0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
      )
    )
    (bn_blocks): ModuleList(
      (0): ModuleList(
        (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ModuleList(
        (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): ModuleList(
        (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (sa3): SetAbstraction(
    (mlp_convs): ModuleList(
      (0): Conv2d(643, 256, kernel_size=(1, 1), stride=(1, 1))
      (1): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1))
      (2): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1))
    )
    (mlp_bns): ModuleList(
      (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (fc1): Linear(in_features=1024, out_features=512, bias=True)
  (bn1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (drop1): Dropout(p=0.4, inplace=False)
  (fc2): Linear(in_features=512, out_features=256, bias=True)
  (bn2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (drop2): Dropout(p=0.5, inplace=False)
  (fc3): Linear(in_features=256, out_features=3, bias=True)
)

PointNet Accuracy: 0.9430

PointNet++ Accuracy: 0.9653

GT: Chair

Pred: Chair

p1_543_pd_2_gt_0.gif

The "lamp-like" chair is now correctly classified with PointNet++, and overall accuracy improves by about 2% compared with PointNet.

Segmentation

Model Architecture

PointNet2Seg(
  (sa1): SetAbstractionMsg(
    (conv_blocks): ModuleList(
      (0): ModuleList(
        (0): Conv2d(6, 32, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1))
      )
      (1): ModuleList(
        (0): Conv2d(6, 64, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1))
      )
      (2): ModuleList(
        (0): Conv2d(6, 64, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(64, 96, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1))
      )
    )
    (bn_blocks): ModuleList(
      (0): ModuleList(
        (0): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ModuleList(
        (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): ModuleList(
        (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (sa2): SetAbstractionMsg(
    (conv_blocks): ModuleList(
      (0): ModuleList(
        (0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
      )
      (1): ModuleList(
        (0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(128, 196, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(196, 256, kernel_size=(1, 1), stride=(1, 1))
      )
    )
    (bn_blocks): ModuleList(
      (0): ModuleList(
        (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): ModuleList(
        (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): BatchNorm2d(196, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (sa3): SetAbstraction(
    (mlp_convs): ModuleList(
      (0): Conv2d(515, 256, kernel_size=(1, 1), stride=(1, 1))
      (1): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1))
      (2): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1))
    )
    (mlp_bns): ModuleList(
      (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (fp3): FeaturePropagation(
    (mlp_convs): ModuleList(
      (0): Conv1d(1536, 256, kernel_size=(1,), stride=(1,))
      (1): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
    )
    (mlp_bns): ModuleList(
      (0): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (fp2): FeaturePropagation(
    (mlp_convs): ModuleList(
      (0): Conv1d(576, 256, kernel_size=(1,), stride=(1,))
      (1): Conv1d(256, 128, kernel_size=(1,), stride=(1,))
    )
    (mlp_bns): ModuleList(
      (0): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (fp1): FeaturePropagation(
    (mlp_convs): ModuleList(
      (0): Conv1d(134, 128, kernel_size=(1,), stride=(1,))
      (1): Conv1d(128, 128, kernel_size=(1,), stride=(1,))
    )
    (mlp_bns): ModuleList(
      (0): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (conv1): Conv1d(128, 128, kernel_size=(1,), stride=(1,))
  (bn1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (drop1): Dropout(p=0.5, inplace=False)
  (conv2): Conv1d(128, 6, kernel_size=(1,), stride=(1,))
)

PointNet Accuracy: 0.8493

PointNet++ Accuracy: 0.8913

Good Prediction

PointNet Accuracy: 0.958

PointNet++ Accuracy: 0.9912

Ground Truth

gt_exp.gif

Prediction

pred_exp.gif

Bad Prediction

PointNet Accuracy: 0.2911

PointNet++ Accuracy: 0.4187

Ground Truth

gt_163_0.4187.gif

Prediction

pred_163_0.4187.gif

As the quantitative results and visualizations show, there is a large improvement (roughly 4% absolute) on the segmentation task, since PointNet++ takes locality into consideration. It performs better on the good predictions, and even the previous bad prediction improves.