Presented: Dijing Zhang
Accuracy:
Train: 0.9562 (after 5 epochs)
Test: 0.9430
Correct Predictions
False Predictions
GT: Chair
Pred: Lamp
GT: Vase
Pred: Chair
GT: Lamp
Pred: Vase
As we can see, the prediction accuracy is quite high. However, failures occur in a few cases where an object has a shape similar to another class, or is atypical for its own class. For example, the first chair has a very tall back, so it is understandably misclassified as a lamp.
Test Accuracy:
0.8493 (after 4 epochs)
Visualization
Good Prediction #1
Accuracy: 0.901
Ground Truth
Prediction
Good Prediction #2
Accuracy: 0.9629
Ground Truth
Prediction
Good Prediction #3
Accuracy: 0.958
Ground Truth
Prediction
Bad Prediction #1
Accuracy: 0.2911
Ground Truth
Prediction
Bad Prediction #2
Accuracy: 0.5256
Ground Truth
Prediction
As we can see, the prediction accuracy is high. However, in several cases, such as bad predictions #1 and #2, the objects are atypical for their class, so some of their parts are misclassified. For example, in the sofa with a cushion, the cushion itself is uncommon and leans against the armrest, so parts of it are labeled as armrest.
Create the transformation by building a 4x4 homogeneous matrix:
|R t|
|0 1|
where
R = |cos(θ) -sin(θ) 0|    t = |0|
    |sin(θ)  cos(θ) 0|        |0|
    |  0       0    1|        |0|
Then convert the points to homogeneous coordinates by appending a 1 (making each point |x y z 1|), and apply matrix multiplication between the homogeneous points and the transformation matrix.
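The steps above can be sketched as follows; this is a minimal NumPy sketch, and the function name and test angle are illustrative, not taken from the project code:

```python
import numpy as np

def rotate_z(points, theta):
    """Rotate an (N, 3) point cloud about the z-axis via a 4x4 homogeneous matrix."""
    c, s = np.cos(theta), np.sin(theta)
    # 4x4 transformation: rotation R in the upper-left 3x3 block, translation t = 0
    T = np.array([
        [c, -s, 0, 0],
        [s,  c, 0, 0],
        [0,  0, 1, 0],
        [0,  0, 0, 1],
    ])
    # Convert to homogeneous coordinates: (N, 3) -> (N, 4) by appending a 1
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    # Apply the transformation (points as row vectors) and drop the extra coordinate
    return (homo @ T.T)[:, :3]

# Rotating (1, 0, 0) by 90° about z should give (0, 1, 0)
pts = np.array([[1.0, 0.0, 0.0]])
print(rotate_z(pts, np.pi / 2).round(6))
```

Since the points are stored as row vectors, the code multiplies by the transpose of the matrix; multiplying the matrix against column vectors is equivalent.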
Classification Accuracy:
Segmentation Accuracy:
Classification Accuracy:
Segmentation Accuracy:
PointNet++
Model Architecture
PointNet2ClsMsg(
(sa1): SetAbstractionMsg(
(conv_blocks): ModuleList(
(0): ModuleList(
(0): Conv2d(3, 32, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1))
)
(1): ModuleList(
(0): Conv2d(3, 64, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1))
)
(2): ModuleList(
(0): Conv2d(3, 64, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(64, 96, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1))
)
)
(bn_blocks): ModuleList(
(0): ModuleList(
(0): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): ModuleList(
(0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): ModuleList(
(0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(sa2): SetAbstractionMsg(
(conv_blocks): ModuleList(
(0): ModuleList(
(0): Conv2d(323, 64, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1))
)
(1): ModuleList(
(0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
)
(2): ModuleList(
(0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
)
)
(bn_blocks): ModuleList(
(0): ModuleList(
(0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): ModuleList(
(0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): ModuleList(
(0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(sa3): SetAbstraction(
(mlp_convs): ModuleList(
(0): Conv2d(643, 256, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1))
)
(mlp_bns): ModuleList(
(0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(fc1): Linear(in_features=1024, out_features=512, bias=True)
(bn1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(drop1): Dropout(p=0.4, inplace=False)
(fc2): Linear(in_features=512, out_features=256, bias=True)
(bn2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(drop2): Dropout(p=0.5, inplace=False)
(fc3): Linear(in_features=256, out_features=3, bias=True)
)
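A note on the 323- and 643-channel inputs in the printout above: these are not typos. Each multi-scale grouping (MSG) level concatenates the output features of all scale branches from the previous level and appends the 3 raw xyz coordinates. A small sketch of that channel bookkeeping (the variable names are illustrative):

```python
xyz_dims = 3  # raw point coordinates appended at each set-abstraction level

# sa1's three scale branches end in 64, 128, and 128 channels respectively
sa1_branch_out = [64, 128, 128]
sa2_in = sum(sa1_branch_out) + xyz_dims  # 320 + 3 = 323

# sa2's three branches end in 128, 256, and 256 channels
sa2_branch_out = [128, 256, 256]
sa3_in = sum(sa2_branch_out) + xyz_dims  # 640 + 3 = 643

print(sa2_in, sa3_in)  # 323 643
```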
PointNet Accuracy: 0.9430
PointNet++ Accuracy: 0.9653
GT: Chair
Pred: Chair
The "lamp" chair is correctly classified by PointNet++, and overall accuracy improves by about 2% compared with PointNet.
Model Architecture
PointNet2Seg(
(sa1): SetAbstractionMsg(
(conv_blocks): ModuleList(
(0): ModuleList(
(0): Conv2d(6, 32, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1))
)
(1): ModuleList(
(0): Conv2d(6, 64, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1))
)
(2): ModuleList(
(0): Conv2d(6, 64, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(64, 96, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1))
)
)
(bn_blocks): ModuleList(
(0): ModuleList(
(0): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): ModuleList(
(0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): ModuleList(
(0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(sa2): SetAbstractionMsg(
(conv_blocks): ModuleList(
(0): ModuleList(
(0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
)
(1): ModuleList(
(0): Conv2d(323, 128, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 196, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(196, 256, kernel_size=(1, 1), stride=(1, 1))
)
)
(bn_blocks): ModuleList(
(0): ModuleList(
(0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): ModuleList(
(0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(196, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(sa3): SetAbstraction(
(mlp_convs): ModuleList(
(0): Conv2d(515, 256, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1))
)
(mlp_bns): ModuleList(
(0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(fp3): FeaturePropagation(
(mlp_convs): ModuleList(
(0): Conv1d(1536, 256, kernel_size=(1,), stride=(1,))
(1): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
)
(mlp_bns): ModuleList(
(0): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(fp2): FeaturePropagation(
(mlp_convs): ModuleList(
(0): Conv1d(576, 256, kernel_size=(1,), stride=(1,))
(1): Conv1d(256, 128, kernel_size=(1,), stride=(1,))
)
(mlp_bns): ModuleList(
(0): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(fp1): FeaturePropagation(
(mlp_convs): ModuleList(
(0): Conv1d(134, 128, kernel_size=(1,), stride=(1,))
(1): Conv1d(128, 128, kernel_size=(1,), stride=(1,))
)
(mlp_bns): ModuleList(
(0): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(conv1): Conv1d(128, 128, kernel_size=(1,), stride=(1,))
(bn1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(drop1): Dropout(p=0.5, inplace=False)
(conv2): Conv1d(128, 6, kernel_size=(1,), stride=(1,))
)
PointNet Accuracy: 0.8493
PointNet++ Accuracy: 0.8913
Good Prediction
PointNet Accuracy: 0.958
PointNet++ Accuracy: 0.9912
Ground Truth
Prediction
Bad Prediction
PointNet Accuracy: 0.2911
PointNet++ Accuracy: 0.4187
Ground Truth
Prediction
As we can see from the quantitative results and the visualizations, there is a substantial improvement (about 5% overall) on the segmentation task, since PointNet++ takes locality into consideration. It performs better on the good predictions, and even the previously bad predictions improve.