Homework Number Five

16-889 Learning for 3D Vision
Ben Kolligs

Andrew ID: bkolligs

Zero Late Days Used

Question 1

My best network made use of Conv1D layers and achieved an accuracy of 85.7%.
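To give a sense of what this looks like, here is a minimal PyTorch sketch of a Conv1D point-cloud classifier. The layer widths and kernel sizes are illustrative assumptions rather than the exact trained configuration (kernel sizes greater than 1 are assumed, consistent with the neighbor-dependence discussed in Question 3); the three output classes (chair, vase, lamp) match the dataset.

```python
import torch
import torch.nn as nn

class PointCloudClassifier(nn.Module):
    """Sketch of a Conv1D point-cloud classifier (layer widths are hypothetical)."""
    def __init__(self, num_classes=3):  # chair, vase, lamp
        super().__init__()
        # Shared Conv1D feature extractor; input shape is (B, 3, N)
        self.features = nn.Sequential(
            nn.Conv1d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(128, 1024, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Classification head on the pooled global feature
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points):
        # points: (B, N, 3) -> (B, 3, N) for Conv1d
        x = self.features(points.transpose(1, 2))
        x = x.max(dim=2).values  # global max pool over the point dimension
        return self.head(x)      # (B, num_classes) logits
```

Here are some correct predictions.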

Sampled points
Chair
Sampled points
Vase
Sampled points
Lamp
And here are some incorrect predictions. It looks like the vase that the network thought was a chair has a sort of recessed "lip" that the network may have interpreted as a sitting surface. The lamp misclassified as a vase is a bit more interesting, because it is quite obviously a lamp; it's possible the network took the cylindrical shade for the outer wall of a vase. The vase misclassified as a lamp makes sense, because the plants sticking out could easily pass for the "stalk" of a lamp.
Sampled points
Predicted: Chair, Actual: Vase
Sampled points
Predicted: Vase, Actual: Lamp
Sampled points
Predicted: Lamp, Actual: Vase

Question 2

My best segmentation network had an accuracy of 80.0%, using an architecture similar to the classification network but with a per-point output instead of a single label per cloud.
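A minimal sketch of that variant follows: the same style of Conv1D trunk, but keeping the point dimension all the way through so the head emits one class score per point. The layer widths and the six-part class count are assumptions for illustration, and mixing a broadcast global feature into the per-point features is one common design choice, not necessarily the exact one trained here.

```python
import torch
import torch.nn as nn

class PointCloudSegmenter(nn.Module):
    """Sketch of the per-point segmentation variant (widths and class count are hypothetical)."""
    def __init__(self, num_seg_classes=6):
        super().__init__()
        # Same style of trunk as the classifier, keeping shape (B, C, N)
        self.features = nn.Sequential(
            nn.Conv1d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Per-point head over concatenated local + global features
        self.head = nn.Conv1d(256, num_seg_classes, 1)

    def forward(self, points):
        x = self.features(points.transpose(1, 2))           # (B, 128, N) per-point features
        g = x.max(dim=2, keepdim=True).values.expand_as(x)  # global feature copied to each point
        logits = self.head(torch.cat([x, g], dim=1))        # (B, num_seg_classes, N)
        return logits.transpose(1, 2)                       # (B, N, num_seg_classes)
```

Here are some correct predictions: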

Sampled points
Ground Truth
Sampled points
Prediction, Accuracy: 82.7%
Sampled points
Ground Truth
Sampled points
Prediction, Accuracy: 95.9%
Sampled points
Ground Truth
Sampled points
Prediction, Accuracy: 89.9%
And here are a couple of incorrect predictions. The first one is interesting because the network assigns the sides of this flat chair to the "arms" class, which makes sense since the outside of the chair flares outward like an armrest. It also labels parts of the seat's supporting structure as legs. The second chair has no clear boundary between its legs and seat, which appears to be what confuses the network.
Sampled points
Ground Truth
Sampled points
Prediction, Accuracy: 53.4%
Sampled points
Ground Truth
Sampled points
Prediction, Accuracy: 54.9%

Question 3

For the robustness analysis we conducted two experiments:

  1. Rotating the input point clouds
  2. Varying the number of input points

Rotations Test

For both models the procedure was the same: rotate the input point clouds by a set amount and measure the resulting accuracy.
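Concretely, the rotation can be applied by building an Euler-angle rotation matrix and multiplying it into every point before evaluation. A sketch of that operation (the helper names here are mine, not from the homework code):

```python
import math
import torch

def rotation_matrix(deg_x, deg_y, deg_z):
    """3x3 rotation matrix from Euler angles in degrees, composed as Rz @ Ry @ Rx."""
    ax, ay, az = (math.radians(d) for d in (deg_x, deg_y, deg_z))
    Rx = torch.tensor([[1.0, 0.0, 0.0],
                       [0.0, math.cos(ax), -math.sin(ax)],
                       [0.0, math.sin(ax), math.cos(ax)]])
    Ry = torch.tensor([[math.cos(ay), 0.0, math.sin(ay)],
                       [0.0, 1.0, 0.0],
                       [-math.sin(ay), 0.0, math.cos(ay)]])
    Rz = torch.tensor([[math.cos(az), -math.sin(az), 0.0],
                       [math.sin(az), math.cos(az), 0.0],
                       [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def rotate_cloud(points, deg_x, deg_y, deg_z):
    """Rotate a (B, N, 3) batch of clouds; points are row vectors, so right-multiply by R^T."""
    R = rotation_matrix(deg_x, deg_y, deg_z).to(points.device, points.dtype)
    return points @ R.T
```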
| Rotation in degrees (X, Y, Z) | Segmentation Accuracy | Classification Accuracy |
|-------------------------------|-----------------------|-------------------------|
| (0, 0, 0)                     | 0.8001                | 0.8530                  |
| (30, 0, 0)                    | 0.7214                | 0.6568                  |
| (60, 0, 0)                    | 0.5989                | 0.3116                  |
| (90, 0, 0)                    | 0.4195                | 0.1689                  |
| (60, 0, 60)                   | 0.5614                | 0.3273                  |
| (30, 0, 80)                   | 0.5720                | 0.5089                  |
Sampled points
Rotated Lamp
Sampled points
Rotated Chair
We can see that the networks are not very robust to rotation. The segmentation network is definitely more robust, but as the rotation approaches 90 degrees about a single axis, its accuracy falls as well. I suspect the accuracy loss comes from using convolutions whose kernels span neighboring points: the points are not processed independently, so rotating the cloud changes the local patterns the filters learned during training.

Number of Points Test

For both the classification model and the segmentation model, the procedure was the same: reduce the number of input points N passed to the evaluation scripts by calling `python eval_cls.py --num_points N` and `python eval_seg.py --num_points N`.
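Conceptually, this just subsamples each evaluation cloud down to N points. A minimal sketch of one way to do that subsampling (random selection; the helper name is mine):

```python
import torch

def subsample_cloud(points, num_points):
    """Keep a random subset of num_points from each cloud in a (B, N, 3) batch."""
    idx = torch.randperm(points.shape[1], device=points.device)[:num_points]
    return points[:, idx, :]
```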
| Number of Points | Segmentation Accuracy | Classification Accuracy |
|------------------|-----------------------|-------------------------|
| 10000            | 0.8001                | 0.8530                  |
| 5000             | 0.8003                | 0.8436                  |
| 1000             | 0.8005                | 0.8583                  |
| 100              | 0.802                 | 0.8583                  |
It appears that both networks are quite robust to a lack of input data: accuracy stays essentially flat even with only 100 points. An example of what this looks like is shown here:
Sampled points
Sparse Lamp
Sampled points
Sparse Segmentation Prediction