Question 1
My best classification network made use of Conv1D layers and achieved an accuracy of 85.7%.
Here are some correct predictions.
And here are some incorrect predictions. It looks like the vase that the network thought was a chair
has a sort of recessed "lip" that the network may have read as a sitting surface. The lamp
misclassified as a vase is a bit more interesting, because it is quite obviously a lamp. It's possible
the network thought the cylindrical shade was the outer wall of a vase. The vase misclassified as a
lamp makes sense, because the plants sticking out definitely could be the "stalk" of a lamp.
Predicted: Chair, Actual: Vase
Predicted: Vase, Actual: Lamp
Predicted: Lamp, Actual: Vase
Question 2
My best segmentation network had an accuracy of 80.0%, using an architecture similar to the
classification network, but with a different output tensor shape.
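To make the difference in output tensor shape concrete, here is a minimal NumPy sketch. It is not the actual network code: the batch size, the number of part labels, and both function names are assumptions chosen for illustration, and the three object classes are simply the ones that appear in this report. It also assumes that the per-object accuracy is the fraction of points whose predicted label matches the ground truth.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
B, N = 4, 10000        # clouds per batch, points per cloud
NUM_CLASSES = 3        # chair, vase, lamp
NUM_PART_LABELS = 6    # assumed number of segmentation labels


def classification_accuracy(logits, labels):
    """logits: (B, NUM_CLASSES), labels: (B,). One prediction per cloud."""
    return float((logits.argmax(axis=-1) == labels).mean())


def per_object_seg_accuracy(logits, labels):
    """logits: (B, N, NUM_PART_LABELS), labels: (B, N).

    Returns an array of shape (B,): the fraction of correctly
    labeled points in each cloud.
    """
    return (logits.argmax(axis=-1) == labels).mean(axis=1)


# Random stand-ins for network outputs.
rng = np.random.default_rng(0)
cls_logits = rng.normal(size=(B, NUM_CLASSES))
seg_logits = rng.normal(size=(B, N, NUM_PART_LABELS))

# Start from labels that agree with every prediction, then corrupt
# the first half of the points in cloud 0.
cls_labels = cls_logits.argmax(axis=-1)
seg_labels = seg_logits.argmax(axis=-1)
seg_labels[0, : N // 2] = (seg_labels[0, : N // 2] + 1) % NUM_PART_LABELS

print(classification_accuracy(cls_logits, cls_labels))
print(per_object_seg_accuracy(seg_logits, seg_labels))
```

The classification head produces one score vector per cloud, while the segmentation head produces one score vector per point, so segmentation accuracy can be reported separately for each object.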
Here are some correct predictions:
Ground Truth
Prediction, Accuracy: 82.7%
Ground Truth
Prediction, Accuracy: 95.9%
Ground Truth
Prediction, Accuracy: 89.9%
And here are a couple of incorrect predictions. The first one is interesting, because it is assigning
the sides of this flat chair to the "arms", which makes sense since the outside of the
chair sort of "flares" out. Additionally, it is treating parts of the supporting structure of
the seat as part of the legs.
The second chair doesn't have a very clear distinction between legs and seat, which appears to be
what is confusing the network.
Ground Truth
Prediction, Accuracy: 53.4%
Ground Truth
Prediction, Accuracy: 54.9%
Question 3
For the robustness analysis we conducted two experiments:
- Rotating the input clouds
- Varying the number of input points
Rotations Test
For both models the procedure was the same: rotate the input data a certain amount and check the
accuracy.
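The rotation step can be sketched as follows. This is a minimal NumPy illustration, not the code used for the experiments: the function names are hypothetical, and the order in which the three rotations are composed (X first, then Y, then Z) is an assumption, since the order is not stated here.

```python
import numpy as np


def rotation_matrix_xyz(deg_x, deg_y, deg_z):
    """Build a 3x3 rotation matrix from angles in degrees.

    Assumed convention: rotate about X first, then Y, then Z.
    """
    ax, ay, az = np.deg2rad([deg_x, deg_y, deg_z])
    rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(ax), -np.sin(ax)],
                   [0.0, np.sin(ax), np.cos(ax)]])
    ry = np.array([[np.cos(ay), 0.0, np.sin(ay)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(ay), 0.0, np.cos(ay)]])
    rz = np.array([[np.cos(az), -np.sin(az), 0.0],
                   [np.sin(az), np.cos(az), 0.0],
                   [0.0, 0.0, 1.0]])
    return rz @ ry @ rx


def rotate_clouds(points, deg_xyz):
    """Rotate a batch of clouds of shape (B, N, 3) about the origin."""
    r = rotation_matrix_xyz(*deg_xyz)
    # Points are row vectors, so multiply by the transpose.
    return points @ r.T


# Usage: rotate a batch by 30 degrees about X before evaluating.
clouds = np.random.default_rng(0).normal(size=(2, 100, 3))
rotated = rotate_clouds(clouds, (30, 0, 0))
print(rotated.shape)
```

Because a rotation leaves the labels untouched, the same ground-truth class and per-point labels can be reused when computing accuracy on the rotated clouds.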
Rotation in degrees (X, Y, Z) | Segmentation Accuracy | Classification Accuracy |
(0, 0, 0) | 0.8001 | 0.8530 |
(30, 0, 0) | 0.7214 | 0.6568 |
(60, 0, 0) | 0.5989 | 0.3116 |
(90, 0, 0) | 0.4195 | 0.1689 |
(60, 0, 60) | 0.5614 | 0.3273 |
(30, 0, 80) | 0.5720 | 0.5089 |
Rotated Lamp
Rotated Chair
We can see that neither network is very robust to rotation. The segmentation network is the more
robust of the two, but its accuracy still falls from 0.8001 to 0.4195 as the rotation about the X axis
approaches 90 degrees, while classification accuracy falls from 0.8530 to 0.1689. I suspect I am losing
accuracy in the point cloud network because I am using convolutions that have knowledge of their
neighbors, so the points are not being processed independently.
Number of Points Test
For both the classification model and the segmentation model, the procedure was the same: reduce the
number of input points to the evaluation functions by calling
python eval_cls.py --num_points N and python eval_seg.py --num_points N.
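How the evaluation scripts select the N points is not described here. One plausible approach, sketched below in NumPy, is to draw a random subset of point indices without replacement and reuse the same indices for the per-point labels. The function name, the fixed seed, and the sampling strategy are all assumptions for illustration.

```python
import numpy as np


def subsample_cloud(points, labels=None, num_points=1000, seed=0):
    """Keep a random subset of points from each cloud.

    points: (B, N, 3) array of coordinates.
    labels: optional (B, N) array of per-point segmentation labels.
    The same indices are applied to points and labels so they stay aligned.
    """
    n_total = points.shape[1]
    if num_points > n_total:
        raise ValueError("num_points cannot exceed the cloud size")
    rng = np.random.default_rng(seed)
    idx = rng.choice(n_total, size=num_points, replace=False)
    sub_points = points[:, idx]
    sub_labels = None if labels is None else labels[:, idx]
    return sub_points, sub_labels


# Usage: reduce clouds of 10000 points to 100 points each.
rng = np.random.default_rng(1)
clouds = rng.normal(size=(2, 10000, 3))
part_labels = rng.integers(0, 6, size=(2, 10000))
small_clouds, small_labels = subsample_cloud(clouds, part_labels, num_points=100)
print(small_clouds.shape, small_labels.shape)
```

For classification only the points are needed, so the labels argument can be left out.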
Number of Points | Segmentation Accuracy | Classification Accuracy |
10000 | 0.8001 | 0.8530 |
5000 | 0.8003 | 0.8436 |
1000 | 0.8005 | 0.8583 |
100 | 0.802 | 0.8583 |
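It appears that both networks are quite robust to a reduction in input points: from 10000 points down
to 100, segmentation accuracy changes by less than 0.002 and classification accuracy stays within
about 0.01 of the 10000-point baseline.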
An example of what this looks like is shown here:
Sparse Lamp
Sparse Segmentation Prediction