CMU 2022 Spring 16-889

Assignment 5: Point Cloud Classification and Segmentation


1. Classification Model

Here, I visualize several test point clouds and their corresponding predicted classes. The model gets most of the predictions right.

Still, there are some samples that the network predicts incorrectly. Here, I provide a failure case for each class. It turns out these samples are quite difficult to classify, even for a human. In the first sample, the tall chair looks like a floor lamp, such as the one shown above. For the second sample, we can't really tell whether it is a lamp or an elegantly shaped vase. As for the third one, the sample looks very much like a lamp. In short, I don't think we can really blame the network for getting these wrong.

2. Segmentation Model

Here, I visualize several test point clouds and their segmentation results. On the left is the ground truth; on the right is the prediction. The model pretty much nails the prediction, though it gets some small details wrong. For example, in the first example the model over-extends the chair-arm prediction (yellow) into the seat (red). This is because the boundary between the chair arm and the seat is fundamentally difficult to localize. In the second example, the seat is predicted incorrectly: the model extends it far down into the chair legs (blue), possibly because this chair has very differently styled legs rather than the typical pillar-like ones. The third example is mostly correct.

Below, I provide some samples with low accuracy. On the left is the ground truth; on the right is the prediction.

The first example is pretty bad, but the ground truth itself is a rather difficult case for segmentation. The headrest (magenta) is connected to the chair back, which is why the network gets it wrong. The left chair arm (yellow) is not obvious even to a human, and the chair legs (blue) account for only a small area. No wonder the model gets it so badly wrong.

The second example is also a difficult one. First, whether the pillow counts as the seat or the chair back is ambiguous. Second, the chair legs are actually an extension of the chair back and arms, and the model has a hard time segmenting a single continuous shape into multiple categories.

3. Robustness Analysis

Exp 1: Number of points

First, I analyze how the model behaves under different numbers of points. I run this experiment by subsampling a number of points from each sample.
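The subsampling step can be sketched as follows. This is a minimal illustration, not the assignment's actual evaluation code; the function name `subsample_points` and the assumption of uniform random sampling (with the same indices shared across the batch) are mine.

```python
import torch

def subsample_points(points, labels, num_points):
    """Randomly subsample a fixed number of points from each cloud.

    points: (B, N, 3) tensor of point clouds.
    labels: (B, N) per-point labels for segmentation, or None for
    classification. The same random indices are used for every cloud
    in the batch (an assumption for simplicity).
    """
    B, N, _ = points.shape
    idx = torch.randperm(N)[:num_points]
    sub_points = points[:, idx, :]
    sub_labels = labels[:, idx] if labels is not None else None
    return sub_points, sub_labels

# e.g. evaluate the segmentation model at 1000 points per cloud:
pts = torch.rand(4, 10000, 3)
lbl = torch.randint(0, 6, (4, 10000))
p, l = subsample_points(pts, lbl, 1000)
print(p.shape, l.shape)  # torch.Size([4, 1000, 3]) torch.Size([4, 1000])
```

Because the labels are indexed with the same permutation as the points, the per-point correspondence needed for segmentation accuracy is preserved.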

Classification

# of points    accuracy
10000          0.9780
5000           0.9370
2000           0.9391
1000           0.9339
500            0.9391
200            0.9328
100            0.9129

Segmentation

# of points    accuracy
10000          0.9036
5000           0.9033
2000           0.9023
1000           0.8965
500            0.8832
200            0.8468
100            0.8142

In theory, decreasing the number of points degrades the shape of the object and should decrease accuracy. We do see this trend in both the classification and the segmentation model. But to my surprise, the accuracy holds up well even when the points are reduced to ~1000. This shows that the model is fairly robust, in both classification and segmentation, to the number of points sampled from an object.

Exp 2: Rotation

Next, I analyze the model's robustness to rotation. I run this experiment by rotating all objects by a certain angle around the x, y, or z axis. The results are shown below.
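The rotation perturbation can be sketched as below. This is an illustrative version, not the assignment's actual code; the function name `rotate_points` and the axis/handedness conventions are assumptions.

```python
import math
import torch

def rotate_points(points, angle_deg, axis='x'):
    """Rotate a batch of point clouds by angle_deg around one axis.

    points: (B, N, 3). Standard right-handed rotation matrices are
    assumed; the real experiment may use a different convention.
    """
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    if axis == 'x':
        R = torch.tensor([[1.0, 0, 0], [0, c, -s], [0, s, c]])
    elif axis == 'y':
        R = torch.tensor([[c, 0, s], [0, 1.0, 0], [-s, 0, c]])
    else:  # 'z'
        R = torch.tensor([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])
    # row-vector points, so multiply by R transposed
    return points @ R.T

# sanity check: rotating (0, 1, 0) by 90 deg around x gives ~(0, 0, 1)
print(rotate_points(torch.tensor([[[0.0, 1.0, 0.0]]]), 90, axis='x'))
```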

Classification (rotate around x-axis)

rotation angle (deg)    accuracy
0                       0.9780
30                      0.6884
60                      0.2046
90                      0.2854

Segmentation (rotate around x-axis)

rotation angle (deg)    accuracy
0                       0.9036
30                      0.7959
60                      0.5273
90                      0.2406

Classification (rotate around y-axis)

rotation angle (deg)    accuracy
0                       0.9780
30                      0.9087
60                      0.7597
90                      0.4764

Segmentation (rotate around y-axis)

rotation angle (deg)    accuracy
0                       0.9036
30                      0.7843
60                      0.6603
90                      0.5633

Classification (rotate around z-axis)

rotation angle (deg)    accuracy
0                       0.9780
30                      0.8573
60                      0.7282
90                      0.2812

Segmentation (rotate around z-axis)

rotation angle (deg)    accuracy
0                       0.9036
30                      0.6910
60                      0.5339
90                      0.3820

A PointNet-like model is invariant to point permutation, but not inherently to rotation. Because the assignment does not require us to implement the T-Net that the original authors use to align the input, the network handles rotation poorly: the accuracy drops significantly as the rotation angle increases. Also, segmentation tends to lose more accuracy than classification. This is because classification does not depend on the location of individual points, while segmentation must interpret the position of each point. Lastly, rotations around the x-axis hurt performance the most, while rotations around the y-axis are the mildest. This is possibly because the dataset contains many objects that are roughly symmetric around their vertical axis, so some performance is retained.
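For reference, the input transform (T-Net) mentioned above predicts a 3x3 alignment matrix from the cloud itself and applies it before the rest of the network. Below is a minimal sketch after the original PointNet design; the class name `TNet3` and the exact layer widths are taken from my reading of the paper and should be treated as assumptions, not the assignment's required architecture.

```python
import torch
import torch.nn as nn

class TNet3(nn.Module):
    """Sketch of PointNet's input transform: predicts a per-cloud
    3x3 matrix, initialized to the identity so training starts from
    an unrotated alignment."""
    def __init__(self):
        super().__init__()
        # shared per-point MLP implemented as 1x1 convolutions
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 9),
        )
        # bias the last layer so the initial transform is the identity
        nn.init.zeros_(self.fc[-1].weight)
        self.fc[-1].bias.data = torch.eye(3).flatten()

    def forward(self, points):                  # points: (B, N, 3)
        x = self.mlp(points.transpose(1, 2))    # (B, 1024, N)
        x = x.max(dim=2).values                 # global max pool -> (B, 1024)
        T = self.fc(x).view(-1, 3, 3)           # (B, 3, 3)
        return points @ T                       # aligned points

net = TNet3()
out = net(torch.rand(2, 100, 3))
print(out.shape)  # torch.Size([2, 100, 3])
```

At initialization the predicted matrix is exactly the identity, so the module is a no-op until training moves it; the original paper additionally regularizes the feature-level transform toward orthogonality.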