Assignment 3. Neural Surfaces

Course: 16-889 Learning for 3D Vision

Name: Soyong Shin

Due by: Apr. 1 (Fri)

Contents:

1. Sphere Tracing (30pts)

The sphere tracing method is implemented in a4/renderer.py.

The configuration I used is:

renderer:
  type: sphere_tracing
  chunk_size: 8192
  near: 0.0
  far: 5.0
  max_iters: 64

I noticed that in some cases the signed distance can go negative, so the hit mask is computed using only the far bound.
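A minimal sketch of the tracing loop as described above is given below; the function and argument names (implicit_fn, origins, directions) are illustrative and only loosely follow the starter code.

import torch

def sphere_trace(implicit_fn, origins, directions, near=0.0, far=5.0, max_iters=64):
    # March each ray starting from the near bound.
    t = torch.full_like(origins[..., :1], near)      # (N, 1) current distance along each ray
    points = origins + t * directions
    for _ in range(max_iters):
        dist = implicit_fn(points)                    # (N, 1) signed distance at the current points
        t = t + dist                                  # step by the (possibly negative) distance
        points = origins + t * directions
    # Negative distances can occur, so the hit mask only checks the far bound.
    mask = t < far
    return points, mask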

Results are shown below.

Figure 1. Sphere tracing results

2. Optimizing a Neural SDF (30pts)

The code is implemented in a4/implicit.py and a4/losses.py.

For both the distance loss and the eikonal loss, I use the L2 norm.
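A rough sketch of the two losses with the L2 penalty is below; the function names and tensor shapes are illustrative rather than the exact signatures in a4/losses.py.

import torch

def distance_loss(pred_sdf):
    # Points are sampled on the surface, so the predicted SDF value should be zero there.
    return (pred_sdf ** 2).mean()

def eikonal_loss(gradients):
    # The SDF gradient should have unit norm everywhere: ||grad f(x)|| = 1.
    return ((gradients.norm(dim=-1) - 1.0) ** 2).mean()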

The optimization uses the provided default configuration file.

The final results of the optimization are shown below.

Figure 2. Results of SDF fitting on point cloud
Figure 3. Optimization results over epochs

3. VolSDF (30 pts)

3.1. Experiments

Exp 1. Beta = 0.05

Figure 4. Results of geometry reconstruction
Figure 5. Results of volume rendering

Figure 6. Geometry reconstruction over time
Figure 7. Volume rendering over time

Exp 2. Beta = 0.01 (Lower)

Figure 8. Geometry reconstruction over time
Figure 9. Volume rendering over time

Exp 3. Beta = 0.5 (Higher)

Figure 10. Geometry reconstruction over time
Figure 11. Volume rendering over time

3.2. Interpretations

1. How does high beta bias your learned SDF? What about low beta?

When I use a higher beta, the surface reconstruction does not go well. Since the sharpness of the density function is proportional to 1/β, a higher beta makes the surface very blurry. On the contrary, a lower beta is able to generate a sharper, more detailed surface.

In the above figures, beta = 0.5 shows very poor geometry reconstruction, while beta = 0.01 has better performance overall.
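For reference, VolSDF turns the signed distance d(x) into density through the CDF of a zero-mean Laplace distribution with scale beta, sigma(x) = alpha * Psi_beta(-d(x)); a small sketch of this conversion (with illustrative tensor names) follows.

import torch

def sdf_to_density(signed_distance, alpha, beta):
    # Laplace CDF: Psi_beta(s) = 0.5 * exp(s / beta) if s <= 0, else 1 - 0.5 * exp(-s / beta).
    s = -signed_distance                              # density is high inside the surface (d < 0)
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi                                # transition sharpness scales with 1 / beta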

2. Would an SDF be easier to train with volume rendering and low beta or high beta? Why?

In terms of convergence, a higher beta makes the model easier to train. Since a low beta forces the network to define the object surface more sharply, the network may not be easily optimized for complex geometries. Although a higher beta converges more easily during training, that does not mean it produces better quality.

3. Would you be more likely to learn an accurate surface with high beta or low beta? Why?

If I want the network to learn an accurate surface, I would use a low beta rather than a high beta. A low beta classifies the surface sharply, as the density function drops rapidly outside the surface. This lets the network capture the accurate surface geometry, while a high beta only captures an overall, blurry shape of the surface.

4. Neural Surface Extras (CHOOSE ONE! More than one is extra credit)

4.1. Render a Large Scene with Sphere Tracing (10 pts)

For this task, I trained my own SDFs fitted to human body meshes. Each mesh is generated by the SMPL body model, with SMPL parameters sampled from the AMASS dataset.

Once I had trained N SDFs, I built a new model whose implicit_fn is a ModuleList holding the individual SDFs. In the get_distance pass, I compute the distance to every SDF and take the minimum.
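A sketch of the composed distance function is below, assuming each trained SDF exposes get_distance as in the single-object case.

import torch
import torch.nn as nn

class SceneSDF(nn.Module):
    # Union of several pre-trained SDFs: the scene distance is the minimum over all parts.
    def __init__(self, sdfs):
        super().__init__()
        self.sdfs = nn.ModuleList(sdfs)

    def get_distance(self, points):
        distances = torch.stack([sdf.get_distance(points) for sdf in self.sdfs], dim=0)
        return distances.min(dim=0).values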

Figure 12. Sphere tracing over two human SDFs

Since the trained SDFs are not perfectly fitted and produce negative distances (i.e., points classified as inside the surface) at some locations outside the human body, the result is too noisy to place many people at once. I would later train the SDFs with stronger constraints (e.g., any point outside a certain human body boundary should have a distance greater than 0).

4.2. Fewer Training Views (10 pts)

For this task, I use 20 images instead of 100 to train the VolSDF network. To choose the views, I select images uniformly across their indices, assuming the indexing follows the camera pose (i.e., close indices mean close poses).
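The view selection itself is just uniform subsampling of the image indices, e.g.:

# Pick 20 of the 100 training views, spaced evenly over the indices.
view_indices = list(range(0, 100, 5))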

Figure 13. Geometry reconstruction over time
Figure 14. Volume rendering over time

Figure 15. Results while using 20 images
Figure 16. Results while using 100 images

I also compared the results with NeRF reconstruction. Identical to this experiment, NeRF was also trained with 20 and 100 images respectively.

Figure 17. Results while using 20 images (NeRF)
Figure 18. Results while using 100 images (NeRF)