Late Days Used: 4

1. Sphere Tracing

python -m a4.main --config-name=torus

The output is written to images/part_1.gif.

*(torus rendering)*

Description of implementation

I initialize t for all rays to be near (the specified lower limit). Then, I keep incrementing t for each ray (i.e., marching along the ray) by the SDF value at the point p = origin + t * direction. I keep marching along each ray in this way until we hit the terminating condition for the ray, which is at least one of the following:

  * the SDF value at the current point drops below a small threshold epsilon, i.e., the ray has hit the surface;
  * t exceeds far (the specified upper limit), i.e., the ray has left the scene without hitting anything;
  * a maximum number of marching iterations is reached.

This gives us the value of t for every ray. The mask for every ray can then be computed as true if and only if the ray converged, i.e., its final SDF value is below epsilon and t is still within [near, far].

Then, for every ray where the mask is true, we can compute the intersection of the ray with the surface as the point p = origin + t * direction.
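Below is a minimal sketch of this loop, assuming an sdf callable that maps a batch of 3D points to per-point signed distances; the names, signature, and defaults are illustrative rather than the assignment's exact API:

```python
import torch

def sphere_trace(sdf, origins, directions, near, far, max_iters=64, eps=1e-5):
    # Start every ray at the near limit.
    t = torch.full_like(origins[..., :1], near)
    for _ in range(max_iters):
        points = origins + t * directions  # p = origin + t * direction
        dist = sdf(points)                 # signed distance at p, shape (N, 1)
        t = t + dist                       # march forward by the SDF value
        # Converged rays take near-zero steps, so extra iterations are harmless;
        # the fixed iteration count caps the loop for rays that never hit.
    points = origins + t * directions
    # A ray hit the surface iff it converged before leaving [near, far].
    mask = (sdf(points).abs() < eps) & (t < far)
    return points, mask
```

Stepping by the SDF value is safe because the SDF lower-bounds the distance to the nearest surface, so a step of that length can never jump through geometry.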

2. Optimizing a Neural SDF

python -m a4.main --config-name=points

The output is written to images/part_2_input.gif and images/part_2.gif.

*(input point cloud and learned geometry renderings)*

MLP: a simple feedforward network composed only of fully connected layers.

Eikonal Loss: $$ \frac{1}{N} \sum_p | \|\nabla_p f\|_2 - 1 | $$

where $f$ = SDF, $p$ = 3D point, $N$ = number of points.

The eikonal loss ensures that the norm of the gradient at every point is unity. The further the norm of the gradient is from one, the larger this loss is. This constrains the network to learn an SDF instead of an arbitrary function.
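Below is a minimal PyTorch sketch of this loss, assuming a model that maps an (N, 3) batch of points to (N, 1) SDF values; the function name and interface are assumptions:

```python
import torch

def eikonal_loss(model, points):
    points = points.clone().requires_grad_(True)
    sdf = model(points)  # (N, 1) SDF values
    # Gradient of the SDF w.r.t. the input points; create_graph=True so the
    # loss itself remains differentiable w.r.t. the model parameters.
    (grad,) = torch.autograd.grad(
        outputs=sdf, inputs=points,
        grad_outputs=torch.ones_like(sdf), create_graph=True)
    # (1/N) * sum_p | ||grad_p f||_2 - 1 |
    return (grad.norm(dim=-1) - 1.0).abs().mean()
```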

3. VolSDF

Intuitive explanation of alpha, beta:

  1. How does high beta bias your learned SDF? What about low beta?

A high value of beta means that the drop in density from inside the surface (where it equals alpha) to outside the surface (where it decays to zero) is very gradual: a change in the SDF value causes only a slight change in the density value. The SDF is therefore biased to be smoother/thicker/more diffuse, and regions around the "real" surface become more opaque (have non-zero density).

In contrast, a low value of beta means that the drop in density from inside to outside the surface is very sharp (in fact, as beta tends to zero, the density approaches a step function: alpha inside the surface, zero outside). A low beta thus biases the SDF to be thinner/more concentrated and makes regions around the "real" surface less opaque, because a slight change in the SDF value leads to a large change in the density value.

  2. Would an SDF be easier to train with volume rendering and low beta or high beta? Why?

An SDF is easier to train with volume rendering with a high beta. Volume rendering samples multiple points along each ray and combines/weights their appearances to form the value of a single pixel. With a high beta, more of those points have non-zero density, so more points contribute to each pixel's value. Gradients are therefore backpropagated through a larger number of points at a time, giving a denser gradient signal and faster convergence.

Note: Here, I've only reasoned about which SDF would be easier to train -- that is, I've focused on the ease of optimization and not necessarily on which SDF would be more accurate.

  3. Would you be more likely to learn an accurate surface with high beta or low beta? Why?

An accurate surface would be modeled by a low beta, since in the limit of vanishing beta, the density approaches a step function. The optimization may be tricky, but in principle a lower value of beta would encourage a sharp and accurate boundary/surface.
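To make the roles of alpha and beta concrete, here is a minimal sketch of the Laplace-CDF conversion VolSDF uses, assuming the convention that the SDF is positive outside the surface (the function name is an assumption):

```python
import torch

def volsdf_density(signed_distance, alpha=10.0, beta=0.05):
    s = -signed_distance / beta  # positive inside the surface, negative outside
    # Psi_beta: CDF of a zero-mean Laplace distribution with scale beta.
    # The clamps keep the unselected torch.where branch from overflowing.
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s.clamp(max=0.0)),         # outside: decays to 0
        1.0 - 0.5 * torch.exp(-s.clamp(min=0.0)),  # inside: saturates at 1
    )
    return alpha * psi  # density is ~alpha deep inside, ~0 far outside
```

Here beta sets how quickly the density falls off across the surface and alpha scales its magnitude, matching the intuition above.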

I created a new class NeuralSurfaceWithColor for this question.

python -m a4.main --config-name=volsdf

The results will be saved in images/<cfg.training.checkpoint_path>/part_3_<epoch>.gif and images/<cfg.training.checkpoint_path>/part_3_geometry_<epoch>.gif.

The values of alpha and beta can be set in the volsdf.yaml config. cfg.neus should be set to False for this question.

Comment on the settings you chose, and why they seem to work well:

For the MLP hyperparameters, I used the same values as for NeRF (number of hidden layers, number of units in the hidden layers for distance and color, etc.).

As in NeRF, I added a skip connection that feeds the input point in again deeper in the network when predicting color. I experimented with using the raw coordinates versus the harmonic encoding of the coordinates in the skip connection. As we know, using the harmonic encoding of the coordinates at the input is crucial for learning to represent high-frequency details. However, it seems that in the skip connection, the network can model high-frequency details even with the raw coordinates themselves. This behavior is controlled by cfg.embedding_in_skip: set it to False to pass the raw coordinates in the skip connection, and to True to pass the harmonic embedding. A sketch of the skip connection appears after the comparison below.

| Raw coordinates in skip connection | Harmonic embedding in skip connection |
| --- | --- |
| *(rendering)* | *(rendering)* |
| *(geometry)* | *(geometry)* |
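For concreteness, here is a minimal, hypothetical sketch of a color head with such a mid-network skip connection (a stand-in for the corresponding part of NeuralSurfaceWithColor; the layer sizes are illustrative):

```python
import torch
import torch.nn as nn

class ColorHeadWithSkip(nn.Module):
    def __init__(self, hidden=128, skip_dim=3):
        # skip_dim = 3 for raw xyz coordinates; use the embedding width
        # instead when cfg.embedding_in_skip is True.
        super().__init__()
        self.pre = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        # The layer after the skip sees the hidden features plus the re-fed input.
        self.post = nn.Sequential(
            nn.Linear(hidden + skip_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())  # RGB in [0, 1]

    def forward(self, features, skip_feat):
        h = self.pre(features)
        h = torch.cat([h, skip_feat], dim=-1)  # concatenate the input again
        return self.post(h)
```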

I experimented with different values of alpha ($\alpha$) in the density conversion. $\alpha = 10$ seems to give the best representation of geometry as well as texture. $\alpha = 50$ has a crisper image but poor geometry.

| $\alpha = 5$ | $\alpha = 10$ | $\alpha = 50$ |
| --- | --- | --- |
| *(rendering)* | *(rendering)* | *(rendering)* |
| *(geometry)* | *(geometry)* | *(geometry)* |

$\beta = 0.05$ was held fixed for the above experiments.

I also experimented with different values of beta ($\beta$) in the density conversion. $\beta = 0.05$ seems to give the best representation of geometry as well as texture.

| $\beta = 0.05$ | $\beta = 0.1$ | $\beta = 0.2$ |
| --- | --- | --- |
| *(rendering)* | *(rendering)* | *(rendering)* |
| *(geometry)* | *(geometry)* | *(geometry)* |

$\alpha = 10$ was held fixed for the above experiments.

4. Neural Surface Extras

4.1 Render a Large Scene with Sphere Tracing

python -m a4.main --config-name=composite

The output will be stored in images/part_1.gif.

I defined a SphereCollectionSDF consisting of cfg.n_spheres spheres whose centers are sampled uniformly at random within a given interval (min and max specified in the config).
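A minimal sketch of such a composite SDF, using the standard union rule that the SDF of a union of shapes is the pointwise minimum of the individual SDFs (constructor arguments are illustrative stand-ins for the config values):

```python
import torch

class SphereCollectionSDF:
    def __init__(self, n_spheres=40, center_min=-4.0, center_max=4.0, radius=0.5):
        # Sphere centers sampled uniformly at random in [center_min, center_max]^3.
        self.centers = center_min + (center_max - center_min) * torch.rand(n_spheres, 3)
        self.radius = radius

    def __call__(self, points):
        # Distance from each point to every sphere surface: ||p - c_i|| - r.
        d = torch.cdist(points, self.centers) - self.radius
        # Union of spheres = min over the individual sphere SDFs.
        return d.min(dim=-1, keepdim=True).values
```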

40 spheres rendered in different locations:

*(rendering)*

*(rendering)*

100 spheres:

*(rendering)*

4.2 Fewer Training Views

For this part, uncomment lines 125-128 in dataset.py. The number of training views can be set via num_views on line 125 of dataset.py.
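As a hypothetical illustration only (not the actual lines 125-128 of dataset.py), subsampling the training views might look like this:

```python
import torch

def subsample_views(all_images, all_poses, num_views=10):
    # Hypothetical sketch: keep num_views evenly spaced views
    # out of the full training set of V views.
    keep = torch.linspace(0, all_images.shape[0] - 1, num_views).long()
    return all_images[keep], all_poses[keep]
```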

SDF: python -m a4.main --config-name=volsdf

NeRF: python main.py --config-name=nerf_lego

| num_views = 10 | num_views = 20 |
| --- | --- |
| *(image)* | *(image)* |
| *(image)* | *(image)* |
| *(image)* | *(image)* |

First row: NeRF renderings. Second row: VolSDF renderings. Third row: VolSDF geometry.

We see that the surface-based VolSDF can learn from as few as 10 views, whereas NeRF fails completely.

4.3 Alternate SDF to Density Conversions

I implemented the 'naive' solution from the NeuS paper in the sdf_to_neus_density function inside a4/renderer.py.
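Concretely, the naive conversion sets the density to the logistic density of the SDF value,

$$ \sigma(x) = \phi_s(f(x)) = \frac{s\, e^{-s f(x)}}{\left(1 + e^{-s f(x)}\right)^2} $$

which peaks at the zero level set and sharpens as $s$ grows. A minimal sketch (the actual sdf_to_neus_density in a4/renderer.py may differ in interface):

```python
import torch

def sdf_to_neus_density(signed_distance, s=50.0):
    sig = torch.sigmoid(-s * signed_distance)
    # phi_s(d) = s * sigmoid(-s d) * (1 - sigmoid(-s d)), a numerically
    # stable form of s * exp(-s d) / (1 + exp(-s d))^2.
    return s * sig * (1.0 - sig)
```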

Here are the renderings and learned geometry for different values of the hyperparameter $s$:

| $s = 10$ | $s = 50$ | $s = 100$ |
| --- | --- | --- |
| *(rendering)* | *(rendering)* | *(rendering)* |
| *(geometry)* | *(geometry)* | *(geometry)* |

The images seem better than the VolSDF paper's renderings, but the learned geometry is more error-prone.