Homework Number Four

16-889 Learning for 3D Vision
Ben Kolligs

Andrew ID: bkolligs

Zero Late Days Used

Question 1

Here is the torus after my sphere tracing was completed. The specific algorithm for sphere tracing is: $$ \begin{align} &t \gets 0 \\ &p\gets x_0 \\ &\textrm{for } i=1:N \textrm{ do} \\ &\qquad d \gets f(p)\\ &\qquad t \gets t + d\\ &\qquad p \gets x_0 + t\,v\\ &m \gets f(p) < \epsilon \end{align} $$ For points $p$, mask $m$, signed distances $d$, ray origins $x_0$, unit ray directions $v$, max iterations $N$, and distance $t$ travelled along the ray. Note that instead of running for a fixed number of iterations $N$, you can stop once $d$ drops below the convergence threshold $\epsilon$.
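The loop above can be sketched in code as follows. This is a minimal NumPy version, where `sdf` stands in for any vectorized signed distance function and the batching convention is my own assumption:

```python
import numpy as np

def sphere_trace(sdf, origins, directions, n_iters=64, eps=1e-5):
    """Sphere-trace a batch of rays against an SDF.

    sdf: function mapping (N, 3) points to (N,) signed distances.
    origins, directions: (N, 3) ray origins and unit ray directions.
    Returns final points (N, 3) and a boolean hit mask (N,).
    """
    t = np.zeros(origins.shape[0])   # distance travelled along each ray
    points = origins.copy()
    for _ in range(n_iters):
        d = sdf(points)              # distance to surface = safe step size
        t = t + d
        points = origins + t[:, None] * directions
    mask = sdf(points) < eps         # converged rays hit the surface
    return points, mask

# Example: trace one ray from the origin toward a unit sphere at z = 3.
unit_sphere = lambda p: np.linalg.norm(p - np.array([0.0, 0.0, 3.0]), axis=-1) - 1.0
pts, mask = sphere_trace(unit_sphere,
                         np.zeros((1, 3)),
                         np.array([[0.0, 0.0, 1.0]]))
print(pts[0], mask[0])  # hits the surface at z = 2
```

The key property is that the SDF value at the current point is always a safe step size: stepping by $d$ can never overshoot the nearest surface.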

Sampled points

Question 2

For this question I used a similar architecture to the one from the last homework. The goal here is to output a signed distance for each input point.

  1. Harmonic Embedding $3 \rightarrow 39$
  2. Linear $ 39 \rightarrow 128$ and ReLU
  3. Linear $ 128 \rightarrow 128$ and ReLU
  4. Linear $ 128 \rightarrow 128$ and ReLU
  5. Linear $ 128 \rightarrow 128$ and ReLU
  6. Linear $ 128 \rightarrow 128$ and ReLU
  7. Linear $ 128 \rightarrow 1$
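The $3 \rightarrow 39$ harmonic embedding in step 1 can be reproduced with six sin/cos octaves plus the raw coordinates ($3 + 3 \cdot 2 \cdot 6 = 39$); the exact frequency convention is my assumption, but this matches the width above:

```python
import numpy as np

def harmonic_embedding(x, n_freqs=6):
    """Map (N, 3) points to (N, 3 + 3*2*n_freqs) features:
    the raw coordinates plus sin/cos at n_freqs octave frequencies."""
    feats = [x]
    for k in range(n_freqs):
        feats.append(np.sin(2.0 ** k * x))
        feats.append(np.cos(2.0 ** k * x))
    return np.concatenate(feats, axis=-1)

x = np.random.randn(4, 3)
print(harmonic_embedding(x).shape)  # (4, 39)
```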
Sampled points
My Eikonal loss took the form described in the paper: the mean squared difference between the gradient norm and $1$: $$ \mathcal{L} = \frac{1}{N}\sum_{i=1}^N \left( \| \nabla f_i \| - 1 \right)^2 $$
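Given the gradients at a batch of sampled points (obtained with `torch.autograd.grad` in the actual training loop), the loss itself is a one-liner; a NumPy sketch:

```python
import numpy as np

def eikonal_loss(grads):
    """Mean squared deviation of SDF gradient norms from 1.

    grads: (N, 3) gradients of the SDF at sampled points.
    """
    norms = np.linalg.norm(grads, axis=-1)
    return np.mean((norms - 1.0) ** 2)

# A true SDF has unit-norm gradients everywhere, so the loss is zero.
unit_grads = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(eikonal_loss(unit_grads))  # 0.0
```

This regularizer is what pushes the raw network output toward being a valid signed distance field rather than an arbitrary implicit function.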

Question 3

In order to render the volume, we need to go from the surface representation to a density representation. I implemented the formula in the VolSDF paper: $$ \begin{align} \sigma(x) &= \alpha \Psi_\beta(-d_\Omega(x)) \\ \Psi_\beta(s) &= \begin{cases} \frac{1}{2} \exp\left(\frac{s}{\beta}\right) & \textrm{if } s \leq 0 \\ 1 - \frac{1}{2} \exp\left(-\frac{s}{\beta}\right) & \textrm{if } s > 0 \end{cases} \end{align} $$ for signed distance function $d_\Omega$ and cumulative distribution function $\Psi_\beta$. This formula has two learnable parameters, $\alpha$ and $\beta$. $\alpha$ describes the scale of the CDF as an approximation of density. Keep in mind that since we negate the raw signed distance value, $s > 0$ corresponds to a point within the object.

$\beta$ describes the smoothing near the object's boundary. A high $\beta$ value makes the density "blurrier" by smoothing the transition across object boundaries. A low $\beta$ means that the density converges to a "scaled indicator function" of $d_\Omega$: the density decays rapidly to zero outside the object and approaches $\alpha$ (since $\Psi_\beta \to 1$) inside it. An SDF would be easier to train with a higher $\beta$, because the network is penalized less for "smearing" the boundaries between objects. The SDF would be more accurate with a low $\beta$, since the variance of the boundary is much smaller.

For the color network, I use the same architecture as the distance one, but with a different output dimension of $3$ and a Sigmoid activation function.

  1. Harmonic Embedding $3 \rightarrow 39$
  2. Linear $ 39 \rightarrow 128$ and ReLU
  3. Linear $ 128 \rightarrow 128$ and ReLU
  4. Linear $ 128 \rightarrow 128$ and ReLU
  5. Linear $ 128 \rightarrow 128$ and ReLU
  6. Linear $ 128 \rightarrow 128$ and ReLU
  7. Linear $ 128 \rightarrow 3$ and Sigmoid
Sampled points
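The SDF-to-density conversion above can be sketched as follows (NumPy, with $\alpha$ and $\beta$ taken as plain floats rather than learned parameters):

```python
import numpy as np

def sdf_to_density(d, alpha, beta):
    """VolSDF density: sigma(x) = alpha * Psi_beta(-d(x)), where
    Psi_beta is the CDF of a Laplace distribution with scale beta."""
    s = -d  # negate the signed distance: s > 0 means inside the object
    return alpha * np.where(
        s <= 0,
        0.5 * np.exp(s / beta),
        1.0 - 0.5 * np.exp(-s / beta),
    )

# At the surface (d = 0) the density is alpha / 2; deep inside it
# approaches alpha, and far outside it decays to zero.
print(sdf_to_density(np.array([0.0, -5.0, 5.0]), alpha=10.0, beta=0.1))
```

Note how a smaller `beta` sharpens the transition: the density plateaus at `alpha` almost immediately inside the surface, recovering the "scaled indicator function" limit.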

Question 4

Part 1

For this question, I decided to learn how to render multiple SDFs in one scene. I enjoyed this question, because SDFs offer a very nice way to render objects. The following scene has a total of 158 objects inside of it:

  • 12 tori
  • 144 spheres
  • 2 "links"
Sampled points
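Composing many SDFs into one scene is just a pointwise minimum over the per-object distances (the union operation from constructive solid geometry). My actual scene used 158 primitives; a minimal two-spheres-plus-torus sketch of the same trick:

```python
import numpy as np

def sphere_sdf(p, center, radius):
    """Signed distance from point(s) p to a sphere."""
    return np.linalg.norm(p - center, axis=-1) - radius

def torus_sdf(p, center, radii):
    """Signed distance to a torus in the xz-plane; radii = (major R, minor r)."""
    q = p - center
    xz = np.linalg.norm(q[..., [0, 2]], axis=-1) - radii[0]
    return np.sqrt(xz ** 2 + q[..., 1] ** 2) - radii[1]

def scene_sdf(p):
    """Union of several primitives: the scene distance is the
    minimum of the per-object distances."""
    parts = [
        sphere_sdf(p, np.array([0.0, 0.0, 0.0]), 0.5),
        sphere_sdf(p, np.array([2.0, 0.0, 0.0]), 0.5),
        torus_sdf(p, np.array([0.0, 2.0, 0.0]), (1.0, 0.25)),
    ]
    return np.minimum.reduce(parts)

print(scene_sdf(np.zeros(3)))  # inside the first sphere: -0.5
```

Because the minimum of valid SDFs is itself a valid (conservative) distance bound, the combined scene can be sphere-traced exactly like a single object.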

Part 2

I also wanted to try training the network on fewer images. This is the result of using 20 images, as the prompt suggests:

Sampled points Sampled points
We can see that the geometry is still pretty good, but the color has a lot of artifacts. The artifacts do have structure, though, as we rotate the view; it's possible the network has learned colors for pixels that lie outside the object mask.

Part 3

For the last part of Question 4 I decided to use the "naive" implementation of density from the NeuS paper, $$ \sigma_s(x) = \frac{se^{-sx}}{(1 + e^{-sx})^2}. $$ The $s$ here is inversely proportional to the standard deviation of this logistic density distribution. The paper notes that any unimodal density distribution centered at $0$ should work.
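This density is a symmetric bump centered on the zero level set, peaking at $s/4$ on the surface; a small sketch:

```python
import numpy as np

def logistic_density(x, s):
    """NeuS "naive" density: the logistic density with scale 1/s,
    evaluated at the signed distance x."""
    e = np.exp(-s * x)
    return s * e / (1.0 + e) ** 2

# The density peaks at the surface (x = 0) with value s / 4 and is
# symmetric about it, so it concentrates weight on the zero level set.
print(logistic_density(0.0, s=8.0))  # 2.0
```

Larger `s` (smaller standard deviation) concentrates the density more tightly around the surface, analogous to a small $\beta$ in the VolSDF formulation.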

Sampled points Sampled points
We can see an interesting "reflection" artifact underneath the model, and the SDF itself seems to have been corrupted. The result is certainly not as good as with the VolSDF density.