Daniel Bronstein

Assignment #4

3D Learning

0 late days used

1


Sphere tracing was implemented in vectorized form: each ray is marched iteratively from its origin along its direction by a distance equal to the value of the SDF at the ray's current point. For a well-behaved, proper SDF, this guarantees that the ray is never marched inside an object (assuming it did not start in one). Three conditions were checked for termination: the SDF value came within some epsilon of 0, the distance marched along the ray exceeded the maximum rendering distance, or the maximum number of iterations was reached. Rays that came within epsilon of an object were flagged in the mask as object intersections.
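The marching loop and its three termination checks can be sketched as follows. This is a minimal illustration in plain Python rather than the vectorized implementation used for the assignment: the inner loop over rays stands in for batched tensor operations, and the names sphere_sdf and sphere_trace, along with the default values of eps, far, and max_iters, are assumptions made for the example.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere (stand-in for the scene SDF)."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(p, center))) - radius


def sphere_trace(origins, directions, sdf, eps=1e-5, far=10.0, max_iters=64):
    """March each ray by the SDF value at its current point.

    Directions are assumed to be unit length. Returns (points, mask), where
    mask[i] is True if ray i came within eps of a surface.
    """
    n = len(origins)
    t = [0.0] * n          # distance marched along each ray
    hit = [False] * n      # intersection mask
    active = [True] * n    # rays that are still being marched

    for _ in range(max_iters):              # check 3: iteration cap
        if not any(active):
            break
        for i in range(n):
            if not active[i]:
                continue
            p = [o + t[i] * d for o, d in zip(origins[i], directions[i])]
            dist = sdf(p)
            if abs(dist) < eps:             # check 1: SDF within eps of 0
                hit[i] = True
                active[i] = False
            else:
                t[i] += dist                # safe step for a proper SDF
                if t[i] > far:              # check 2: past the far limit
                    active[i] = False

    points = [
        [o + ti * d for o, d in zip(org, dr)]
        for org, dr, ti in zip(origins, directions, t)
    ]
    return points, hit
```

For example, a ray starting at (0, 0, -3) and pointing along +z reaches the unit sphere at z = -1, while a ray from the same origin pointing along +y leaves the far limit without an intersection.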

2


The MLP used in this section closely resembles the one used for NeRF. A harmonic embedding of the input coordinate is used to capture higher-frequency details. The embedding is fed through 6 fully connected layers with 128 neurons each, and the network outputs a single value representing the SDF at the input location.
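As a rough illustration of the input encoding, the sketch below maps each coordinate to sines and cosines at doubling frequencies. The function name harmonic_embedding, the choice of 4 frequencies, and the omission of the raw coordinate from the output are assumptions for the example, not details taken from the trained model.

```python
import math


def harmonic_embedding(x, n_harmonics=4):
    """Embed each coordinate c as sin(2^k * c) and cos(2^k * c), k = 0..n_harmonics-1.

    Returns a flat list of length 2 * n_harmonics * len(x): all sines first,
    then all cosines.
    """
    freqs = [2.0 ** k for k in range(n_harmonics)]
    sines = [math.sin(f * c) for c in x for f in freqs]
    cosines = [math.cos(f * c) for c in x for f in freqs]
    return sines + cosines
```

With these assumed settings, a 3D point becomes a 24-dimensional vector, which would then be the input to the first of the fully connected layers.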
The Eikonal loss is used to guide the network to learn a true SDF. In a true SDF, the magnitude of the gradient is 1 everywhere. Thus, calculating the gradient of the estimated SDF and adding an MSE loss term that drives its magnitude towards 1 is an effective way of regularizing the network towards a true SDF.
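The loss term described above can be sketched on an analytic function. In this minimal example a central-difference gradient stands in for the automatic differentiation that a training framework would provide; the names numerical_gradient, eikonal_loss, and sphere_sdf, and the step size h, are assumptions for illustration.

```python
import math


def sphere_sdf(p, radius=1.0):
    """True SDF of a sphere at the origin; its gradient has unit norm."""
    return math.sqrt(sum(c * c for c in p)) - radius


def numerical_gradient(f, p, h=1e-4):
    """Central-difference gradient of f at p (stand-in for autograd)."""
    grad = []
    for axis in range(len(p)):
        hi = list(p)
        lo = list(p)
        hi[axis] += h
        lo[axis] -= h
        grad.append((f(hi) - f(lo)) / (2.0 * h))
    return grad


def eikonal_loss(f, points):
    """Mean squared deviation of the gradient norm from 1 over sample points."""
    total = 0.0
    for p in points:
        g = numerical_gradient(f, p)
        norm = math.sqrt(sum(c * c for c in g))
        total += (norm - 1.0) ** 2
    return total / len(points)
```

A true SDF gives a loss near 0, while a function that is not a true SDF is penalized: scaling the sphere SDF by 2 doubles the gradient norm, so the loss is close to 1.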

3

Estimating density from an SDF is possible through a transform parameterized by α and β. Here β represents the spread of the CDF used to model the density at the surface of the object, and α represents the highest value this density can take. Intuitively, these parameters model the density of true objects in the scene (α) and the smoothness of the density in the regions connecting them to free space (β).
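A minimal sketch of such a transform is given below. It assumes the VolSDF-style form in which the CDF is that of a zero-mean Laplace distribution with scale β, and it assumes the SDF is negative inside objects; the report does not state either choice explicitly. The function name sdf_to_density is hypothetical.

```python
import math


def sdf_to_density(d, alpha=10.0, beta=0.05):
    """Density from signed distance d (assumed negative inside the object).

    Computes alpha * Psi_beta(-d), where Psi_beta is the CDF of a zero-mean
    Laplace distribution with scale beta (an assumed choice of CDF).
    """
    s = -d
    if s <= 0:
        cdf = 0.5 * math.exp(s / beta)          # on or outside the surface
    else:
        cdf = 1.0 - 0.5 * math.exp(-s / beta)   # inside the object
    return alpha * cdf
```

Under these assumptions the density saturates at α deep inside an object, equals α/2 on the surface, and decays to 0 in free space at a rate set by β.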

Higher values of β lead to more sprawling learned geometries and incorrect hole closures, while lower values constrain the density more tightly to the actual object boundary and can capture finer details in the geometry.

Higher values of β would be easier to train with, since the edges of objects in the scene are more 'forgiving'. In the limit β → 0, the density derived from the SDF becomes a (scaled) indicator function for the object's location, so the SDF would need to be exactly correct to line up with the ground truth. However, with a strong estimate of the SDF, lower values of β will yield more faithful reconstructions.
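The limiting behavior can be written out explicitly. The following assumes the VolSDF-style density with a Laplace CDF and an SDF d(x) that is negative inside objects; neither choice is stated in the report, so this is a sketch under those assumptions.

```latex
\sigma(x) = \alpha\,\Psi_\beta\bigl(-d(x)\bigr), \qquad
\Psi_\beta(s) =
\begin{cases}
\tfrac{1}{2}\exp(s/\beta) & s \le 0 \\
1 - \tfrac{1}{2}\exp(-s/\beta) & s > 0
\end{cases}

\lim_{\beta \to 0^+} \sigma(x) =
\begin{cases}
\alpha & d(x) < 0 \\
\alpha/2 & d(x) = 0 \\
0 & d(x) > 0
\end{cases}
```

In this limit the density carries no information about how far a point in free space is from the surface, which is one way to see why small β is less forgiving during training.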

α = 10, β = 0.05

α = 5, β = 0.1


The values α = 10, β = 0.05 yielded the best results. These were selected because they were the default values in the training configuration and worked well. The geometry captures many scene details without too much sprawling, and the training converged well.

4a