To implement sphere tracing, I used an iterative ray-marching method; here is the pseudocode:
    while f(p) > epsilon:
        t = t + f(p)
        p = origin + t * directions
The reason we never cross the surface is that the distance from point p to the hit point along the ray is always at least the distance from p to the closest surface, which is exactly the SDF value f(p). Stepping by f(p) can therefore at most land on the surface, never overshoot it; the step lands exactly on the surface only when the ray happens to be aligned with the surface normal at the hit point, which is rare.
To compute the mask, I used the fact that a ray which does not hit the surface keeps advancing and will eventually march past our far threshold. So once the marched distance t becomes larger than this threshold, I mark that ray as False in the mask.
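Below is a minimal batched sketch of this procedure in PyTorch; the function name `sphere_trace`, the tensor shapes, and the iteration cap are my own assumptions rather than the exact starter-code interface.

```python
import torch

def sphere_trace(sdf, origins, directions, near=0.0, far=10.0,
                 epsilon=1e-5, max_iters=64):
    """Batched sphere-tracing sketch (shapes and defaults are assumptions).

    origins, directions: (N, 3) ray origins and unit directions.
    Returns marched points (N, 3) and a boolean hit mask (N,).
    """
    t = torch.full(origins.shape[:1], near,
                   dtype=origins.dtype, device=origins.device)
    points = origins + t[:, None] * directions

    for _ in range(max_iters):
        dist = sdf(points).squeeze(-1)          # distance to the closest surface
        active = (dist > epsilon) & (t < far)   # rays still marching
        if not active.any():
            break
        # step forward by the SDF value; this never crosses the surface
        t = torch.where(active, t + dist, t)
        points = origins + t[:, None] * directions

    mask = t < far                              # rays that marched past far are misses
    return points, mask
```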
The input to the neural network is a point cloud sampled on the object surface, and the output is the signed distance with respect to the object. Therefore, the ground-truth output for these surface points is simply zero. In my implementation, I used an MLP with 6 hidden layers and 128 neurons each.
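A sketch of such a network is shown below, assuming plain Linear+ReLU layers and raw 3D coordinates as input; the real implementation may additionally use positional encoding, skip connections, or geometric initialization.

```python
import torch.nn as nn

class SDFMLP(nn.Module):
    """Coordinate-based SDF network: 6 hidden layers, 128 units each (a sketch)."""
    def __init__(self, hidden_dim=128, num_hidden=6):
        super().__init__()
        layers = [nn.Linear(3, hidden_dim), nn.ReLU()]
        for _ in range(num_hidden - 1):
            layers += [nn.Linear(hidden_dim, hidden_dim), nn.ReLU()]
        layers += [nn.Linear(hidden_dim, 1)]   # scalar signed distance
        self.net = nn.Sequential(*layers)

    def forward(self, x):                      # x: (N, 3) points
        return self.net(x)                     # (N, 1) signed distances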
The eikonal loss enforces the gradient of the SDF with respect to $x$ to have unit norm, acting as a regularization term. Mathematically:

$$
\mathcal{L}_{\text{eikonal}} = \mathbb{E}_{x}\Big[\big(\lVert \nabla_x f(x) \rVert_2 - 1\big)^2\Big]
$$
This loss encourages the coordinate-based network to predict the distance to the closest surface instead of arbitrary values for points in space.
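A minimal PyTorch sketch of this regularizer is shown below; the function name and the way sample points are chosen are assumptions.

```python
import torch

def eikonal_loss(model, points):
    """Eikonal regularizer: penalize (||grad_x f(x)|| - 1)^2 at sampled points.
    `points` are typically random points in the bounding volume (an assumption)."""
    points = points.clone().requires_grad_(True)
    sdf = model(points)
    grad = torch.autograd.grad(
        outputs=sdf,
        inputs=points,
        grad_outputs=torch.ones_like(sdf),
        create_graph=True,                 # keep the graph so the loss is trainable
    )[0]                                   # (N, 3) gradient of the SDF w.r.t. x
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```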
Point cloud used for training:
Prediction of my network:
Signed distance function to density:
Intuitive explanation of what the parameters $\alpha$ and $\beta$ are doing.
$\alpha$ controls the overall density magnitude; it is a scale factor in front of the function. $\beta$ controls the amount of smoothing, i.e., how sensitive the density is to changes in distance: as $\beta$ approaches zero, the density close to the surface changes dramatically.
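For reference, the VolSDF paper defines this conversion as $\sigma(x) = \alpha\,\Psi_\beta\big(-d(x)\big)$, where $\Psi_\beta$ is the CDF of a zero-mean Laplace distribution with scale $\beta$. A minimal PyTorch sketch, assuming the convention that $d < 0$ inside the object and $d > 0$ outside:

```python
import torch

def sdf_to_density(signed_distance, alpha, beta):
    """VolSDF-style density: sigma = alpha * Psi_beta(-d), with Psi_beta the
    CDF of a zero-mean Laplace distribution of scale beta.
    Assumes d < 0 inside the object and d > 0 outside."""
    s = -signed_distance
    laplace_cdf = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),          # outside: density decays toward 0
        1.0 - 0.5 * torch.exp(-s / beta),   # inside: density saturates at alpha
    )
    # (a real implementation would clamp the exponent for numerical stability)
    return alpha * laplace_cdf
```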
How does high $\beta$ bias your learned SDF? What about low $\beta$?
A high $\beta$ makes the density less sensitive near the surface, so the rendering looks more blurred. On the contrary, a low $\beta$ makes the density highly sensitive near the surface, so the surface is represented and extracted more accurately.
Would an SDF be easier to train with volume rendering and low $\beta$ or high $\beta$? Why?
There is no single answer: the right $\beta$ is what makes the SDF easier to train. Generally, a high $\beta$ makes training and volume rendering more stable, since the sharp transition from empty to occupied space is smoothed out; however, the exact surface is then not well extracted or optimized. On the other hand, a low $\beta$ leaves only a very thin margin around the surface with nonzero density, so volume rendering and training may be harder. A good $\beta$ therefore lies somewhere in between.
Would you be more likely to learn an accurate surface with high $\beta$ or low $\beta$? Why?
An accurate surface is more likely to be learned with a low $\beta$, because in that case the density function converges to a scaled indicator function: the density jumps sharply from roughly $\alpha$ inside to zero outside, so the learned geometry is forced to concentrate exactly at the surface.
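Concretely, taking the Laplace-CDF density from above, the low-$\beta$ limit is

$$
\lim_{\beta \to 0} \alpha\,\Psi_\beta\big(-d(x)\big) = \alpha \cdot \mathbf{1}\big[d(x) < 0\big],
$$

i.e., a constant density $\alpha$ inside the object and zero outside, with the entire transition happening at the surface.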
Model rendered with the best parameters I chose:
In my implementation, I randomly chose 20 images for training; VolSDF was trainable and showed fair results:
As a comparison, using the same set of training samples, NeRF was not able to converge and produced empty renderings:
To check the minimum number of images NeRF needs to generate reasonable results, I gradually increased the number of training views to 85 and rendered under the same parameter settings.
We can conclude that VolSDF needs fewer training views than NeRF. The reason could be that VolSDF trains an implicit SDF and only afterwards converts it to density, so the amount of information that must be learned directly from the images is much smaller.
In Q3, we used the equations from VolSDF Paper to convert SDF to density. You should try and compare alternate ways of doing this e.g. the ‘naive’ solution from the NeuS paper, or any other ways that you might want to propose!
In this section, I used the 'naive' solution from the NeuS paper. The implementation is in `sdf_to_density_naive`.
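As I understand it, this 'naive' solution sets the density to the logistic density of the SDF value, $\sigma(x) = \phi_s(d(x)) = \dfrac{s\,e^{-s d(x)}}{\big(1 + e^{-s d(x)}\big)^2}$, which peaks at the zero level set. A minimal sketch of `sdf_to_density_naive` under that assumption:

```python
import torch

def sdf_to_density_naive(signed_distance, s=150.0):
    """'Naive' NeuS-style conversion (a sketch): density = phi_s(d), the
    logistic density with scale parameter s. Written via sigmoids for
    numerical stability: phi_s(d) = s * sigmoid(s*d) * (1 - sigmoid(s*d))."""
    sig = torch.sigmoid(s * signed_distance)
    return s * sig * (1.0 - sig)
```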
I chose the scale parameter $s = 150$, which gives me the following results: