Jianchun Chen jianchuc@andrew.cmu.edu
Box center: 0.25, 0.25, 0.00
Box side lengths: 2.01, 1.50, 1.50
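The reported center and side lengths define an axis-aligned bounding box; a minimal sketch of recovering its min/max corners (values copied from above):

```python
# Axis-aligned bounding box from the reported center and side lengths.
center = (0.25, 0.25, 0.00)
sides = (2.01, 1.50, 1.50)

# min/max corner = center -/+ half the side length along each axis
min_corner = tuple(c - s / 2 for c, s in zip(center, sides))
max_corner = tuple(c + s / 2 for c, s in zip(center, sides))
print([round(v, 3) for v in min_corner])  # [-0.755, -0.5, -0.75]
print([round(v, 3) for v in max_corner])  # [1.255, 1.0, 0.75]
```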
NeRF without ray-direction embedding, and with 2, 4, 8, and 32 frequencies of ray-direction embedding, is shown below from left to right.
Ideally, more directional-embedding frequencies help the network predict view-dependent color, such as specular highlights. With too many embedding dimensions, however, the network may overfit the training images and fail to generalize to novel views. In practice, though, I do not observe a major difference between 0, 2, 4, 8, and 32 frequencies of ray-direction embedding.
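The embedding varied in this experiment can be sketched as the standard NeRF sinusoidal positional encoding applied to ray directions; `n_freqs` is the knob being swept (0, 2, 4, 8, 32). The function name and exact frequency schedule are illustrative assumptions, not the project's actual code:

```python
import numpy as np

def embed_directions(dirs, n_freqs, include_input=True):
    """Sinusoidal encoding of ray directions.

    dirs: (N, 3) unit ray directions.
    Returns (N, 3 + 6 * n_freqs) features (assuming include_input=True).
    """
    feats = [dirs] if include_input else []
    for k in range(n_freqs):
        freq = 2.0 ** k  # frequencies 1, 2, 4, ... (one common convention)
        feats.append(np.sin(freq * np.pi * dirs))
        feats.append(np.cos(freq * np.pi * dirs))
    return np.concatenate(feats, axis=-1)

dirs = np.array([[0.0, 0.0, 1.0]])
print(embed_directions(dirs, n_freqs=4).shape)  # (1, 27)
```

With `n_freqs=0` this degenerates to passing the raw direction through, matching the "w/o ray direction embedding" baseline.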
I render the high-resolution Lego scene below. The first GIF uses the default parameters, the second adds 4 more MLP layers to encode the xyz position, and the third samples 80 points per ray instead of the default 128.
Overall, adding layers increases the model's expressive capacity, but a network that is too deep breaks the smoothness of the implicit function: the second image looks noticeably less smooth. Training with more sampled points also helps, but it requires significantly more GPU memory at inference time (e.g. 256 points for a 400*400 image). Sampling 80 points per ray in image 3 still gives a fair rendering result, while with 64 points training fails.
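The per-ray sampling compared above can be sketched as stratified sampling along each ray. The function name, `near`/`far` bounds, and jitter scheme are assumptions for illustration; `n_samples` is the parameter swept (64 / 80 / 128):

```python
import numpy as np

def sample_points_along_rays(origins, dirs, near, far, n_samples, rng=None):
    """Stratified sampling of n_samples depths per ray.

    origins, dirs: (N, 3) ray origins and directions.
    Returns points (N, n_samples, 3) and depths t (N, n_samples).
    """
    rng = rng or np.random.default_rng(0)
    # Split [near, far] into n_samples equal bins.
    edges = np.linspace(near, far, n_samples + 1)
    lower, upper = edges[:-1], edges[1:]
    # Jitter one depth uniformly inside each bin (stratified sampling).
    t = lower + (upper - lower) * rng.random((origins.shape[0], n_samples))
    points = origins[:, None, :] + t[..., None] * dirs[:, None, :]
    return points, t

origins = np.zeros((2, 3))
dirs = np.tile(np.array([0.0, 0.0, 1.0]), (2, 1))
points, t = sample_points_along_rays(origins, dirs, near=2.0, far=6.0, n_samples=80)
print(points.shape, t.shape)  # (2, 80, 3) (2, 80)
```

Memory grows linearly in `n_samples` because every sampled point is a separate MLP query, which is why 256 points on a 400*400 image is expensive at render time.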