Name: Meghana Reddy Ganesina
Andrew ID: mganesin
Command to run: python main.py --config-name=box
Grid Visualization | Ray Visualization |
---|---|
![]() | ![]() |
Command to run: python main.py --config-name=box
Sample points |
---|
![]() |
Command to run: python main.py --config-name=box
Color Map | Depth Map |
---|---|
![]() | ![]() |
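The color and depth maps come from compositing per-sample densities and colors along each ray. The report does not show that step, so the following is only a minimal sketch of standard emission-absorption compositing for a single ray. The function name `render_ray`, the padding value for the last segment, and the toy inputs are assumptions, not the assignment's renderer.

```python
import torch

def render_ray(density: torch.Tensor, color: torch.Tensor, depth: torch.Tensor):
    """Composite one ray (hypothetical helper).

    density: (N,) non-negative densities at the samples
    color:   (N, 3) RGB at the samples
    depth:   (N,) sample depths, sorted in ascending order
    """
    # Segment lengths between consecutive samples; the last one is padded (assumed value)
    deltas = torch.cat([depth[1:] - depth[:-1], torch.tensor([1e10])])
    # Opacity of each segment
    alpha = 1.0 - torch.exp(-density * deltas)
    # Transmittance T_i = prod_{j<i} (1 - alpha_j): light surviving up to sample i
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    weights = trans * alpha
    rgb = (weights[:, None] * color).sum(dim=0)   # expected color
    z = (weights * depth).sum()                   # expected depth
    return rgb, z, weights

# Toy ray: empty space except for one very dense sample at depth 3
density = torch.tensor([0.0, 0.0, 100.0, 0.0])
color = torch.tensor([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0],
                      [1.0, 1.0, 1.0]])
depth = torch.tensor([1.0, 2.0, 3.0, 4.0])
rgb, z, weights = render_ray(density, color, depth)
```

In this toy case almost all of the weight falls on the dense sample, so the rendered color is that sample's color and the rendered depth is close to 3.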
Code:
sample_coordinates = torch.randint(0, xy_grid.shape[0], (n_pixels,))
xy_grid_sub = xy_grid[sample_coordinates,:]
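For reference, here is a self-contained sketch of the subsampling step above. It assumes `xy_grid` is an `(H*W, 2)` tensor of pixel coordinates; the 4x4 image size and the helper name `sample_pixels` are hypothetical. Note that `torch.randint` samples with replacement, so the same pixel can be picked twice; `torch.randperm` is shown as a without-replacement alternative.

```python
import torch

def sample_pixels(xy_grid: torch.Tensor, n_pixels: int, replace: bool = True) -> torch.Tensor:
    """Pick n_pixels rows from an (N, 2) grid of pixel coordinates (hypothetical helper)."""
    n_total = xy_grid.shape[0]
    if replace:
        # Same approach as the snippet above: indices may repeat
        idx = torch.randint(0, n_total, (n_pixels,))
    else:
        # Alternative: a random permutation guarantees distinct pixels
        idx = torch.randperm(n_total)[:n_pixels]
    return xy_grid[idx, :]

# Hypothetical 4x4 image with coordinates in [-1, 1]
H, W = 4, 4
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
xy_grid = torch.stack([xs, ys], dim=-1).reshape(-1, 2)

xy_grid_sub = sample_pixels(xy_grid, n_pixels=8)
xy_grid_all = sample_pixels(xy_grid, n_pixels=16, replace=False)
```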
Code:
loss = torch.mean(torch.square(rgb_gt - out['feature']))
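The loss above is a mean squared error between ground-truth and predicted colors. As a small sanity check, the sketch below compares the manual form with `torch.nn.functional.mse_loss`; the random tensors and the name `rgb_pred` are stand-ins for `rgb_gt` and `out['feature']`, not real data.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
rgb_gt = torch.rand(1024, 3)     # stand-in for ground-truth pixel colors
rgb_pred = torch.rand(1024, 3)   # stand-in for out['feature']

# Manual form, as written in the report
loss_manual = torch.mean(torch.square(rgb_gt - rgb_pred))
# Built-in form with the default 'mean' reduction
loss_builtin = F.mse_loss(rgb_pred, rgb_gt)
```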
Box center: (0.25, 0.25, 0.00)
Box side lengths: (2.00, 1.50, 1.50)
Command to run: python main.py --config-name=train_box
Spiral Sequence of optimized Volume |
---|
![]() |
Command to run: python main.py --config-name=nerf_lego
I used the same architecture as the one described in the NeRF paper.
Architecture:
# MLP head for density
class DensityNet(torch.nn.Module):
    def __init__(self, cfg):
        super().__init__()
        # Single linear layer; the ReLU keeps the predicted density non-negative
        self.model = torch.nn.Sequential(
            torch.nn.Linear(cfg.n_hidden_neurons_xyz, 1),
            torch.nn.ReLU()
        )

    def forward(self, x):
        x = self.model(x)
        return x
# MLP head for color
class RgbNet(torch.nn.Module):
    def __init__(self, cfg, direction, embedding_dim_dir):
        super().__init__()
        self.direction = direction
        layers = []
        # With view dependence, the first layer also takes the embedded direction
        if self.direction:
            linear = torch.nn.Linear(cfg.n_hidden_neurons_xyz + embedding_dim_dir,
                                     cfg.n_hidden_neurons_dir)
        else:
            linear = torch.nn.Linear(cfg.n_hidden_neurons_xyz,
                                     cfg.n_hidden_neurons_dir)
        layers.append(torch.nn.Sequential(linear, torch.nn.ReLU(True)))
        linear = torch.nn.Linear(cfg.n_hidden_neurons_dir, 3)
        # Sigmoid keeps the RGB output in [0, 1]
        layers.append(torch.nn.Sequential(linear, torch.nn.Sigmoid()))
        self.model = torch.nn.ModuleList(layers)

    def forward(self, x, d):
        y = x
        for i, layer in enumerate(self.model):
            # Concatenate the direction embedding only before the first layer
            if i == 0 and self.direction:
                y = torch.cat((y, d), dim=-1)
            y = layer(y)
        return y
class NerfModel(torch.nn.Module):
    def __init__(self, cfg, input_dim, input_dir_dim):
        super().__init__()
        self.direction = cfg.direction
        self.input_dim = input_dim
        self.embedding_dim_dir = input_dir_dim
        # Shared trunk over the embedded position, with skip connections
        self.mlp = MLPWithInputSkips(n_layers=cfg.n_layers_xyz,
                                     input_dim=self.input_dim,
                                     output_dim=cfg.n_hidden_neurons_xyz,
                                     skip_dim=self.input_dim,
                                     hidden_dim=cfg.n_hidden_neurons_xyz,
                                     input_skips=cfg.append_xyz)
        self.fc1 = torch.nn.Linear(cfg.n_hidden_neurons_xyz, cfg.n_hidden_neurons_xyz)
        # DensityNet reads its layer width from cfg, so cfg must be passed in
        self.densitynet = DensityNet(cfg)
        self.rgbnet = RgbNet(cfg, self.direction, self.embedding_dim_dir)

    def forward(self, x, d):
        # The embedded position is both the input and the skip input
        x = self.mlp(x, x)
        x = self.fc1(x)
        density = self.densitynet(x)
        feature = self.rgbnet(x, d)
        out = {'density': density,
               'feature': feature}
        return out
The default config file provided with the assignment was used unchanged, except for the following parameters:
n_layers_xyz: 8
append_xyz: [4]
n_hidden_neurons_xyz: 256
n_hidden_neurons_dir: 128
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
Command to run: python main.py --config-name=nerf_lego_direction
Tradeoffs between increased view dependence and generalization quality:
I followed the NeRF paper to implement view dependence. As the paper notes, view dependence helps render specularities. However, increasing view dependence lets the model overfit to the training views, whose ground truth contains view-specific specularities, so unseen views can be rendered with unexplained artifacts. Increasing view dependence therefore tends to reduce generalization quality.
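To make the view-dependent input concrete, the sketch below embeds unit view directions with sine and cosine harmonics and concatenates the result with per-point features, mirroring what `RgbNet.forward` does with `d`. The function `harmonic_embedding`, the frequency choice of powers of two, and all sizes are assumptions for illustration; the embedding used in the assignment code may differ.

```python
import torch

def harmonic_embedding(x: torch.Tensor, n_harmonics: int) -> torch.Tensor:
    """Map (..., D) inputs to (..., 2 * D * n_harmonics) sin/cos features (hypothetical helper)."""
    # Assumed frequencies: 2^0, 2^1, ..., 2^(n_harmonics - 1)
    freqs = 2.0 ** torch.arange(n_harmonics, dtype=x.dtype)
    angles = (x[..., None] * freqs).reshape(*x.shape[:-1], -1)
    return torch.cat([angles.sin(), angles.cos()], dim=-1)

n_rays, n_feat, n_harmonics = 5, 256, 4   # hypothetical sizes
d = torch.nn.functional.normalize(torch.randn(n_rays, 3), dim=-1)  # unit view directions
d_emb = harmonic_embedding(d, n_harmonics)       # (5, 24)
features = torch.randn(n_rays, n_feat)           # stand-in for the trunk output
rgb_input = torch.cat((features, d_emb), dim=-1) # input to the first color layer
```

A larger direction embedding gives the color head more capacity to vary with viewpoint, which is the knob behind the tradeoff described above.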
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
Command to run: python main.py --config-name=nerf_lego_highres
The following results were obtained by setting the n_pts_per_ray parameter to 32, 64, 128 and 256. As n_pts_per_ray increases, the renderings show more high-frequency detail and a less smoothed appearance.
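One way to see why more samples can recover finer detail is to look at the spacing between samples along a ray. The sketch below assumes evenly spaced samples between a near and a far bound; the bounds `2.0` and `6.0` and the helper `sample_depths` are hypothetical, and the assignment's sampler may work differently.

```python
import torch

def sample_depths(near: float, far: float, n_pts_per_ray: int) -> torch.Tensor:
    """Evenly spaced depths along a ray (hypothetical helper)."""
    return torch.linspace(near, far, n_pts_per_ray)

near, far = 2.0, 6.0  # assumed scene bounds
spacings = {}
for n in (32, 64, 128, 256):
    z = sample_depths(near, far, n)
    # Distance between neighboring samples: (far - near) / (n - 1)
    spacings[n] = (z[1] - z[0]).item()
    print(f"n_pts_per_ray={n:3d}  spacing={spacings[n]:.4f}")
```

Each doubling of n_pts_per_ray roughly halves the spacing, so thin structures are less likely to fall between samples, at the cost of proportionally more network evaluations per ray.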
a) n_pts_per_ray = 32
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
b) n_pts_per_ray = 64
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
c) n_pts_per_ray = 128
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
d) n_pts_per_ray = 256
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
The following results were obtained by setting the n_layers_xyz parameter to 3, 5 and 8. In all cases the model learns to render the Lego at good resolution within 50 epochs. There is no noticeable difference across the n_layers_xyz values.
a) n_layers_xyz = 3 and append_xyz = 2
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
b) n_layers_xyz = 5 and append_xyz = 3
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
c) n_layers_xyz = 8 and append_xyz = 4
Epoch 10 | Epoch 50 | Epoch 100 | Epoch 250 |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |