1 late day used. Note: one "bells and whistles" item, using the differentiable augmentation library, is included in part 1.

Implement Data Augmentation

Implement the Discriminator

From pytorch docs ( https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html)

layers 1-4

h = ⌊(𝑛ℎ+2×𝑝ℎ−𝑑ℎ×(𝑘ℎ−1)−1)/𝑠ℎ+1⌋

32 = ⌊(64+2×𝑝ℎ−1×(4-1)-1)/2+1⌋

32 = ⌊(64+2×𝑝ℎ−3-1)/2+1⌋

32 = ⌊(60+2×𝑝ℎ)/2+1⌋

31 = (60+2×𝑝ℎ)/2

62 = (60+2×𝑝ℎ)

2 = 2×𝑝ℎ

1 = 𝑝ℎ

layer 5

h = ⌊(𝑛ℎ+2×𝑝ℎ−𝑑ℎ×(𝑘ℎ−1)−1)/𝑠ℎ+1⌋

1 = ⌊(4+2×𝑝ℎ−1×(4-1)-1)/2+1⌋

1 = ⌊(4+2×𝑝ℎ−3-1)/2+1⌋

1 = ⌊(0+2×𝑝ℎ)/2+1⌋

0 = (0+2×𝑝ℎ)/2

0 = (0+2×𝑝ℎ)

0 = 𝑝ℎ
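These padding values can be sanity-checked directly with `torch.nn.Conv2d`. The channel counts below are placeholders (the exact widths depend on the assignment's architecture); only the spatial dimensions matter for this check:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 64, 64)  # placeholder: 3-channel 64x64 input

# Layers 1-4: kernel 4, stride 2, padding 1 halves the spatial size each time.
for in_c, out_c in [(3, 32), (32, 64), (64, 128), (128, 256)]:
    x = nn.Conv2d(in_c, out_c, kernel_size=4, stride=2, padding=1)(x)
    print(x.shape[-1])  # 32, 16, 8, 4

# Layer 5: kernel 4, stride 2, padding 0 takes the 4x4 map down to 1x1.
x = nn.Conv2d(256, 1, kernel_size=4, stride=2, padding=0)(x)
print(x.shape)  # torch.Size([1, 1, 1, 1])
```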

Implement Generator

From https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html Hout=(Hin−1)×stride−2×padding+dilation×(kernel_size−1)+output_padding+1

layer 1

4 = (1-1)×2-2×padding+1×3+0+1

4 = 0-2×padding+1×3+0+1

4 = -2×padding+4

0 = -2×padding

0 = padding

layers 2-5

8 = (4-1)×2-2×padding+1×3+0+1

8 = 6-2×padding+1×3+0+1

8 = 6-2×padding+3+1

-2 = -2×padding

1 = padding
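Analogously, the transposed-convolution padding values can be verified with `torch.nn.ConvTranspose2d` (the noise dimension and channel widths below are placeholders):

```python
import torch
import torch.nn as nn

z = torch.randn(1, 100, 1, 1)  # placeholder: 100-dim noise vector as a 1x1 map

# Layer 1: kernel 4, stride 2, padding 0 maps 1x1 -> 4x4.
h = nn.ConvTranspose2d(100, 256, kernel_size=4, stride=2, padding=0)(z)
print(h.shape[-1])  # 4

# Layers 2-5: kernel 4, stride 2, padding 1 doubles the spatial size each time.
for in_c, out_c in [(256, 128), (128, 64), (64, 32), (32, 3)]:
    h = nn.ConvTranspose2d(in_c, out_c, kernel_size=4, stride=2, padding=1)(h)
    print(h.shape[-1])  # 8, 16, 32, 64
```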

Basic Augmentation

Deluxe Augmentation

Deluxe + Diff Data Aug
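The idea behind differentiable augmentation is to apply the same kind of differentiable transform to both real and generated images before they reach the discriminator, so gradients flow back through the augmentation into the generator. The assignment uses the DiffAugment library; the standalone random-translation policy below is only an illustrative sketch of the mechanism:

```python
import torch
import torch.nn.functional as F

def rand_translation(x, ratio=0.125):
    # Shift the batch by a random (dx, dy), zero-padding the exposed border.
    # F.pad and slicing are both differentiable in x, so generator
    # gradients pass through the augmentation unchanged.
    max_shift = int(x.size(-1) * ratio)
    dx, dy = torch.randint(-max_shift, max_shift + 1, (2,)).tolist()
    x = F.pad(x, (max(dx, 0), max(-dx, 0), max(dy, 0), max(-dy, 0)))
    h, w = x.size(-2), x.size(-1)
    return x[..., max(-dy, 0):h - max(dy, 0), max(-dx, 0):w - max(dx, 0)]

# Usage sketch inside a discriminator update (D, G, z are placeholders):
#   d_real = D(rand_translation(real_images))
#   d_fake = D(rand_translation(G(z)))
```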

DCGAN Comments

  1. What should the training curve look like?

The discriminator loss should start high, since the weights are random. It should then decrease as the discriminator learns to distinguish real from fake. Eventually it should increase or stabilize as the generator improves. The generator loss should also start high and be unstable, especially early in training. Eventually it may stabilize and settle into a stalemate with the discriminator. With differentiable augmentation, we hope that both losses will be more stable.

  2. Do these curves exhibit this behavior?

It is difficult to tell, due to the instability. However, the differentiable data augmentation curve is much more stable (it also seems to take longer to reach a minimum).

  3. What do you notice about the samples?

Due to the instability, samples without differentiable data augmentation can be quite poor, even after training has converged. Early samples seem to focus on getting colors into the correct places; details start to appear later. For instance, the nose is basically missing in early iterations.

CycleGAN

Training without cycle consistency

Training with cycle consistency

  1. Do you see a difference?

The predicted Russian Blue looks much better with cycle consistency. The predicted Grumpy Cat also seems more interesting (less constant) with cycle consistency.

  2. Why?

For the first observation, the Russian Blue looks better because cycle consistency lets the Grumpy Cat images contribute more information to training the Grumpy-to-Russian-Blue mapping, which helps given how few examples we have in total. For the second observation, cycle consistency requires the generated Grumpy Cat images to be able to represent the greater pose variety of the Russian Blue images. This results in generated Grumpy Cats with matching pose variety.
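Both observations stem from the cycle-consistency term, which penalizes the round trip X → Y → X (and Y → X → Y), typically with an L1 reconstruction loss. A minimal sketch, assuming two generator modules with made-up names (`G_XtoY`, `G_YtoX`) and a hypothetical weight `lam`, not the assignment's exact API:

```python
import torch
import torch.nn as nn

def cycle_consistency_loss(G_XtoY, G_YtoX, real_X, real_Y, lam=10.0):
    # Translate each batch to the other domain and back; the round trip
    # should reconstruct the original images.
    l1 = nn.L1Loss()
    loss_X = l1(G_YtoX(G_XtoY(real_X)), real_X)  # X -> Y -> X
    loss_Y = l1(G_XtoY(G_YtoX(real_Y)), real_Y)  # Y -> X -> Y
    return lam * (loss_X + loss_Y)
```

With identity generators the loss is exactly zero, which is a handy sanity check that the term only punishes round-trip error.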