Looking at the past year, the biggest innovation in the GAN space might have been the BigGAN architecture.
The mega-architecture, trained on 512 TPUs in the space of two days, also includes a sweatshirt category. I had to look into it – would I be able to get visually pleasing 512x512px results? I was in high-res heaven, but what I found was grey, mainstream and honestly a little boring.
Let’s talk about the upsides first: BigGAN is close to how I imagine modern research needs to be. The claims are big and, for the average person, not reproducible. Renting these TPUs would cost you $59,000, so unless you work at Google, you won’t train this on your local machine. What they did instead is use Colaboratory together with TF Hub to host a demo, which makes it possible to go through a bunch of ImageNet classes even on your phone.
Also: the model comes in 128x128, 256x256 and 512x512px resolution. Whenever I train my own networks, my GPU maxes out somewhere around 256px, so it’s nice having these high-res results.
I train my networks on a collection of fashion shoots and I try to make the clothing pieces as diverse as possible. The results are colorful, weird, special. ImageNet, the dataset BigGAN was trained on, mostly contains more mainstream designs, some of them photographed in the wild. The result? Neutral colors, a centered logo as an eye-catcher, and some funny faces. Have a look!
That being said, there is an option to interpolate between images (e.g. mix our sweatshirts with animals) that I’m more than excited to try out!
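For the curious, here is a minimal sketch of what such an interpolation could look like on the input side. BigGAN is conditioned on a class vector over the 1,000 ImageNet classes, so one simple way to mix two categories is to blend their one-hot vectors step by step and generate one image per step. The helper names, the number of steps and the two class indices below are assumptions made for illustration (check them against an ImageNet label list), and the sketch only builds the class vectors; it does not call the generator itself.

```python
NUM_CLASSES = 1000  # ImageNet-1k, the label space BigGAN is conditioned on


def one_hot(index, num_classes=NUM_CLASSES):
    """Return a one-hot class vector as a list of floats."""
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


def lerp(a, b, t):
    """Linearly blend two equal-length vectors; t=0 gives a, t=1 gives b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_classes(idx_a, idx_b, steps, num_classes=NUM_CLASSES):
    """Build `steps` class vectors that fade from class idx_a to class idx_b."""
    if steps < 2:
        raise ValueError("need at least two steps (start and end)")
    a = one_hot(idx_a, num_classes)
    b = one_hot(idx_b, num_classes)
    return [lerp(a, b, i / (steps - 1)) for i in range(steps)]


# Assumed indices for illustration only; verify against your label list.
SWEATSHIRT = 841
ANIMAL = 207

frames = interpolate_classes(SWEATSHIRT, ANIMAL, steps=5)
```

Each entry in `frames` would then be fed to the generator as the class input, together with a noise vector. Keeping the noise vector fixed across all steps should make the transition look smoother, since only the category changes from frame to frame.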