- Data augmentation
  - Crop patches from the images in each batch
  - Add colour jitter to the patches
- Within each batch, sample positive and negative pairs
  - Patches from the same image are positives
  - All other patches are negatives
- Compute the loss on the output of an MLP layer instead of on the bottleneck embedding directly
  - This MLP is a head network: a function of the bottleneck embedding
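
The in-batch pairing above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the notes do not name the loss, so an InfoNCE-style softmax over cosine similarities is assumed, and `temperature`, `positive_mask` and `contrastive_loss` are hypothetical names. The vectors passed in stand for the outputs of the head network, not the bottleneck embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def positive_mask(image_ids):
    """mask[i][j] is True when patches i and j (i != j) come from the same image."""
    n = len(image_ids)
    return [[i != j and image_ids[i] == image_ids[j] for j in range(n)]
            for i in range(n)]


def contrastive_loss(head_outputs, image_ids, temperature=0.1):
    """InfoNCE-style loss (an assumption; the notes do not name the loss)."""
    n = len(head_outputs)
    if n < 2:
        return 0.0
    z = [l2_normalize(v) for v in head_outputs]
    mask = positive_mask(image_ids)
    total, count = 0.0, 0
    for i in range(n):
        # Scaled cosine similarity of patch i to every other patch in the batch.
        logits = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                  for j in range(n) if j != i}
        # Numerically stable log-sum-exp over positives and negatives alike.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in logits.values()))
        for j, s in logits.items():
            if mask[i][j]:
                total += log_denom - s  # -log softmax probability of the positive
                count += 1
    return total / count if count else 0.0


# Two images, two patches each; the 2-D vectors stand in for head outputs.
ids = [0, 0, 1, 1]
outputs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(contrastive_loss(outputs, ids))
```

The loss is low when patches from the same image have similar head outputs and high when they do not, which is the signal that pulls positives together and pushes negatives apart. In a real setup the head would be a trained MLP and the loop would be replaced by batched tensor operations.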