- Data augmentation
  - Crop patches from images in batch
  - Add colour jitter
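
The augmentation step might be sketched as below. This is a minimal NumPy illustration, not the notes' own implementation: the function names (`random_crop`, `colour_jitter`, `augment_batch`), the patch size, the jitter ranges, and the choice of two patches per image are all assumptions, and pixel values are assumed to lie in [0, 1].

```python
import numpy as np

def random_crop(image, size, rng):
    """Cut a random size x size patch from an H x W x C image."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)    # upper bound is exclusive
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]

def colour_jitter(patch, rng, brightness=0.4, contrast=0.4):
    """Randomly rescale contrast and brightness (ranges are illustrative)."""
    b = 1.0 + rng.uniform(-brightness, brightness)
    c = 1.0 + rng.uniform(-contrast, contrast)
    mean = patch.mean(axis=(0, 1), keepdims=True)  # per-channel mean
    out = ((patch - mean) * c + mean) * b
    return np.clip(out, 0.0, 1.0)

def augment_batch(images, size, rng, views=2):
    """Return `views` jittered crops per image plus the source image index of each."""
    patches, image_ids = [], []
    for idx, image in enumerate(images):
        for _ in range(views):
            patches.append(colour_jitter(random_crop(image, size, rng), rng))
            image_ids.append(idx)
    return np.stack(patches), np.array(image_ids)

rng = np.random.default_rng(0)
images = rng.random((4, 32, 32, 3))          # stand-in batch of 4 RGB images
patches, image_ids = augment_batch(images, size=16, rng=rng)
```

Keeping `image_ids` alongside the patches records which image each patch came from, which is what the positive/negative assignment needs.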
 
- Within the batch, sample positives and negatives
  - Patches from the same image are positives
  - All other patches are negatives
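
The positive/negative assignment can be written as a boolean mask over patch pairs. The notes do not name a specific loss, so the softmax-style contrastive loss below (with a temperature of 0.5) is an assumption chosen for illustration, as are the function names. The sketch also assumes every image contributes at least two patches, so each anchor has a positive.

```python
import numpy as np

def positive_mask(image_ids):
    """mask[i, j] is True when patches i and j (i != j) come from the same image."""
    same = image_ids[:, None] == image_ids[None, :]
    np.fill_diagonal(same, False)          # a patch is not its own positive
    return same

def contrastive_loss(z, image_ids, temperature=0.5):
    """Illustrative loss: pull positives together, push all other patches apart."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)         # exclude self-comparisons
    # Numerically stable log-softmax over all other patches in the batch.
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    pos = positive_mask(image_ids)
    # Mean negative log-probability of the positives for each anchor patch.
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) / pos.sum(axis=1)
    return per_anchor.mean()

image_ids = np.array([0, 0, 1, 1])
mask = positive_mask(image_ids)
z_aligned = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
z_mixed = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
loss_aligned = contrastive_loss(z_aligned, image_ids)
loss_mixed = contrastive_loss(z_mixed, image_ids)
```

Embeddings that place same-image patches together (`z_aligned`) give a lower loss than embeddings that place them apart (`z_mixed`).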
 
- Compute the loss on the output of an MLP layer instead of on the bottleneck embedding
  - The head network is a function of the bottleneck
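
A head network of this kind could look like the forward-pass sketch below. The two-layer shape, the ReLU, the layer sizes, and the class name `ProjectionHead` are assumptions for illustration; the notes only state that an MLP sits between the bottleneck and the loss.

```python
import numpy as np

class ProjectionHead:
    """Two-layer MLP g(h) = relu(h W1 + b1) W2 + b2 applied to bottleneck features h."""

    def __init__(self, in_dim, hidden_dim, out_dim, rng):
        # He-style initialisation; any reasonable scheme would do for a sketch.
        self.w1 = rng.normal(0.0, np.sqrt(2.0 / in_dim), (in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, np.sqrt(2.0 / hidden_dim), (hidden_dim, out_dim))
        self.b2 = np.zeros(out_dim)

    def __call__(self, h):
        hidden = np.maximum(h @ self.w1 + self.b1, 0.0)   # ReLU
        return hidden @ self.w2 + self.b2

rng = np.random.default_rng(0)
h = rng.normal(size=(8, 128))        # stand-in for the encoder's bottleneck output
head = ProjectionHead(in_dim=128, hidden_dim=128, out_dim=32, rng=rng)
z = head(h)                          # the loss is computed on z, not on h
```

In this arrangement the loss sees only `z`, while `h` stays available as the bottleneck embedding.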
 
