Description
🚀 Feature
Note: To track the progress of the project check out this board.
Add to TorchVision the popular primitives (losses, schedulers, data augmentations, operators, etc.) that are often needed to reproduce SOTA results, along with new, popular, highly accurate models with pre-trained weights.
Motivation
Though TorchVision already includes many common building blocks for training CV models, it lacks some popular primitives that are often needed to reproduce SOTA results. Some of these primitives live in our reference scripts (data utils, transforms, etc.) because we previously did not want to commit to a specific API. Others are part of libraries from the broader ecosystem. Additionally, TorchVision does not provide some of the newer, popular architectures that currently achieve good results on a variety of vision tasks.
Adding support for such primitives and models to TorchVision will give its users a “batteries included” experience. Researchers will be able to do SOTA research and reproduce papers using common building blocks rather than writing their own, while industry users will be able to adapt the models to their domains more easily using SOTA techniques.
Pitch
The addition of primitives should be done in several phases, iterating between trying to reproduce SOTA recipes, identifying accuracy gaps and implementing the necessary methods to close them. The progress of this project is tracked on this board.
During phase 1, add to TorchVision the following primitives and models:
- Losses ([RFC] Loss Functions in Torchvision, #2980):
- Schedulers:
  - ChainedScheduler (pytorch#63491: To add Chained Scheduler to the list of PyTorch schedulers; pytorch#63457: To fix the chainability at epoch zero for some schedulers; pytorch#65034: To add state dict and load_dict for Chained Scheduler)
  - ConstantLR and LinearLR for warmup (pytorch#64395: To change WarmUp Scheduler with ConstantLR and LinearLR)
  - SequentialLR (pytorch#64037: To add SequentialLR to PyTorch Core Schedulers; pytorch#65035: To add state_dict and load_state_dict to SequentialLR)
- Models (Are new models planned to be added?, #2707):
- Data Augmentations ([RFC] New Augmentation techniques in Torchvison, #3817):
- Operators:
- Training Recipes:
  - EMA model support (#4381: Added Exponential Moving Average support to classification reference script; #4406: Added update_parameters to EMA to fix calculation; #4408: Update the metrics output on reference scripts)
  - Updated reference scripts (#4335: Adding label smoothing on classification reference; #4411: Warmup schedulers in References; #4444: Further enhance Classification Reference; #4493: Additional SOTA ingredients on Classification Recipe)
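To make the phase 1 ingredients more concrete, here is a minimal sketch of how a linear warmup, a chained follow-up schedule, label smoothing, and an EMA copy of the model might fit together in one training loop. It is an illustration under assumptions, not the reference scripts' actual code: the toy linear model, the random data, the cosine schedule after warmup, and all hyperparameter values are hypothetical, and the EMA is built on `torch.optim.swa_utils.AveragedModel` with a custom `avg_fn`.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR
from torch.optim.swa_utils import AveragedModel

torch.manual_seed(0)

# Hypothetical toy setup: a linear classifier on random data stands in
# for a real vision model and dataset.
model = nn.Linear(8, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Linear warmup for the first epochs, then cosine annealing (illustrative values).
warmup_epochs, total_epochs = 2, 6
warmup = LinearLR(optimizer, start_factor=0.1, total_iters=warmup_epochs)
cosine = CosineAnnealingLR(optimizer, T_max=total_epochs - warmup_epochs)
scheduler = SequentialLR(optimizer, schedulers=[warmup, cosine], milestones=[warmup_epochs])

# Label smoothing is a built-in option of the cross-entropy loss.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# Exponential moving average of the weights via a custom averaging function.
ema_decay = 0.9  # assumed value, chosen only for the example


def ema_avg(avg_param, param, num_averaged):
    return ema_decay * avg_param + (1.0 - ema_decay) * param


ema_model = AveragedModel(model, avg_fn=ema_avg)

inputs = torch.randn(16, 8)
targets = torch.randint(0, 3, (16,))

lrs = []
for epoch in range(total_epochs):
    lrs.append(optimizer.param_groups[0]["lr"])  # learning rate used this epoch
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    ema_model.update_parameters(model)  # refresh the EMA weights after each step
    scheduler.step()
```

In this sketch the recorded learning rates rise during warmup, reach the base rate at the milestone, and decay afterwards; `ema_model` would be the copy to evaluate at the end of training.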
Other potential primitives to be considered during phase 2:
- Barron loss (see classy_vision)
- Augmix + JSD loss
- FastAutoAugment
- Large Scale Jitter
- AutoDropout Layer
- DropBlock Layer
- DropConnect Layer
- ShakeDrop Layer
- Random Noise LR Scheduler
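As an illustration of what one of these candidate primitives could look like, the following is a rough sketch of a DropBlock-style operator for 2D feature maps, written from the general description of the technique (drop contiguous square regions, then rescale the survivors). The function name `drop_block2d`, its signature, and the restriction to odd block sizes are assumptions made for this example, not a proposed TorchVision API.

```python
import torch
import torch.nn.functional as F


def drop_block2d(x, p, block_size, training=True, eps=1e-6):
    """Zero out contiguous block_size x block_size regions of an (N, C, H, W) tensor.

    Hypothetical helper for illustration; only odd block sizes are handled.
    """
    if not training or p == 0.0:
        return x
    n, c, h, w = x.shape
    if block_size % 2 == 0 or block_size > min(h, w):
        raise ValueError("this sketch expects an odd block_size no larger than H and W")

    # Positions where a full block fits, and the per-position rate that
    # yields roughly a fraction p of dropped activations overall.
    valid_h, valid_w = h - block_size + 1, w - block_size + 1
    gamma = p * h * w / (block_size**2 * valid_h * valid_w)

    # Sample block centers, then grow each center into a full block
    # with a stride-1 max pool.
    centers = torch.bernoulli(
        torch.full((n, c, valid_h, valid_w), gamma, dtype=x.dtype, device=x.device)
    )
    pad = block_size // 2
    centers = F.pad(centers, [pad] * 4)
    dropped = F.max_pool2d(centers, kernel_size=block_size, stride=1, padding=pad)

    # Keep everything outside the blocks and rescale so the overall
    # activation magnitude is preserved.
    keep = 1.0 - dropped
    scale = keep.numel() / (keep.sum() + eps)
    return x * keep * scale


torch.manual_seed(0)
features = torch.ones(2, 3, 8, 8)
out = drop_block2d(features, p=0.2, block_size=3)
```

In evaluation mode (`training=False`) the input is returned unchanged, which mirrors how dropout-style layers usually behave.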
Note that any of the suggested primitives that are not vision-specific should be added to PyTorch, so that all domain libraries can benefit from them.