Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
deep-learning optimizer pytorch artificial-intelligence moe resnet vit diffusion mae fairseq cuda-programming bert-model gpt2 transformer-xl timm convnext adan llms dreamfusion llm-training
Updated Jun 8, 2025 - Python