Alternatives to AdamW
Terms that appear in the same research papers as AdamW, ranked by the number of shared papers (co-occurrence).
| Alternative | Papers (with AdamW) | Avg viability |
|---|---|---|
| PyTorch | 2 | — |
| Reinforcement Learning | 1 | — |
| Llama | 1 | — |
| LLM | 1 | — |
| CNN | 1 | — |
| Transformer | 1 | — |
| Group Relative Policy Optimization | 1 | — |
| reinforcement learning | 1 | — |
| Group Relative Policy Optimization (GRPO) | 1 | — |
| AdamW optimizer | 1 | — |
| SGD | 1 | — |
| Gradient Regularization | 1 | — |
| Natural Gradient Descent | 1 | — |
| Regularized-Kalman | 1 | — |
| K-FAC | 1 | — |
| Sophia | 1 | — |
| Gradient Regularized Natural Gradients | 1 | — |
| multi-head self-attention | 1 | — |
| GELU | 1 | — |
| Group Normalization | 1 | — |
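
As a rough illustration of how a co-occurrence ranking like the one above could be produced, the sketch below counts how many papers mention each term alongside a target term. It is not the method used to build this table: the function name `cooccurrence_counts` and the sample `papers` data are hypothetical and chosen only for the example.

```python
from collections import Counter


def cooccurrence_counts(papers, target):
    """Count, for each term, how many papers mention it together with `target`.

    `papers` is an iterable of term lists, one list per paper (hypothetical input shape).
    """
    counts = Counter()
    for terms in papers:
        unique = set(terms)  # a term counts at most once per paper
        if target in unique:
            counts.update(unique - {target})
    return counts


# Hypothetical sample data, not taken from the table above.
papers = [
    ["AdamW", "PyTorch", "SGD"],
    ["AdamW", "PyTorch", "K-FAC"],
    ["SGD", "CNN"],  # no AdamW, so this paper is ignored
]

counts = cooccurrence_counts(papers, "AdamW")

# Emit rows in the same layout as the table, highest count first.
for term, n in counts.most_common():
    print(f"| {term} | {n} | — |")
```

This sketch treats terms as exact strings, so spelling variants such as "Reinforcement Learning" and "reinforcement learning" are counted as separate rows. Normalizing terms first (for example with `str.casefold()`) would merge them, on the assumption that the variants refer to the same concept.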