Category: Fine-Tuning · License: Apache-2.0

Axolotl

Config-driven fine-tuning framework supporting LoRA, QLoRA, full fine-tuning, multi-GPU, FSDP2, and DeepSpeed. Simplifies training with YAML configuration.

Platforms: Linux, Docker

Axolotl is a configuration-driven fine-tuning framework that simplifies the process of training and adapting large language models. Instead of writing training scripts, users define their entire training configuration in a YAML file — model, dataset, hyperparameters, quantization, and distributed training settings. For ML practitioners who want to fine-tune models without boilerplate code while maintaining access to advanced training features like FSDP2 and multi-GPU parallelism, Axolotl is the most streamlined and capable config-driven option.

Key Features

YAML-driven configuration. Define your entire fine-tuning job in a single YAML file. Axolotl handles model loading, data preprocessing, training loop setup, checkpointing, and evaluation based on the configuration. No training scripts to write or maintain.
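A minimal configuration might look like the following. Field names follow Axolotl's documented schema, but exact options vary by version, so treat this as an illustrative sketch rather than a copy-paste recipe (the model and dataset names are examples, not recommendations):

```yaml
# Illustrative Axolotl config sketch; verify field names against your installed version.
base_model: meta-llama/Llama-3.1-8B   # any Hugging Face model ID

datasets:
  - path: tatsu-lab/alpaca            # example instruction dataset
    type: alpaca                      # tells Axolotl how to parse the rows

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
output_dir: ./outputs/llama-alpaca
```

A config like this is typically launched with `axolotl train config.yml` (older releases used `accelerate launch -m axolotl.cli.train config.yml`); check the CLI reference for your version.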

Training method flexibility. Axolotl supports full fine-tuning, LoRA, QLoRA (4-bit quantized LoRA), ReLoRA, and adapter-based methods. Switch between methods by changing a few lines in the config file, making it easy to experiment with different approaches.
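To make the "few lines in the config" claim concrete, here is a sketch of the adapter settings involved, assuming the LoRA field names documented in Axolotl's schema (verify against your version):

```yaml
# LoRA adapter settings (illustrative).
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj

# Switching to QLoRA typically amounts to changing the adapter type
# and enabling 4-bit loading:
# adapter: qlora
# load_in_4bit: true
```

Removing the `adapter` key entirely generally falls back to full fine-tuning, which makes A/B comparisons between methods a matter of toggling a handful of lines.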

Multi-GPU and distributed training. Native support for FSDP (Fully Sharded Data Parallel), FSDP2, and DeepSpeed enables training across multiple GPUs and nodes. Axolotl handles the distributed training complexity, letting users scale by adding GPU resources without rewriting configurations.
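As a rough sketch of how this looks in practice, the distributed backend is also selected in the YAML file; the keys below mirror Axolotl's FSDP/DeepSpeed passthrough but may differ across versions, and the DeepSpeed path is a hypothetical example:

```yaml
# Illustrative FSDP settings; Axolotl forwards these to Accelerate/PyTorch.
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer

# Alternatively, point at a DeepSpeed JSON config instead of FSDP:
# deepspeed: deepspeed_configs/zero2.json
```

The rest of the training config stays unchanged, which is what allows scaling from one GPU to many by editing only this section.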

Dataset flexibility. Axolotl supports diverse dataset formats: Alpaca, ShareGPT, chat templates, completion-only, and custom formats with field mapping. Multiple datasets can be combined and weighted within a single training run. Preprocessing handles tokenization and packing automatically.
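Combining datasets is done by listing multiple entries, each declaring its own format. A sketch, assuming the `datasets` schema described in Axolotl's docs (the local file path and column name are hypothetical):

```yaml
# Illustrative multi-dataset config; each entry declares how its rows are parsed.
datasets:
  - path: tatsu-lab/alpaca        # instruction-tuning data in Alpaca format
    type: alpaca
  - path: ./data/corpus.jsonl     # hypothetical local file for completion-style training
    type: completion
```

Axolotl tokenizes and (optionally) packs all listed datasets into a single training stream during preprocessing.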

Model architecture support. Compatible with a wide range of model architectures from Hugging Face, including Llama, Mistral, Qwen, Phi, Gemma, and others. Support for multimodal fine-tuning is available for vision-language models.

Integration with Hugging Face. Axolotl builds on the Hugging Face Transformers and PEFT libraries, inheriting their model support and community. Trained models and adapters push directly to Hugging Face Hub for sharing and deployment.
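Pushing results to the Hub is also configured in the YAML file. A sketch, assuming the Hub-related fields Axolotl inherits from the Transformers trainer (the repository name is hypothetical, and you need a valid Hugging Face token configured):

```yaml
# Illustrative Hub upload settings; requires `huggingface-cli login` or an HF_TOKEN.
hub_model_id: your-username/llama-alpaca-lora   # hypothetical target repo
hub_strategy: every_save                        # upload checkpoints as they are written
```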

When to Use Axolotl

Choose Axolotl when you want a battle-tested, config-driven approach to fine-tuning that scales from single-GPU LoRA to multi-node distributed training. It is ideal for ML practitioners who fine-tune regularly and want reproducible, version-controlled training configurations.

Ecosystem Role

Axolotl sits alongside Unsloth and LLaMA Factory in the fine-tuning toolkit. It is more config-driven than LLaMA Factory (which favors a web UI) and more feature-rich for distributed training than Unsloth (which focuses on speed). For single-GPU speed, Unsloth is faster. For multi-GPU distributed training with maximum flexibility, Axolotl is the stronger choice.