What It Is
Ordering training data from simple to complex, so the model learns foundational concepts before advanced ones.
How It Accelerates Training
- The model doesn't waste compute on complex examples it can't yet learn from
- Gradients are more stable early in training
- Some studies report 20-50% training-time reductions on certain tasks (results vary by domain)
Implementation for LARS
- Sort datasets by complexity:
  - Short prompts/responses first
  - Single-concept examples before multi-concept
  - Nexus basics before advanced workflows
- Stage the training:
  - Stage 1: Identity (Who is LARS?)
  - Stage 2: Basic tasks (simple commands)
  - Stage 3: Complex reasoning (3D dataset)
  - Stage 4: Multi-step workflows
- Use difficulty scoring:
  - Count tokens, nested concepts, and required context
  - Auto-sort datasets by difficulty score
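The difficulty-scoring step above can be sketched as a simple weighted heuristic. This is a minimal illustration, not a tuned scorer: the weights, field names (`concepts`, `context_turns`), and example records are assumptions, not values from any LARS dataset.

```python
# Heuristic difficulty scoring for curriculum ordering.
# Weights and annotation fields are illustrative assumptions.

def difficulty_score(example: dict) -> float:
    """Combine rough difficulty proxies into one sortable number."""
    token_count = len(example["text"].split())       # crude token-count proxy
    concept_count = example.get("concepts", 1)       # nested concepts, if annotated
    context_turns = example.get("context_turns", 0)  # prior context the example needs
    return 0.5 * token_count + 2.0 * concept_count + 3.0 * context_turns

def sort_curriculum(dataset: list[dict]) -> list[dict]:
    """Return the dataset ordered from easiest to hardest."""
    return sorted(dataset, key=difficulty_score)

dataset = [
    {"text": "Plan a multi-step Nexus workflow with three tools",
     "concepts": 4, "context_turns": 2},
    {"text": "Who is LARS?", "concepts": 1, "context_turns": 0},
    {"text": "Run a simple command", "concepts": 1, "context_turns": 0},
]

for ex in sort_curriculum(dataset):
    print(ex["text"])
```

In practice the sorted list would then be sliced into the four training stages, with a mastery check (per the competence-based variant below) gating each advance.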
Research References
- Bengio et al., "Curriculum Learning" (ICML 2009)
- Self-paced learning variants
- Competence-based curriculum (measure mastery before advancing)