What It Is
Ordering training data from simple to complex, so the model learns foundational concepts before advanced ones.
How It Accelerates Training
- The model doesn't waste compute on complex examples it can't yet learn from
- Gradients are more stable early in training
- Some studies report 20-50% training-time reductions on certain tasks (results vary by domain)
Implementation for LARS
- Sort datasets by complexity:
  - Short prompts/responses first
  - Single-concept examples before multi-concept
  - Nexus basics before advanced workflows
- Stage the training:
  - Stage 1: Identity (Who is LARS?)
  - Stage 2: Basic tasks (simple commands)
  - Stage 3: Complex reasoning (3D dataset)
  - Stage 4: Multi-step workflows
- Use difficulty scoring:
  - Count tokens, nested concepts, and required context
  - Auto-sort datasets by difficulty score
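The difficulty-scoring step above can be sketched as a simple weighted heuristic. This is a minimal illustration, not a tuned scorer: the weights, field names (`concepts`, `context_turns`), and example records are assumptions, not values from any LARS dataset.

```python
# Heuristic difficulty scoring for curriculum ordering.
# Weights and annotation fields are illustrative assumptions.

def difficulty_score(example: dict) -> float:
    """Combine rough difficulty proxies into one sortable number."""
    token_count = len(example["text"].split())       # crude token-count proxy
    concept_count = example.get("concepts", 1)       # nested concepts, if annotated
    context_turns = example.get("context_turns", 0)  # prior context the example needs
    return 0.5 * token_count + 2.0 * concept_count + 3.0 * context_turns

def sort_curriculum(dataset: list[dict]) -> list[dict]:
    """Return the dataset ordered from easiest to hardest."""
    return sorted(dataset, key=difficulty_score)

dataset = [
    {"text": "Plan a multi-step Nexus workflow with three tools",
     "concepts": 4, "context_turns": 2},
    {"text": "Who is LARS?", "concepts": 1, "context_turns": 0},
    {"text": "Run a simple command", "concepts": 1, "context_turns": 0},
]

for ex in sort_curriculum(dataset):
    print(ex["text"])
```

In practice the sorted list would then be sliced into the four training stages, with a mastery check (per the competence-based variant below) gating each advance.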
Research References
- Bengio et al., "Curriculum Learning" (ICML 2009)
- Self-paced learning variants
- Competence-based curriculum (measure mastery before advancing)