# Hardware Scaling Path
## Current State

**2x RTX 3090 = 48GB VRAM**
- Can train 7B models easily ✓
- Can probably train 14B with 4-bit quantization (rough VRAM math below)
- 30B would be tight
- Location: AI Server (100.89.34.86 / 10.0.0.25)
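These numbers can be sanity-checked with back-of-the-envelope math: training memory is roughly bytes-per-parameter × parameter count, plus a fudge factor for activations and CUDA overhead. A minimal sketch; the multipliers are common rules of thumb for Adam-based training, not measurements from this server:

```python
# Rough VRAM estimator for training. Bytes-per-parameter multipliers are
# rule-of-thumb assumptions: ~16 for a full bf16 fine-tune with Adam
# (weights + grads + fp32 optimizer states), ~2.5 for LoRA over a frozen
# bf16 base, ~1.0 for 4-bit QLoRA-style training.

def train_vram_gb(params_b: float, bytes_per_param: float,
                  overhead_gb: float = 4.0) -> float:
    """Estimate training VRAM (GB) for a model with params_b billion params.

    overhead_gb is a fudge factor for activations, CUDA context, and
    fragmentation; real usage depends on batch size and sequence length.
    """
    return params_b * bytes_per_param + overhead_gb

for label, params_b, bpp in [
    ("7B LoRA (bf16 base)",   7, 2.5),   # ~22 GB  -> easy on 48 GB
    ("14B QLoRA (4-bit)",    14, 1.0),   # ~18 GB  -> fits on 48 GB
    ("30B QLoRA (4-bit)",    30, 1.0),   # ~34 GB  -> tight on 48 GB at long context
    ("30B full fine-tune",   30, 16.0),  # ~484 GB -> out of reach without offload
]:
    print(f"{label}: ~{train_vram_gb(params_b, bpp):.0f} GB")
```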
## Next Step (Immediate Need)

**3x RTX 3090 = 72GB VRAM**
- 30B models comfortably
- Can run inference + training simultaneously
- Room to experiment with larger architectures
- Enables sharding across 3 GPUs (see the FSDP sketch below)
- This unlocks the next level of LARS development
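A minimal sketch of sharded training across all visible GPUs with PyTorch FSDP, assuming a `torchrun --nproc_per_node=3 fsdp_sketch.py` launch; the tiny model and synthetic batches are placeholders for a real 30B-class run, and DeepSpeed ZeRO would be an equally valid choice:

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")              # one process per GPU under torchrun
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    # Tiny stand-in model; in practice this is the 30B-class LLM.
    model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))
    model = FSDP(model.cuda())  # params, grads, optimizer state sharded across ranks

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(10):                          # synthetic training steps
        x = torch.randn(8, 1024, device="cuda")
        loss = model(x).pow(2).mean()
        loss.backward()
        opt.step()
        opt.zero_grad()

    if dist.get_rank() == 0:
        print("sharded training steps completed")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```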
## Future State

**3x RTX 6000 Pro = 288GB VRAM (3 × 96GB)**
- 70B+ models with ease (estimates below)
- Multiple models running simultaneously
- Production-grade training at scale
- Client training pipelines
- Full trainer-that-trains-trainers capability
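Applying the same rule-of-thumb multipliers as the estimator above (assumptions, not benchmarks), the 288GB tier covers 70B-class work:

```python
# Rough 70B memory budgets, using the same rule-of-thumb multipliers as above.
estimates_gb = {
    "70B inference (bf16)":       70 * 2.0,   # ~140 GB -> fits in 288 GB with headroom
    "70B inference (4-bit)":      70 * 0.5,   # ~35 GB  -> room for several models at once
    "70B LoRA train (bf16 base)": 70 * 2.5,   # ~175 GB -> feasible sharded over 3 GPUs
    "70B full fine-tune":         70 * 16.0,  # ~1120 GB -> still needs offload/more nodes
}
for label, gb in estimates_gb.items():
    print(f"{label}: ~{gb:.0f} GB")
```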
## Scaling Benefits
| Config | VRAM | Max Model | Capabilities |
|---|---|---|---|
| 2x 3090 | 48GB | 14B (4-bit) | Basic training, inference |
| 3x 3090 | 72GB | 30B | Simultaneous train+infer |
| 3x 6000 Pro | 288GB | 70B+ | Production training system |
## Why Hardware Matters
The 3D training methodology works at any scale. But:
- Larger models = better reasoning
- More VRAM = larger context windows (KV-cache sketch below)
- Multiple GPUs = parallel operations
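The context-window point is mostly a KV-cache budget: per token, the cache costs 2 (K and V) × layers × KV heads × head dim × dtype bytes. A minimal sketch with Llama-2-7B-like shapes, which are illustrative assumptions rather than measurements:

```python
# KV-cache memory per token = 2 (K and V) * layers * kv_heads * head_dim * dtype bytes.
# Shapes are Llama-2-7B-like: 32 layers, 32 KV heads, head_dim 128, fp16 cache.

def kv_cache_gb(ctx_len: int, layers: int = 32, kv_heads: int = 32,
                head_dim: int = 128, dtype_bytes: int = 2) -> float:
    per_token = 2 * layers * kv_heads * head_dim * dtype_bytes  # bytes per token
    return ctx_len * per_token / 1024**3

for ctx in (4_096, 32_768, 131_072):
    print(f"{ctx:>7} tokens -> ~{kv_cache_gb(ctx):.1f} GB of KV cache")
```

At half a megabyte per token, a 32K context already costs ~16GB of cache on top of the weights, which is why context length scales with VRAM.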
The third 3090 is the key to unlocking 30B models and proving the concept scales before investing in 6000 Pros.