10 Mar 2026
Multi-Node Fine-Tuning: FSDP Sharding Strategy Matters
Introduction

In a previous post, I ran Qwen2.5-72B inference on Azure H100 nodes and showed how NVLink’s 900 GB/s ba...