Scaling-Law on Xu'Blog

Recent content in Scaling-Law on Xu'Blog
https://xuquant.com/tags/scaling-law/

Scaling Laws for Training Large Models: Science, Engineering, and Limits
Sun, 07 Jun 2026 10:00:00 +0800
https://xuquant.com/posts/foundation-models/chinchilla-and-modern-llm-training/

Scaling laws are a thoroughly studied science, and also a guide that the industry systematically deviates from. Starting from the early empirical work of Hestness 2017, moving through the closed-form solution of Rosenfeld 2020, the tug-of-war between Kaplan and Chinchilla, and the replication critique of Besiroglu 2024, and on to 4D parallelism, FP8, and post-training RL in the training stack, this post ties the scientific conclusions of scaling laws and their engineering evolution into a single thread. It includes a D3 playground for exploring how fragile the fits are.