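<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Diffusion Models: DDPM → SDE/ODE → Flow Matching on Xu'Blog</title><link>https://xuquant.com/posts/mathematics/diffusion/</link><description>Recent content in Diffusion Models: DDPM → SDE/ODE → Flow Matching on Xu'Blog</description><image><title>Xu'Blog</title><url>https://xuquant.com/og-default.png</url><link>https://xuquant.com/og-default.png</link></image><generator>Hugo -- 0.152.2</generator><language>zh</language><lastBuildDate>Mon, 18 May 2026 09:00:00 +0800</lastBuildDate><atom:link href="https://xuquant.com/posts/mathematics/diffusion/index.xml" rel="self" type="application/rss+xml"/><item><title>Why Large Diffusion Models Don't Memorize Training Data: Implicit Regularization from Two Timescales</title><link>https://xuquant.com/posts/mathematics/diffusion/why-diffusion-dont-memorize/</link><pubDate>Mon, 18 May 2026 09:00:00 +0800</pubDate><guid>https://xuquant.com/posts/mathematics/diffusion/why-diffusion-dont-memorize/</guid><description>The NeurIPS 2025 Best Paper (Bonnaire et al. 
2025) gives a clean answer: diffusion model training has two separated timescales, a generalization window τ_gen and a memorization window τ_mem. τ_mem is proportional to the dataset size n (measured slope of roughly 300K steps per sample), so the larger the dataset, the longer the safe training window automatically becomes. The underlying mechanism is the spectral bias of neural-network gradient flow: the low-frequency population score is learned first, while the high-frequency spikes of the empirical score are only fit after many more steps. This post starts from the empirical concerns raised by Carlini 2023, then details the experimental phenomena of the two timescales, the derivation of the linear-in-n scaling law, the spectral analysis of Random Feature networks, and the implications for training practice.</description></item><item><title>Flow Matching and Consistency Models: A New Unification of Generative Paradigms</title><link>https://xuquant.com/posts/mathematics/diffusion/flow-matching-consistency/</link><pubDate>Sat, 25 Apr 2026 09:00:00 +0800</pubDate><guid>https://xuquant.com/posts/mathematics/diffusion/flow-matching-consistency/</guid><description>From the stochastic paths of diffusion models to the deterministic optimal-transport paths of Flow Matching, and on to the single-step distillation of consistency models, this post builds a unified framework for generative models from the ODE perspective.</description></item><item><title>The SDE/ODE Unification of Diffusion Models: From Stochastic Differential Equations to Deterministic Sampling</title><link>https://xuquant.com/posts/mathematics/diffusion/sde-ode-unified/</link><pubDate>Wed, 22 Apr 2026 09:00:00 +0800</pubDate><guid>https://xuquant.com/posts/mathematics/diffusion/sde-ode-unified/</guid><description>Derives the continuous SDE limit from the discrete Markov chain, gives a rigorous derivation of the probability flow ODE, and explains the geometric meaning of the score function and its equivalence with Langevin dynamics.</description></item><item><title>The Variational Foundations of Diffusion Models: From ELBO to Denoising</title><link>https://xuquant.com/posts/mathematics/diffusion/ddpm-variational/</link><pubDate>Sat, 18 Apr 2026 09:00:00 +0800</pubDate><guid>https://xuquant.com/posts/mathematics/diffusion/ddpm-variational/</guid><description>Derives the variational lower bound of DDPM from the ELBO, explains the physical meaning of the three-term decomposition, proves the equivalence of predicting noise and predicting data, and builds a variational understanding of diffusion training.</description></item></channel></rss>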