<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Cross-Entropy on Xu'Blog</title><link>https://xuquant.com/tags/cross-entropy/</link><description>Recent content in Cross-Entropy on Xu'Blog</description><image><title>Xu'Blog</title><url>https://xuquant.com/og-default.png</url><link>https://xuquant.com/og-default.png</link></image><generator>Hugo -- 0.152.2</generator><language>zh</language><lastBuildDate>Thu, 28 May 2026 08:00:00 +0800</lastBuildDate><atom:link href="https://xuquant.com/tags/cross-entropy/index.xml" rel="self" type="application/rss+xml"/><item><title>Understanding KL Divergence in Depth: Four Perspectives</title><link>https://xuquant.com/posts/mathematics/probability/kl-divergence-four-views/</link><pubDate>Thu, 28 May 2026 08:00:00 +0800</pubDate><guid>https://xuquant.com/posts/mathematics/probability/kl-divergence-four-views/</guid><description>KL divergence shows up everywhere in ML (cross-entropy, ELBO, Information Bottleneck, RLHF, SAC), but the question of &amp;#39;why this particular expression&amp;#39; tends to get stuck at the level of the formula. This post takes KL apart from four complementary perspectives: coding length, likelihood ratio, information geometry (Bregman), and mode-seeking vs. mass-covering, with each perspective explaining one of its properties. It closes by mapping the four perspectives back onto concrete applications (cross-entropy, ELBO, IB, SAC, RLHF) to see which perspective each application draws its vocabulary from.</description></item><item><title>Entropy and Information Theory: From -log p to Deep Learning</title><link>https://xuquant.com/posts/mathematics/probability/entropy-and-information/</link><pubDate>Mon, 25 May 2026 20:00:00 +0800</pubDate><guid>https://xuquant.com/posts/mathematics/probability/entropy-and-information/</guid><description>Derives the inevitability of -log p from an axiomatic standpoint, then works through entropy, mutual information, KL divergence, and the maximum entropy principle in turn, before returning to the forms that recur throughout deep learning: cross-entropy loss, ELBO, the information bottleneck, and maximum-entropy reinforcement learning.</description></item></channel></rss>