<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>LLM on Xu'Blog</title>
    <link>https://xuquant.com/tags/llm/</link>
    <description>Recent content in LLM on Xu'Blog</description>
    <image>
      <title>Xu'Blog</title>
      <url>https://xuquant.com/images/profile.jpg</url>
      <link>https://xuquant.com/tags/llm/</link>
    </image>
    <generator>Hugo -- 0.152.2</generator>
    <language>en</language>
    <lastBuildDate>Wed, 29 Apr 2026 14:00:00 +0800</lastBuildDate>
    <atom:link href="https://xuquant.com/tags/llm/index.xml" rel="self" type="application/rss+xml"/>
    <item>
      <title>Qwen3.5 vs Qwen3: A Deep Architectural Comparison</title>
      <link>https://xuquant.com/posts/autodrive/qwen3-vs-qwen3-5-architecture/</link>
      <pubDate>Wed, 29 Apr 2026 14:00:00 +0800</pubDate>
      <guid>https://xuquant.com/posts/autodrive/qwen3-vs-qwen3-5-architecture/</guid>
      <description>A deep architectural comparison of Qwen3.5 and Qwen3, examining hybrid attention, native multimodal fusion, high-sparsity MoE, and partial RoPE across the attention, vision, and MoE dimensions.</description>
    </item>
    <item>
      <title>CORAL: Autonomous Multi-Agent Evolution for Open-Ended Discovery</title>
      <link>https://xuquant.com/posts/ai/coral-autonomous-multi-agent-evolution/</link>
      <pubDate>Thu, 15 May 2025 10:00:00 +0800</pubDate>
      <guid>https://xuquant.com/posts/ai/coral-autonomous-multi-agent-evolution/</guid>
      <description>How delegating evolutionary search decisions to autonomous agents, rather than relying on fixed heuristics, enables faster convergence and stronger results across mathematical and systems optimization tasks.</description>
    </item>
    <item>
      <title>Multi-Head Latent Attention: Efficient KV Cache Compression in DeepSeek-V2</title>
      <link>https://xuquant.com/posts/autodrive/deepseek_series1_mla/</link>
      <pubDate>Sat, 15 Feb 2025 10:00:00 +0800</pubDate>
      <guid>https://xuquant.com/posts/autodrive/deepseek_series1_mla/</guid>
      <description>Deep technical analysis of Multi-Head Latent Attention (MLA) from DeepSeek-V2, covering low-rank KV cache compression, decoupled RoPE design, and computational cost comparison with MHA, MQA, and GQA.</description>
    </item>
  </channel>
</rss>